Issues of Equity and Adequacy
The No Child Left Behind Act (NCLB) articulates clear goals for equity in science education and compels states to use data about student achievement to identify any areas in which they may be falling short. By requiring states to disaggregate assessment results for major subgroups and by holding states accountable for achievement across all groups, NCLB makes clear that all students must be given an equitable opportunity to develop science literacy. The law places a premium on the challenge of including all students in assessments, and it highlights the challenges of ensuring that all students’ science learning is supported by adequate resources. This chapter explores each of these issues.
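The disaggregation requirement described above can be illustrated with a minimal sketch. The subgroup labels, the records, and the function name `proficiency_rates` are all hypothetical and chosen only for illustration; actual state reporting rules are considerably more detailed. The sketch shows how an aggregate proficiency rate can conceal a gap that appears once results are broken out by subgroup.

```python
from collections import defaultdict

# Hypothetical records for illustration only:
# (subgroup label, met the proficiency standard?)
RECORDS = (
    [("general_population", True)] * 6
    + [("general_population", False)] * 2
    + [("english_language_learners", True)] * 1
    + [("english_language_learners", False)] * 3
)


def proficiency_rates(records):
    """Return (overall_rate, {subgroup: rate}) for (subgroup, proficient) pairs."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for group, proficient in records:
        totals[group] += 1
        passed[group] += int(proficient)
    overall = sum(passed.values()) / sum(totals.values())
    by_group = {group: passed[group] / totals[group] for group in totals}
    return overall, by_group


overall, by_group = proficiency_rates(RECORDS)
print(f"overall: {overall:.2f}")
for group, rate in by_group.items():
    print(f"{group}: {rate:.2f}")
```

In this invented data set, the overall rate of about 58 percent sits between a 75 percent rate for one group and a 25 percent rate for the other, which is the kind of discrepancy that disaggregated reporting is meant to surface.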
OPPORTUNITY TO LEARN
Excellence in science education embodies the idea that all students can achieve science literacy if they are given the opportunity to learn (American Association for the Advancement of Science, 1989; National Research Council, 1996). Although students will achieve understanding of science concepts in different ways, at different depths, and at different rates of progress, opportunity to learn implies that all should have the chance to develop the understandings associated with science literacy to the maximum extent possible. The National Science Education Standards (National Research Council, 1996) and Science for All Americans (American Association for the Advancement of Science, 1989) make this goal a priority, especially for students who historically have not received adequate encouragement and opportunity to pursue science, such as women, students of color, students with disabilities, and students with limited English language proficiency. The authors of the National Science Education Standards made clear their commitment to this goal by advocating that the collection of data about students’ opportunity to learn should be included in a science assessment program. NCLB reflects this goal and mandates the interpretation of test-based information in ways that may highlight discrepancies in opportunity to learn among different groups of students, schools, and school districts in a state.
Science education poses particular challenges in meeting the goal of opportunity to learn. Of primary concern is the scarcity of highly qualified science teachers (see, for example, National Commission on Mathematics and Science Teaching for the 21st Century, 2000). While NCLB requires that every child have access to highly qualified teachers, there may not be sufficient numbers of these teachers to staff all science classrooms. This is particularly true in rural and urban settings and in the elementary and middle grades, where many teachers are generalists rather than science specialists. The Council of Chief State School Officers and the National Center for Education Statistics have collected detailed information on staffing patterns in different schools and subjects that supports the committee’s observation. (This information is available on the organizations’ web sites.)
The fairness of assessments and the validity of interpretations of their results depend on the extent to which students have had sufficient opportunities to learn the knowledge and abilities that are being assessed. Without this information, it is impossible to know whether the results shed light on aspects of the curriculum, instructional strategies, or students’ efforts or abilities, or whether they simply indicate that students have not had the chance to learn what has been assessed.
It is particularly important that, in interpreting test results, states consider the extent to which students with disabilities and English language learners have had an opportunity to learn the material covered by a science assessment, because instruction in special programs may focus on reading and mathematics rather than science. When students are tested on material that they have not had an opportunity to learn, their results cannot be interpreted in the same way as the results of students who have received instruction in the area.
States have a number of sources of evidence they can use to answer questions about students’ opportunity to learn. Collateral information about individual students or groups of students is particularly important when the stakes for individual students are high, as when assessments are used for promotion and graduation, for example. This information can be obtained through questionnaires that ask, for example, whether students were provided with curriculum, instruction, and resources, or whether educators, students, and parents were informed before an assessment was conducted about the knowledge, skills, and abilities that were to be assessed (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999). Research has suggested that the primary areas that should be considered when examining opportunities to learn are curriculum content, instructional strategies, and instructional resources (Brewer and Stacz, 1996).
Inequities also can exist at a broader level. Differences in performance across groups (e.g., gender, ethnic, or geographic groups) can be confounded with differences in access to curriculum, instruction, and resources. Performance differences from school to school may be confounded with differences in the quality of education, such as the number of advanced course offerings and the quality of educators (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999). When assessments have high stakes for teachers, inequities with regard to teacher quality may increase; for example, teachers of high quality may choose not to teach in low-performing schools because of the possibility that negative consequences associated with low school performance will affect their careers. Thus, students in these schools, who are typically poor or are members of other subgroups that have been disadvantaged in the past, would not have equal access to high-quality teachers.
In this report we have described assessment strategies that ask students to create responses, rather than choose among a defined set of options. While such performance assessments can capture a broad range of complex thinking and problem-solving skills, they are useful only when instruction has given students opportunities to practice the kinds of skills that the assessment targets. Similarly, assessments that require students to use laboratory materials or other hands-on materials are useful only when students have used comparable materials in the classroom. If innovative item formats are used in the assessment, they should be related to instruction that has provided students with the opportunity to engage in problem solving with these formats. Thus, information about the nature of the instructional program in which each student has been enrolled is an important part of understanding assessment results.
INCLUDING ALL STUDENTS
NCLB requires that all students be included in assessments and that accommodations be offered to students with disabilities and to English language learners as appropriate. States are permitted to provide alternate assessments for students who, for a variety of reasons, cannot participate in exactly the same assessment as other students. These alternate assessments are either aligned with the same standards as the regular assessments or, for students who cannot be held to the same standards as other students because of severe cognitive disabilities, based on alternate achievement standards (U.S. Department of Education, 2004, p. 50).
Issues Related to Accommodations
The challenge for states is complex. Although states are required to provide appropriate accommodations to these two groups of students, the effects of accommodations on test performance and on the inferences based on test results are not clearly understood. Findings from research are not conclusive with regard to the comparability of inferences based on scores obtained under accommodated and nonaccommodated conditions (National Research Council, 2004). Nevertheless, states are expected to include test results for students with disabilities and English language learners in their aggregated reports and to report disaggregated group results for these students, and they may be held accountable for demonstrating that these students are making progress in science.
A principle known as universal design, which was developed by architects and other designers, has been adapted to educational measurement and holds some promise for ameliorating some of the difficulties that testing accommodations present. The principle is that products and buildings—or assessments—should be designed so that the greatest number of people can use them without the need for modification—that is, to eliminate unnecessary obstacles to access. In the case of assessment this might mean, for example, that if all students had more than enough time to complete an assessment task, offering extra time to students who need it because of cognitive disabilities would not provide them with an unfair advantage over other students. The application of universal design principles to assessment has not, however, been fully developed; the committee hopes that with further research it will provide valuable alternatives for states. Further information on this topic is available from the National Center on Educational Outcomes (http://education.umn.edu/NCEO).
Advice to States
Although the research base on the effects of accommodations on the interpretation of test scores and the inferences that can be supported by results is inconclusive, some guidance can be offered for those making decisions about test development and the provision of accommodations. First, states and their test developers should make clear which inferences are to be based on test results. Clear specification of the target skills evaluated and of the ancillary skills required to demonstrate proficiency on the target skills can improve decision making about accommodations. For example, in a written science assessment with open-ended responses, is writing a target skill or an ancillary skill? Is the assessment designed to make inferences about science knowledge, about written expression of science knowledge, or about written expression of science knowledge in English? The answers to these questions can assist with decisions about accommodations, such as whether to provide a scribe to write answers or to provide a translator to translate answers into English. If mathematics is required to complete the assessment tasks, is mathematics computation a target skill or an ancillary skill? Is the desired inference about knowing the correct equation to use or about performing the calculations? (Here, the answers can guide decisions about use of a calculator.) Further discussion about identifying target and ancillary skills and about articulating the intended inferences can be found in Keeping Score for All: The Effects of Inclusion and Accommodation Policies on Large-Scale Educational Assessments (National Research Council, 2004, Chapter 6).
Second, states should consider the needs of students with disabilities and English language learners when designing their assessments and making decisions about such issues as time limits, wording of test items, and response formats. For example, one of the most common accommodations is the provision of extra time to complete an assessment. Research has shown that general education students, as well as students with disabilities and English language learners, perform better when time limits are lifted (Zuriff, 2000; Abedi, Hofstetter, Baker, and Lord, 2001; Elliott, Kratochwill, and McKevitt, 2001), which suggests that the time limits set for some assessments may be too stringent. Careful consideration of the amount of time required to complete a test (or whether time limits are needed at all) may reduce the need for extended time accommodations.
Another example is a common accommodation often provided to English language learners referred to as “simplified language,” “modified language,” or “plain language.” This accommodation is intended to reduce the reading level to increase the accessibility of an assessment to a nonnative English speaker. Research has shown that this accommodation helps both English language learners and native English speakers (Abedi, Lord, and Hofstetter, 1998), which suggests that some assessments may use unnecessarily difficult vocabulary. The need to provide this accommodation can be reduced by careful attention to reading level and vocabulary requirements in assessment tasks. Bias and sensitivity reviews should be conducted during the development of assessment items, and reviewers should include individuals with expertise in working with students with disabilities and English language learners who can identify language, noncontent vocabulary, and terminology that cause assessment tasks to be more difficult than intended.
Third, states and their test developers should include samples of students with disabilities and English language learners during the field testing of assessment tasks. Field testing provides critical information about the performance of assessment tasks, and inclusion of students from these groups will help identify problems during the earliest stages of test development.
Many of the measures described in this section were originally devised in the context of traditional assessments; the principles apply to any kind of assessment, although complex questions may arise. The goal of very clearly identifying the construct to be measured and making sure that the assessment does not pose significant challenges that are irrelevant to the intended construct is worthwhile in any assessment context. However, one could choose, for example, to assess students’ capacity to conduct a sustained investigation using a set of related standardized tasks to be completed over a period of weeks specifically because it is a way of measuring a complex, multifaceted construct. To complete this assessment, a student may need to be able to document findings with both clear narrative and numerical records, to manipulate equipment, to visually discern subtle changes, and perhaps to collaborate with other students. These are examples of tasks that may pose a particular challenge for some students, and task developers and educators face a challenge in determining which are integral to the construct, and how students with disabilities or English language learners might be accommodated.
NCLB will focus increased attention on science education in the United States. Indeed, the incremental increase in attention paid to science education is likely to exceed the increase associated with reading and mathematics when annual testing in these subjects became mandatory. While reading and mathematics have been routinely assessed by states for a number of years—decades, in some states—NCLB marks a significantly increased focus on measuring and reporting on science achievement.
The central goal of the assessment component of NCLB is to highlight the areas in which students are not performing at a sufficiently high level and to focus attention on the schools and subjects in which performance targets are not being met. The revelations about inadequacies in science education that are likely to result will have a variety of important implications for schools and states. For example, the increased scrutiny of science education in a state may alter the labor market for teachers, potentially changing the ability of school districts to meet the requirement of NCLB for highly qualified teachers.
The measurement and reporting of science proficiency is likely to lead to an increased focus on science instruction. This reporting also will help to reveal the degree to which schools have supported science education in the past. Numerous authors, including Figlio and Rouse (2004) and Jacob (2003), have indicated that schools tend to focus more attention on the subjects in which performance is measured, particularly when high stakes are attached to results. Therefore, the inclusion of science in an assessment system (and possibly in an accountability system) could lead to a relative increase in the instructional time and staffing devoted to science. This response may be particularly great in schools serving underserved populations, because these schools are the most likely to focus their attention on the high-stakes subjects.
Highlighting Existing Equity and Adequacy Issues
A widespread finding of low science proficiency could indicate that students in a state have not received an adequate level of instruction in science. Such a revelation could have important implications for school finance, as 21 state constitutions have explicit language requiring that states provide for “adequate” levels of school funding. Adequacy standards define a target level of achievement in core subject areas that schools are expected to reach. The minimum adequate level of spending necessary to enable students to attain target achievement levels dictates the basic level of foundation-grant state aid programs, which are by far the most common form of state aid in the United States. A finding of widespread low performance on science assessments could raise constitutional concerns, because definitions of adequate school funding have historically been based on reading and mathematics performance levels, not science ones. Therefore, disappointing science assessment results could indicate that a higher level of school spending is necessary to ensure that the resources allotted for science education are adequate.
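The link between an adequacy target and foundation-grant aid can be made concrete with a simplified sketch. It assumes a textbook form of the foundation formula, in which the state makes up the difference between a per-pupil foundation level and what a district raises at a required local tax rate. Actual state formulas vary widely, and the function name, dollar amounts, and millage rate below are hypothetical.

```python
def foundation_aid_per_pupil(foundation_level, required_mills, property_value_per_pupil):
    """State aid per pupil under a simplified, assumed foundation formula.

    The district is assumed to raise `required_mills` dollars per $1,000 of
    taxable property; the state fills any remaining gap up to `foundation_level`.
    """
    local_share = required_mills * property_value_per_pupil / 1000
    return max(0.0, foundation_level - local_share)


# Hypothetical districts: taxable property value per pupil, in dollars.
districts = {"low_wealth": 300_000, "high_wealth": 1_000_000}

# The second foundation level stands in for a hypothetical increase adopted
# after science results suggest that the earlier level was not adequate.
for level in (8_000, 8_500):
    for name, value in districts.items():
        aid = foundation_aid_per_pupil(level, 10, value)
        print(f"foundation ${level:,}: {name} receives ${aid:,.0f} per pupil")
```

Under these assumptions, raising the foundation level increases state aid dollar for dollar in the low-wealth district, while the high-wealth district, which already raises more than the foundation level locally, receives no aid in either case.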
Whether such a finding would necessarily raise constitutional concerns is a matter for debate, however. While some interpretations of state adequacy provisions are based on the realized levels of student test performance, a more common interpretation is that schools receive adequate funding if students have adequate opportunity to learn, regardless of students’ actual performance levels. Depending on a state’s interpretation of its adequacy provisions, an argument that schools could reach proficiency targets in science with existing resources may be supportable.
A related point is equity. A finding of a wide disparity across schools in the rates at which students meet proficiency standards could raise concerns about whether the distribution of school resources across school districts in a state is equitable. The constitutions of 20 states support the notion that school finance in a state must be equitable across school districts. There are two conceptions of equity in school finance: equity for school-age children and equity for taxpayers. In this context, taxpayer equity means that any given tax rate would relate to the same level of per-pupil spending independent of the taxpayer’s residential location. The introduction of science testing in a school assessment or accountability system would raise taxpayer equity concerns only if this definition were expanded to imply that any given tax rate would relate to the same level of education services, independent of the taxpayer’s residential location. If the assessment system helped reveal that the level of science instruction differs across school districts in a state, a state may need to contend with new equity considerations, even if it considers only taxpayer equity.
Children’s equity, by contrast, encompasses such questions as whether different groups of children receive similar levels of services. Few interpret state constitutions as mandating equality of outcomes; rather, most interpret equity provisions as requiring equality of opportunity. An equity interpretation of equality of outcomes would suggest that the consequences of testing students in a new subject area are potentially large. However, even the more common conception of equality of opportunity could raise equity concerns if it were found that it is more expensive to produce a certain level of student science achievement in some settings than in others.
In both the equity and adequacy cases, it is not obvious that new science assessments would lead to recommendations for increased science funding. After all, schools’ observed science staffing and materials reflect both financial realities and choices made by schools. However, decisions regarding school funding adequacy and equity have been based largely on reading and mathematics performance. The new information provided by science assessments could change the calculus of school finance in many states. Schools that were already severely constrained fiscally, even without placing a heavy emphasis on science instruction, could become considerably more constrained if they were compelled to shift resources toward science education.
Exacerbating Existing Inequities in School Finance
Beyond simply highlighting the present level of inequities in school finance, new science assessments—especially if incorporated into an accountability system—could exacerbate these inequities through effects on the labor market for science teachers. Clotfelter, Ladd, and Vigdor (2004) and Figlio and Rueben (2001) argue that teachers are responsive to test-based or fiscal accountability systems; also, schools serving lower income or minority students are most likely to have a difficult time retaining high-quality teachers. Therefore, schools with underserved populations—already affected by financing inequities—are likely to be further affected by a disproportionate flight of high-quality teachers.
The rationale behind this argument stems from economic theory. The increased attention paid to science education is likely to increase the demand for qualified science teachers—a tendency reinforced by the “highly qualified teacher” provision of NCLB. At the same time, research has demonstrated that increased accountability pressures decrease the number of teachers—in this case, science teachers—willing to work at any given salary. The result of these two forces is that the market-clearing salary for a science teacher at the current level of quality would necessarily increase as a result of increased accountability pressures. In the absence of salary increases, the consequence of these changes in market forces would be an average lowering of science teacher quality.
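The supply-and-demand reasoning above can be sketched with a small worked example. It assumes linear demand and supply curves for qualified science teachers, an assumption made here purely for illustration; the parameter values are invented and salaries are expressed in thousands of dollars. Demand shifts outward and supply shifts inward, as the paragraph describes.

```python
def market_clearing_salary(demand_intercept, demand_slope, supply_intercept, supply_slope):
    """Salary (in $1,000s) at which teachers demanded equals teachers supplied.

    Assumed linear curves:
      demand: Qd = demand_intercept - demand_slope * salary
      supply: Qs = supply_intercept + supply_slope * salary
    """
    return (demand_intercept - supply_intercept) / (demand_slope + supply_slope)


def shortage_at(salary, demand_intercept, demand_slope, supply_intercept, supply_slope):
    """Qualified teachers demanded minus qualified teachers supplied at a salary."""
    demanded = demand_intercept - demand_slope * salary
    supplied = supply_intercept + supply_slope * salary
    return demanded - supplied


# Hypothetical parameters before and after science assessments are introduced:
# demand shifts out (higher intercept), supply shifts in (lower intercept).
before = dict(demand_intercept=1000, demand_slope=10, supply_intercept=200, supply_slope=10)
after = dict(demand_intercept=1100, demand_slope=10, supply_intercept=100, supply_slope=10)

old_salary = market_clearing_salary(**before)
new_salary = market_clearing_salary(**after)
print(f"market-clearing salary before: {old_salary}, after: {new_salary}")
print(f"unfilled positions if salary stays at {old_salary}: {shortage_at(old_salary, **after)}")
```

With these invented numbers, the market-clearing salary rises after both shifts. If salaries stay at the old level, the gap between positions demanded and qualified teachers supplied would have to be filled by less qualified teachers, which corresponds to the lowering of average quality described above.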
The burden of this reduced average level of teacher quality is likely to be borne primarily, if not exclusively, by schools and districts serving minority and economically disadvantaged students. Teachers in schools serving more advantaged populations could be expected to face lower accountability pressure, and these teachers would be less likely to leave science teaching as a result of new science assessments. Hence, the predicted outflow of quality science teachers should be lower in more advantaged schools and districts. Moreover, if these schools are more likely to hire the highly qualified science teachers leaving the less advantaged schools (Clotfelter, Ladd, and Vigdor, 2002), they are also more likely to have provided significant science education to their students prior to the assessment system. Advantaged schools and districts would therefore be expected to experience a smaller increase in the demand for improvement in science education as a result of new assessments. These schools also could be predicted to face a small average reduction in teacher quality at any given salary level. Schools serving more disadvantaged populations, in contrast, are expected to experience larger outflows of qualified science teachers at the same time that their demand for these same teachers is increasing; these schools therefore would sustain larger reductions in average science teacher quality at any given salary level.
The implication of these findings is that teacher salaries are likely to need to increase under heightened accountability conditions in order to maintain—let alone increase—the level of science teacher quality in a state. Moreover, one could expect schools serving minority and low-income populations to need the greatest increase, a situation that would probably exacerbate existing inequities. The increased costs of providing an adequate level of science education, coupled with the likelihood that these increased costs will be borne unequally by schools and districts, suggest that the introduction of science assessments—particularly if high stakes are attached—will raise new equity and adequacy issues in education finance.
The testing provision of NCLB surely will lead to a keener awareness of the state of science education in public schools. Moreover, an increased focus on science assessment is very likely to highlight new school finance issues.
In advance of the implementation of science assessment, states should consider the likely school finance implications—in terms of equity and adequacy—and begin to plan for them. It is not a criticism of NCLB or of science assessment to argue that these assessments are likely to have large school finance implications. Rather, it is important that the school finance system be sufficiently flexible so that states can respond rapidly to new school finance-related issues that are uncovered through the assessment. These ramifications may involve increasing the state’s contribution to local education budgets, or they may involve adjusting state aid formulas. With advance warning, states will be better able to cope with these eventualities successfully.
This committee advocates that state science assessments be closely aligned with a set of rigorous, well-defined, and high-quality standards that stress scientific inquiry. The more fully an assessment captures these standards, however, the more likely it is to expose existing inadequacies or inequities in the current school finance system. States should be aware that the more closely their science assessments are aligned with their standards, the more pronounced the potential implications for school finance may be.
The market for high-quality science teachers may change as a result of the introduction of science assessments, and states should be prepared to help increase the incentives for high-quality teachers to remain in the profession and in their schools, following the assessment’s introduction. States have many policy options at their disposal for helping to ensure that all students have access to high-quality science teachers. Possible options include targeted bonuses for qualified science teachers to teach or remain in schools serving underserved student populations.
QUESTIONS FOR STATES
Opportunity to Learn
Question 7-1: Is the state’s science assessment system constructed to provide information on students’ opportunity to learn what is needed to meet the state’s goals for science learning? Does the state continually monitor and periodically evaluate its education system to ensure that sufficient opportunity to learn is being maintained for all students?
Including All Students
Question 7-2: Are all components of the state assessment system designed to make them accessible to the widest range of students and to support valid interpretations about their performance? Does the development process for each component include consideration of ways to minimize challenges unrelated to the construct being measured?
Question 7-3: Does the state’s science assessment system include alternate assessments that can be used to assess the science achievement of students with significant cognitive disabilities?
Question 7-4: Has the state set aside resources for making improvements in its science education system to remedy the inequities or inadequacies that may be revealed by assessment and evaluation data? Has it also set aside resources to promulgate exemplary practices that may be revealed by assessment results?
Question 7-5: Does the state monitor the assessment system’s effect on the recruitment and retention of high-quality teachers?