Summary
In recent years, there have been increasing efforts by the federal
government and the states to devise systems that make students, teachers,
principals, or whole school systems accountable for how much
students learn. Large-scale tests are usually a key component of such sys-
tems. The No Child Left Behind (NCLB) Act of 2001 and the widespread
use of high school exit exams in many states are two examples of a trend
that has been going on for several decades.
The Committee on Incentives and Test-Based Accountability in Public
Education was established by the National Research Council to review
and synthesize research about how incentives affect behavior and to
consider the implications of that research for educational accountability
systems that attach incentives to test results. The committee focused on
research about incentives in which an explicit consequence is attached
to a measure of performance, starting first with basic research from the
social and behavioral sciences and then turning to applied research in
education.
BASIC RESEARCH ABOUT INCENTIVES
In reviewing basic research from the behavioral and social sciences
about how incentives operate, the committee focused on theoretical
research from economics and experimental research from psychology.
Together, these two literatures show the way that subtle differences in
the structure of incentives can be crucial in determining their effect. The
research review points to five key choices that should be considered in
designing incentive systems:
1. Who is targeted by the incentives: In complex organizations, incen-
tives can be designed for people in different positions who can
affect outcomes in different ways.
2. What performance measures are used: The performance measures to
which incentives are attached must be aligned with the desired
outcomes for the incentives to have their desired effect.
3. What consequences are used: The size and structure of the conse-
quences provided by the incentives will affect how the incentives
operate and should be designed to be appropriate to the situation.
4. What support is provided: Without resources in support of orga-
nizational objectives, incentives can be discouraging to the very
people they are intended to help, particularly if those people lack
the capacity to reach the target that provides a reward or avoids
a sanction.
5. How incentives are framed and communicated: To be effective, incen-
tives need to be framed and communicated in ways that reinforce
people’s commitment to the goal that incentives have been put in
place to achieve, rather than in ways that erode that commitment.
The committee’s research review also identified three issues related
to evaluating the success of incentive systems:
1. Nonincentivized performance measures for evaluation: Incentives will
often lead people to find ways to increase measured performance
that do not also improve the desired outcomes. As a result, perfor-
mance measures that are not used in the incentive system should be
used when evaluating how the incentives are working.
2. Changes in dispositions: In addition to evaluating the changes in a
set of defined objective outcomes, it is important to consider the
way incentive systems affect people’s dispositions to act when
they are not being directly affected by the incentives.
3. Weighing costs and benefits: Incentive systems will typically gener-
ate a mix of costs and benefits that have to be weighed against
each other to determine the net value of the system.
TESTS AS PERFORMANCE MEASURES
The tests that are typically used to measure performance in educa-
tion fall short of providing a complete measure of desired educational
outcomes in many ways. This is important because the use of incentives
for performance on tests is likely to reduce emphasis on the outcomes that
are not measured by the test.
The academic tests used with test-based incentives obviously do not
directly measure performance in untested subjects and grade levels or
development of such characteristics as curiosity and persistence. How-
ever, those tests also fall short in measuring performance in the tested
subjects and grades in important ways. Some aspects of performance in
many tested subjects are difficult or even impossible to assess with current
tests. And even for aspects of performance that can be tested, practical
constraints on the length and cost of testing make it necessary to limit the
content and types of questions. As a result, tests can measure only a subset
of the content of a tested subject.
When incentives encourage teachers to focus narrowly on the mate-
rial included on a particular test, scores on the tested portion of the con-
tent standards may increase while understanding of the untested portion
of the content standards may stay the same or decrease. To the extent
feasible, it is important to broaden the range of material included on tests
to better reflect the full range of what students are expected to know and
be able to do. And it is important to remember that the scores on the tests
used with incentives may give an inflated picture of learning with respect
to the full range of the content standards.
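
The score inflation described above can be illustrated with a small arithmetic sketch. All numbers here are hypothetical and are not drawn from the committee's review: the sketch assumes the content standards split evenly into a tested and an untested portion, and that mastery of each portion can be expressed as a fraction between 0 and 1.

```python
def full_domain_mastery(tested, untested, tested_share):
    """Weighted average of mastery across the tested and untested portions."""
    return tested_share * tested + (1.0 - tested_share) * untested


# Hypothetical assumption: half of the content standards appear on the test.
TESTED_SHARE = 0.5

# Hypothetical mastery levels before and after instruction narrows to the test.
before = {"tested": 0.60, "untested": 0.60}
after = {"tested": 0.75, "untested": 0.55}

# The high-stakes test sees only the tested portion.
test_score_gain = after["tested"] - before["tested"]

# Learning across the full content standards combines both portions.
full_domain_gain = (
    full_domain_mastery(after["tested"], after["untested"], TESTED_SHARE)
    - full_domain_mastery(before["tested"], before["untested"], TESTED_SHARE)
)

# Inflation: the part of the test score gain not matched by full-domain learning.
inflation = test_score_gain - full_domain_gain

print(f"gain on tested portion:   {test_score_gain:.2f}")
print(f"gain across full domain:  {full_domain_gain:.2f}")
print(f"inflation in test scores: {inflation:.2f}")
```

Under these assumed numbers the gain seen on the test is larger than the gain across the full standards, which is the sense in which test scores may overstate learning.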
Incentives for educators are rarely attached directly to individual test
scores; rather, they are usually attached to an indicator that combines and
summarizes those scores in some way. Attaching consequences to differ-
ent indicators created from the same test scores can produce dramatically
different incentives. For example, an indicator constructed from average
test scores or average test score gains will be sensitive to changes at all
levels of achievement. In contrast, an indicator constructed from the per-
centage of students who meet a performance standard will be affected
only by changes in the achievement of the students near the cut score
defining the performance standard.
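
A minimal sketch of this contrast follows. The scores, the cut score of 60, and both scenarios are hypothetical and chosen only for illustration; the two functions are simple stand-ins for the average-score and percent-meeting-standard indicators described above.

```python
from statistics import mean


def mean_score(scores):
    """Indicator 1: the average test score."""
    return mean(scores)


def percent_proficient(scores, cut_score):
    """Indicator 2: percentage of students at or above the cut score."""
    return 100.0 * sum(s >= cut_score for s in scores) / len(scores)


CUT_SCORE = 60  # hypothetical performance standard

baseline = [30, 45, 58, 59, 62, 75, 90]

# Scenario A: only the two students just below the cut score gain 3 points.
near_cut_gains = [30, 45, 61, 62, 62, 75, 90]

# Scenario B: the lowest and highest scorers gain 10 points each,
# but no student crosses the cut score.
far_from_cut_gains = [40, 45, 58, 59, 62, 75, 100]

for label, scores in [
    ("baseline", baseline),
    ("gains near cut score", near_cut_gains),
    ("gains far from cut score", far_from_cut_gains),
]:
    print(
        f"{label:26s} mean={mean_score(scores):6.2f}  "
        f"proficient={percent_proficient(scores, CUT_SCORE):6.2f}%"
    )
```

In this made-up data the average rises in both scenarios, while the percent-proficient indicator moves only when students near the cut score improve, so the same test scores can reward quite different behavior depending on the indicator chosen.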
Given the broad outcomes that are the goals for education, the neces-
sarily limited coverage of tests, and the ways that indicators constructed
from tests focus on particular types of information, it is prudent to
consider designing an incentive system that uses multiple performance
measures. Incentive systems in other sectors have evolved toward using
increasing numbers of performance measures on the basis of their
experience with the limitations of particular performance measures. Over time,
organizations look for a set of performance measures that better covers
the full range of desired outcomes and also monitors behavior that would
merely inflate the measures without improving outcomes.
INCENTIVES AND TEST-BASED ACCOUNTABILITY IN EDUCATION
INCENTIVE PROGRAMS REVIEWED
The committee’s literature review focused on studies that allowed us
to draw causal conclusions about the overall effects of test-based incentive
programs. We looked specifically for information about outcomes other
than the high-stakes tests that have incentives attached in order to avoid
having our conclusions biased by the test score inflation that the incen-
tives may have caused. We also attempted to contrast different incentive
programs according to the key features identified by the basic research
in economic theory (the first four features noted above): who is targeted
by the incentives, what performance measures are used, what conse-
quences are used, and what support is provided. The existing literature
did not allow us to contrast incentive programs according to the way they
frame and communicate incentives, the key feature identified by the basic
research in psychology (the fifth feature noted above).
We focused on 15 test-based incentive programs, including the large-
scale policies of NCLB, its predecessors, and state high school exit exams,
as well as a number of experiments and programs carried out in both the
United States and other countries. These various programs involved a
number of different incentive designs and substantial numbers of schools,
teachers, and students.
CONCLUSIONS
Conclusion 1: Test-based incentive programs, as designed and
implemented in the programs that have been carefully studied,
have not increased student achievement enough to bring the
United States close to the levels of the highest achieving coun-
tries. When evaluated using relevant low-stakes tests, which
are less likely to be inflated by the incentives themselves, the
overall effects on achievement tend to be small and are effec-
tively zero for a number of programs. Even when evaluated
using the tests attached to the incentives, a number of programs
show only small effects. Programs in foreign countries that
show larger effects are not clearly applicable in the U.S. context.
School-level incentives like those of the No Child Left Behind
Act produce some of the larger estimates of achievement effects,
with effect sizes around 0.08 standard deviations, but the mea-
sured effects to date tend to be concentrated in elementary
grade mathematics and the effects are small compared to the
improvements the nation hopes to achieve.
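
To give a rough sense of scale for an effect size of 0.08 standard deviations, the sketch below converts it to a percentile shift. It assumes test scores are normally distributed, which is an illustrative assumption rather than a claim from the committee's review, and the function name is hypothetical.

```python
from statistics import NormalDist


def percentile_after_shift(effect_size_sd, start_percentile=50.0):
    """Percentile reached by a student who starts at `start_percentile`
    and gains `effect_size_sd` standard deviations, assuming scores
    follow a normal distribution (an illustrative assumption)."""
    standard_normal = NormalDist()
    start_z = standard_normal.inv_cdf(start_percentile / 100.0)
    return 100.0 * standard_normal.cdf(start_z + effect_size_sd)


new_percentile = percentile_after_shift(0.08)
print(f"A median student moves to about percentile {new_percentile:.1f}")
```

Under the normality assumption, a gain of 0.08 standard deviations moves a student from the 50th percentile to roughly the 53rd.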
Conclusion 2: The evidence we have reviewed suggests that
high school exit exam programs, as currently implemented in
the United States, decrease the rate of high school graduation
without increasing achievement. The best available estimate
suggests a decrease of 2 percentage points when averaged over
the population. In contrast, several experiments with providing
incentives for graduation in the form of rewards, while keeping
graduation standards constant, suggest that such incentives
might be used to increase high school completion.
RECOMMENDATIONS FOR POLICY AND RESEARCH
The modest and variable benefits shown by test-based incentive pro-
grams to date suggest that such programs should be used with caution
and that substantial further research is required to understand how they
can be used successfully.
Recommendation 1: Despite using them for several decades,
policy makers and educators do not yet know how to use test-
based incentives to consistently generate positive effects on
achievement and to improve education. Policy makers should
support the development and evaluation of promising new
models that use test-based incentives in more sophisticated
ways as one aspect of a richer accountability and improvement
process. However, the modest success of incentive programs
to date means that all use of test-based incentives should be
carefully studied to help determine which forms of incen-
tives are successful in education and which are not. Continued
experimentation with test-based incentives should not displace
investment in the development of other aspects of the educa-
tion system that are important complements to the incentives
themselves and likely to be necessary for incentives to be effec-
tive in improving education.
Recommendation 2: Policy makers and researchers should
design and evaluate new test-based incentive programs in ways
that provide information about alternative approaches to incen-
tives and accountability. This should include exploration of the
effects of key features suggested by basic research, such as who
is targeted for incentives; what performance measures are used;
what consequences are attached to the performance measures
and how frequently they are used; what additional support
and options are provided to schools, teachers, and students in
their efforts to improve; and how incentives are framed and
communicated. Choices among the options for some or all of
these features are likely to be critical in determining which—if
any—incentive programs are successful.
Recommendation 3: Research about the effects of incentive pro-
grams should fully document the structure of each program and
should evaluate a broad range of outcomes. To avoid having
their results determined by the score inflation that occurs in the
high-stakes tests attached to the incentives, researchers should
use low-stakes tests that do not mimic the high-stakes tests to
evaluate how test-based incentives affect achievement. Other
outcomes, such as later performance in education or work and
dispositions related to education, are also important to study. To
help explain why test-based incentives sometimes produce neg-
ative effects on achievement, researchers should collect data on
changes in educational practice by the people who are affected
by the incentives.