indicator.²⁶ It differs from a true value-added indicator in that it fails to control for differences across schools and over time in average student and family characteristics. In general, it makes sense to compute a gain indicator only when the tests administered at different grade levels are scored using a common scale. The NAEP tests fulfill this criterion.

As indicated in Table 10.2, average test scores for eleventh graders exhibit the by-now familiar pattern of sharp declines from 1973 to 1982 and then partial recovery between 1982 and 1986. The eleventh-grade data, by themselves, are fully consistent with the premise that academic reforms in the early and mid-1980s generated substantial gains in student achievement. In fact, an analysis of the data based on a more appropriate indicator than average test scores suggests the opposite conclusion. The gain indicator reveals that achievement growth during the 1982–1986 period was no better than that during the 1978–1982 period; indeed, gains from grades 7 through 11 were slightly lower from 1982 to 1986 than in previous periods. The rise in eleventh-grade math scores from 1982 to 1986 stems from an earlier increase in achievement for the same cohort of students rather than from an increase in achievement from grades 7 through 11. In short, these data provide no support for the notion that high school academic reforms during the mid-1980s generated significant increases in test scores. Moreover, the analysis underscores that in practice, not just in theory, average test scores can be a highly fallible measure of school performance.
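The reasoning in this paragraph can be stated as a simple accounting identity. The notation is an assumption introduced here for illustration and does not appear in the original: $\bar{Y}_{g,t}$ denotes the average score of grade $g$ in test year $t$, and $G_t$ denotes the grade 7 to grade 11 gain for the cohort tested in grade 11 in year $t$, with test years spaced four years apart.

```latex
\begin{align*}
G_t &= \bar{Y}_{11,t} - \bar{Y}_{7,t-4}, \\
\bar{Y}_{11,t} - \bar{Y}_{11,t-4}
  &= \left(\bar{Y}_{7,t-4} - \bar{Y}_{7,t-8}\right) + \left(G_t - G_{t-4}\right).
\end{align*}
```

Under this sketch, a rise in eleventh-grade scores between two test years can come entirely from the first term (higher seventh-grade scores for the later cohort) even when the second term, the change in the gain from grades 7 through 11, is zero or negative.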

Policy Implications

Consequences of Using Flawed Indicators

The fact that level indicators measure school performance with potentially enormous error has important implications for their use in making education policy, informing students and parents about the quality of schools, and evaluating school performance as part of an accountability system. With respect to policymaking, it is clear that level indicators can provide wholly incorrect information about the success or failure of educational interventions and reforms. They could lead to the expansion of programs that do not work or to the cancellation of ones that are truly effective. Similarly, level indicators are likely to give students and parents erroneous signals about which schools to attend. Academically advantaged and disadvantaged students alike could be fooled into abandoning an excellent neighborhood school simply because the school served stu


Gain indicators were constructed by computing the change in average test scores over time for given birth cohorts. The NAEP was originally designed to permit this type of analysis. The mathematics tests have generally been given every four years at grade levels spaced four years apart, except in 1974. For the illustrative analysis here, it is assumed that average test scores in 1973 are comparable to the unknown 1974 scores.
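The construction described in this note can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's actual procedure: the function name `cohort_gain`, the `year_alias` argument, and every score below are hypothetical placeholders, not values from Table 10.2. The alias maps the missing 1974 test year to 1973, mirroring the comparability assumption above.

```python
from typing import Dict, Optional, Tuple

# (test year, grade) -> average test score on a common scale
Scores = Dict[Tuple[int, int], float]


def cohort_gain(
    scores: Scores,
    year: int,
    grade: int,
    span: int = 4,
    year_alias: Optional[Dict[int, int]] = None,
) -> float:
    """Gain for one birth cohort: the average score of `grade` in `year`
    minus the average score of the same cohort `span` years (and `span`
    grades) earlier. `year_alias` substitutes an available test year for
    a missing one (hypothetical helper, e.g. 1973 standing in for 1974)."""
    start_year = year - span
    if year_alias:
        start_year = year_alias.get(start_year, start_year)
    return scores[(year, grade)] - scores[(start_year, grade - span)]


# Placeholder numbers chosen only to illustrate the mechanics.
# They are NOT NAEP results.
placeholder_scores: Scores = {
    (1973, 7): 50.0,
    (1978, 7): 52.0, (1978, 11): 60.0,
    (1982, 7): 55.0, (1982, 11): 61.0,
    (1986, 11): 64.0,
}
alias = {1974: 1973}  # assumption: 1973 scores stand in for 1974

gain_1978 = cohort_gain(placeholder_scores, 1978, 11, year_alias=alias)
gain_1982 = cohort_gain(placeholder_scores, 1982, 11)
gain_1986 = cohort_gain(placeholder_scores, 1986, 11)

# Change in the eleventh-grade level versus change in the cohort gain
level_change = placeholder_scores[(1986, 11)] - placeholder_scores[(1982, 11)]
gain_change = gain_1986 - gain_1982
```

With these placeholder values the eleventh-grade average rises between 1982 and 1986 while the gain from grades 7 through 11 does not change, which is the kind of divergence between level and gain indicators that the text describes.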


Copyright © National Academy of Sciences. All rights reserved.