and Levitt, 2003). These types of behaviors may be the reason that the recent National Research Council (2011) panel on school accountability expressed a skeptical view about accountability while recognizing the positive gains associated with these policies.

One potential solution emerging from the K-12 literature is that “value-added” measures of outcomes tend to be less manipulable than measures based on average levels of performance or proficiency counts. The rationale is that when schools are evaluated on their year-to-year gains, any behaviors generating artificial improvements would need to be accelerated for the school to continue showing gains the next year. In higher education, however, this year’s post-test is not next year’s pre-test, so there remains the very real possibility that institutions could manipulate their outcomes (or their inputs) in order to look better under the accountability system. Value-added measures might allow for more apples-to-apples comparisons among institutions, but they will not reduce the strategic behavior problem by as much as they might in K-12 education.

One example of how higher education institutions respond strategically to the incentives embedded within an evaluation system can be seen in the U.S. News & World Report rankings. Grewal, Dearden, and Lilien (2008) document ways in which universities strategically deploy resources in an attempt to maximize their rankings. Avery, Fairbanks, and Zeckhauser (2003) and Ehrenberg and Monks (1999) find that the ranking system distorts university admissions and financial aid decisions.

If institutions make failure more difficult by implementing support systems that help struggling students improve, this is a desired outcome of the accountability system. If instead they dilute the curriculum, or select students who are likely to improve the institution’s ranking, this could be a counterproductive consequence of the system. The more background characteristics are used to predict graduation rates, the harder such manipulation becomes; however, only a small number of background factors are currently available on a large scale.

To sum up, many proxy measures of productivity have been constructed over the years. They have some utility in comparing institutions and programs, if used cautiously and with knowledge of their drawbacks. But experience has shown that they can result in major misunderstandings and the creation of perverse incentives if applied indiscriminately. As with productivity measurement itself, these proxies are significantly affected by context. Among the most important contextual variables that must be controlled for are institutional selectivity, program mix, size, and student demographics. The model outlined in Chapter 4 suggests approaches for dealing with some of the shortcomings of traditionally used performance measures. Part-time students are treated as partial FTEs; semester of entry does not create distortions; and successful transfers are accounted for by awarding bonus points analogous to the sheepskin effect for bachelor’s degrees.
