Higher education is a linchpin of the American economy and society: Teaching and research at colleges and universities contribute significantly to the nation’s economic activity, both directly and through their impact on future growth; federal and state governments support teaching and research with billions of taxpayers’ dollars; and individuals, communities, and the nation gain from the learning and innovation that occur in higher education.
Effective use of resources is (and should be) a serious concern in the delivery of higher education, as it is for other sectors of the economy. In the current environment of increasing tuition and shrinking public funds, a sense of urgency has emerged to better track the performance of colleges and universities in the hope that their costs can be contained while not compromising quality or accessibility. Metrics ranging from graduation rates to costs per student have been developed to serve this purpose. However, the capacity to assess the performance of higher education institutions and systems remains incomplete, largely because the inputs and outputs in the production process are difficult to define and quantify. For higher education, productivity improvement—increasing the number of graduates, amount of learning, and innovation relative to the inputs used—is seen as the most promising strategy in the effort to keep a high-quality college education as affordable as possible.
It was within this context that this panel was charged to identify an analytically well-defined concept of productivity for higher education and to recommend practical guidelines for its measurement. The objective is to construct valid productivity measures to supplement the body of information used to (1) guide
resource allocation decisions at the system, state, and national levels and to assist policy makers who must assess investments in higher education against other compelling demands on scarce resources; (2) provide administrators with better tools for improving their institutions’ performance; and (3) inform individual consumers and communities to whom colleges and universities are ultimately accountable for private and public investments in higher education. It should be noted that the experimental measure developed in this report does not directly advance all of these objectives—particularly that of measuring individual institutions’ performance—but the overall report pushes the discussion forward and offers first steps.
While the panel is in no way attempting to design an accountability system, it is important to think about the incentives that measures create. Since institutional behavior is dynamic and directly related to the incentives embedded within measurement systems, steps have to be taken to (1) ensure that the incentives in the measurement system genuinely support the behaviors that society wants from higher education institutions, and (2) maximize the likelihood that measured performance is the result of authentic success rather than manipulative behaviors. Clearly, a single high-stakes measure is a flawed approach in that it makes gaming the system simpler; a range of measures will almost always be preferable for weighing overall performance. While not diminishing the weight of these cautions, it should be noted that monitoring productivity trends would not be adding incentives to a world without them. Among the major incentives now in play are to enroll students, win research grants, improve in national rankings, raise money, and win athletic competitions. The panel believes that adding another incentive (and one more worthy than a number of these) will help round out the current set in a positive way.
Improving and implementing productivity metrics begins with recognition of their role in the broader performance assessment picture:
- Productivity should be a central part of the higher education conversation.
- Conversations about the sector’s performance will lack coherence in the absence of a well-vetted and agreed-upon set of metrics, among which productivity is essential.
- Quality should always be a core part of productivity conversations, even when it cannot be fully captured by the metrics.
- The inevitable presence of difficult-to-quantify elements in a measure should not be used as an excuse to ignore those elements.
The first step is to define key terms by applying the standard economic concept of productivity to higher education. In the model developed in this report, the baseline productivity measure for the instructional component of higher education is estimated as the ratio of (a) changes in the quantity of output, expressed to capture both degrees (or other markers of successful completion) and passed credit hours, to (b) changes in the quantity of inputs, expressed to capture both labor and nonlabor factors of production. The assumption embedded in the numerator, consistent with the economics literature on human capital (e.g., Bailey et al., 2004; Barro and Lee, 2010a), is that education adds to a student’s knowledge and skill base, even if it does not result in a degree. Key to the denominator is the heterogeneity of labor and other inputs used in the production of education—and the need to account for it.
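The baseline ratio just described can be sketched numerically. The following is a minimal illustration only: the weights (the credit-hour equivalent assigned to a degree, the deflator applied to nonlabor spending) and all figures are hypothetical assumptions, not values proposed by this report.

```python
# Illustrative sketch of the baseline instructional productivity ratio.
# All weights and figures are hypothetical, not values from the report.

def output_quantity(credit_hours: float, degrees: float,
                    sheepskin_weight: float = 30.0) -> float:
    """Aggregate output: passed credit hours plus a degree bonus.

    sheepskin_weight expresses each completed degree as an assumed
    number of credit-hour equivalents (a hypothetical value).
    """
    return credit_hours + sheepskin_weight * degrees

def input_quantity(labor_fte: float, nonlabor_spend: float,
                   nonlabor_weight: float = 0.00001) -> float:
    """Aggregate input: labor FTE plus deflated nonlabor spending,
    combined with a hypothetical weight."""
    return labor_fte + nonlabor_weight * nonlabor_spend

def productivity_change(out_t0, out_t1, in_t0, in_t1):
    """Ratio of output growth to input growth between two periods."""
    return (out_t1 / out_t0) / (in_t1 / in_t0)

# Example: output grows 10% while inputs grow 5%,
# so measured productivity rises by roughly 4.8%.
o0 = output_quantity(credit_hours=100_000, degrees=2_000)
o1 = output_quantity(credit_hours=110_000, degrees=2_200)
i0 = input_quantity(labor_fte=1_000, nonlabor_spend=50_000_000)
i1 = input_quantity(labor_fte=1_050, nonlabor_spend=52_500_000)
print(round(productivity_change(o0, o1, i0, i1), 3))  # → 1.048
```

Note that the degree bonus in the numerator is what the report later calls the sheepskin effect: a completion adds value beyond the credit hours that produced it.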
The proposed approach should be viewed as a starting point; additional research will be essential for addressing a number of thorny issues that impede full and accurate productivity measurement and, in turn, its value for guiding policy. However, it is not premature to introduce a statistical construct to serve as a foundation for work on the topic. Indeed, having such a construct will guide data collection and research upon which the measures must be based.
A number of complexities characterize higher education production processes. These reflect the presence of (1) joint production—colleges and universities generate a number of outputs (such as educated and credentialed citizens, research findings, athletic events, hospital services), and the labor and other inputs involved cannot always be neatly allocated to them; (2) high variability in the quality and characteristics of inputs, such as teachers and students, and outputs, such as degrees; and (3) outputs (and inputs) of the production process that are nonmarket in nature. As is the case with other sectors of the economy, particularly services, productivity measurement for higher education is very much a work in progress in terms of its capacity to handle these complexities. Because no single metric can incorporate everything that is important, decision makers must appeal to a range of statistics or indicators when assessing policy options—but surely a well-conceived productivity measure is one of these.
Reflecting policy information needs as well as feasibility-of-measurement constraints, this study focuses on the instructional mission. By not directly accounting for other contributions of higher education to society—perhaps most notably research—the baseline model developed in this report omits a central mission of a large subset of institutions. Commentators such as Jonathan Cole have argued that research capacity is the primary factor distinguishing U.S. universities from those in the rest of the world, and that the country’s future depends strongly on the continued nurturing of its research-intensive universities (Cole, 2010). Indeed, this is why the federal government and state governments have invested and continue to invest billions of dollars in university-based research.
The decision to limit the report’s focus to measurement of instructional productivity is not intended as a comment on the relative importance of teaching, research, and public service for institutions with multiple missions. However, the focus on instruction does come with the analytical consequence that the resulting productivity measure can provide only a partial assessment of the sector’s aggregate contributions to national and regional objectives. For this reason, just as the performance and progress of the instructional capabilities of institutions must be monitored, measures should also be developed for assessing the value of the nation’s investments in research. Even for a purely instruction-based measurement objective, an improved understanding of faculty resource allocation to research is essential because time use is not fully separable, and because research intensities may affect the quality of teaching.
Historically, institution or system performance has been assessed using unidimensional measures such as graduation rates, time to degree, and costs per credit. When attention is overwhelmingly focused on completions or costs, the risk is raised that stated goals will be pursued at the expense of quality. For this reason, input and output quantity measures should ideally be adjusted to reflect quality differences; that is, productivity should be defined as the ratio of quality-adjusted outputs to quality-adjusted inputs. However, such measurement is extremely difficult, which means that developing data and methods for doing so is a very long-term project. In the meantime, while accounting is incomplete, it is essential to monitor when apparent increases in measurable output arise as a result of quality reduction. For the foreseeable future, this will have to be done through parallel tracking of additional information generated independently by universities and third-party quality assurance methods. And, until adjustments can be made to productivity metrics to account for quality differences, it will be inappropriate to rely exclusively on them when making funding and resource reallocation decisions. To do so would risk incentivizing a “race to the bottom” in terms of quality.
In some ways, the situation has not changed significantly in 100 years. A 1910 Carnegie Foundation report attempted to develop a time-use accounting formula to estimate the costs and outputs of higher education in order to “measure the efficiency and productivity of educational institutions in a manner similar to that of industrial factories.” The authors of that volume struggled with measuring quality and, while forced to confine their observation largely to quantity, did strive “to make quality a background for everything that may appear to have only a quantitative value” (cited in Barrow, 1990:67). A century later, we agree: measuring quality is difficult, which explains why adequate productivity measures, as well as the data and methodologies on which they rest, have yet to be constructed for the sector.
Because the productivity measure developed in this report expresses outputs in terms of quantities of credits and degrees, it does not explicitly take account of changing quality of outputs or inputs. An effect will be captured to the extent that higher quality inputs lead to higher graduation rates, but this effect is indirect. For example, if small classes or better teaching (inputs of different quality) lead to higher graduation rates, this will figure in the output total (the numerator) as a greater sheepskin effect—that is, an added value assigned for degree completion. Similarly, high student and teacher quality at selective private institutions may offset high input costs by creating an environment conducive to high throughput (graduation) rates.
This modest step notwithstanding, continued research to improve measurement of the quality dimension of higher education is essential. For output quality, researchers should aim to identify and quantify student learning outcomes, readiness for subsequent coursework and employment, degree- and credit-related income effects, and the social value of education. Similarly, adjustments should be estimated to reflect the quality of inputs, most notably the mix of students (along such dimensions as preparedness and socioeconomic background) and the effectiveness of faculty instruction. A conventional approach, widely applied in the empirical literature, is to use SAT scores and other indicators of student quality (e.g., high school rank, ethnicity, socioeconomic variables such as educational status and income of parents) to statistically impose the needed adjustments. For this reason, much could be learned from more complete school-wide censuses capturing demographic and preparedness measures for incoming students. In the spirit of monitoring quality (in this case, of the student input) in parallel with the proposed productivity statistic, student distributions could be reported at the quartile or quintile level so as not to make reporting excessively costly.
Further complicating accurate valuation of higher education is that some of the benefits of schooling are nonpecuniary and nonmarket in nature—they are not bought and sold and do not have prices. Additionally, externalities arise, in that not all of the benefits of an educated citizenry accrue to those paying for or receiving education. Nonetheless, policy makers should be concerned with social value, not just the private or market value of the outcomes generated by higher education. For this reason, valuing degrees solely by salaries that graduates earn is misleading. Investment in citizens’ careers is not the only objective, from a societal perspective, of supporting and participating in higher education. The nonpecuniary and public goods aspects of higher education output, such as those linked to research, are also important; even the consumption component of college, including student enjoyment of the experience, is quite clearly significant.
The measurement complications identified above can be dampened by recognizing the diversity of missions across the range of colleges and universities and then segmenting institutions into more homogeneous categories along the lines of the Carnegie Classification system, or perhaps using even more detail. For many purposes, it is unwise to compare performance measures across institutions that have different missions. The first implication of this principle is that productivity measures must be designed to register outcomes that can be taken as equivalent to a degree or a fractional portion of a degree-equivalent. This may be especially important for community colleges, where outcomes include successful transfer to four-year institutions, completion of certificates, or attainment of specific skills by students who have no intention of pursuing a degree.
Additionally, for purposes of making comparisons across institutions, states, or nations, it is essential to take into account incoming student ability and preparation. Highly selective institutions typically have higher completion rates than open-access institutions. This may reflect more on the prior learning, preparation, and motivation of the entrants than on the productivity of the institution they enter. Therefore, in the context of resource allocation or other high-stakes decisions, the portion of measured success attributable to input quality should ideally be taken into consideration in performance assessments.
Because heterogeneity leads to measurement complications even within institutional categories, it is also important to account for differences in factors such as the mix of degrees and majors. Institution-level cost data indicate that the resources required to produce an undergraduate degree vary, sometimes significantly, by major. Variation in degree cost is linked to, among other things, systematic differences in the amount of time needed to complete a degree. Uninformed comparisons will result in some institutions appearing less efficient in terms of degree production (i.e., exhibiting longer time values), yet they may be functioning reasonably well, given their mission and student characteristics. Therefore, productivity models should include an adjustment for field of study that reflects, among other things, different course requirements, pass rates, and labor input costs.
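A field-of-study adjustment of the kind just described can be sketched as a set of per-field weights applied to degree counts before aggregation. The weights below are invented for illustration; an operational measure would derive them from observed course requirements, pass rates, and labor input costs.

```python
# Hypothetical sketch of a field-of-study adjustment: weight each
# degree by a field-specific cost factor before aggregating output.
# All weights are invented for illustration only.

FIELD_WEIGHTS = {        # relative resource intensity (hypothetical)
    "engineering": 1.4,
    "nursing": 1.3,
    "business": 0.9,
    "english": 0.8,
}

def adjusted_degree_output(degrees_by_field: dict) -> float:
    """Sum of degrees, each weighted by its field's cost factor
    (unlisted fields default to 1.0)."""
    return sum(FIELD_WEIGHTS.get(field, 1.0) * n
               for field, n in degrees_by_field.items())

mix_a = {"engineering": 500, "english": 500}   # costlier degree mix
mix_b = {"business": 500, "english": 500}      # cheaper degree mix
print(adjusted_degree_output(mix_a))  # 500*1.4 + 500*0.8 = 1100.0
print(adjusted_degree_output(mix_b))  # 500*0.9 + 500*0.8 = 850.0
```

Without such weighting, the institution producing mix_a would look less efficient per raw degree even if it uses its resources as well as the institution producing mix_b.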
It is possible, and perhaps even likely, that critics of this report will reject the idea of measuring instructional productivity because of the complications noted above and throughout this report. Our view is that this would be a mistake. Failure to implement a credible measure may indefinitely defer the benefits achievable from a better understanding of quantitative productivity, even in the absence of a viable method of quality adjustment. We emphasize again the essential idea that effective and transparent quality assurance systems should be maintained to supplement productivity and other performance measures. This will allow progress to be made in measuring the quantitative aspects of productivity while containing the risk of triggering institutional competition that results in lowering educational quality. Progress on the development of quantitative productivity measures may also raise the priority of developing a serviceable quality adjustment index.
While progress can be made to develop and implement productivity measures using existing information, full implementation of the recommendations in this report will require new or improved data capabilities as well. One significant change required for enhancement of the baseline model involves standardizing the capacity to link credit hours to degree or field. To move in this direction, institutions should collect credit-hour data in a way that follows students, and not only the departments that teach them. Indeed, the necessary information already exists in many institutions’ student registration files. To fully exploit the potential from this kind of information, the Integrated Postsecondary Education Data System (IPEDS) produced by the National Center for Education Statistics could report these data along with the numbers of degrees awarded.
Detailed productivity measurement will require other kinds of information as well, such as comprehensive longitudinal student databases (to better calculate graduation rates and estimate the cost and value of degrees) and more accessible administrative sources. The potential of administrative data sources—maintained at various levels, ranging from institutions’ accounting and management systems to those of the federal statistical agencies—depends heavily on the ability of researchers and policy analysts to link records across state boundaries and across elementary, secondary, postsecondary, and workforce boundaries (Prescott and Ewell, 2009). Standardization and coordinated linkage of states’ student record databases should be a priority. Another example of useful administrative data is the unemployment insurance records kept by all states. As with individual states’ unit record data resources for postsecondary education, it is now often difficult to assemble multi-state or national datasets, which hampers efforts to track cohorts of graduates (or nongraduates) across state lines. The Bureau of Labor Statistics should continue to do what it can to facilitate multi-state links of unemployment insurance wage records and education data, which would create new opportunities for research on issues such as return on investment from postsecondary training or placement rates in various occupations.