This study has two major objectives: to present an analytically well-defined concept of productivity in higher education and to recommend empirically valid and operationally practical guidelines for measuring it. In addition to its obvious policy and research value, improved measurement of productivity may generate insights that lead to enhanced departmental, institutional, or system educational processes. In pursuit of these objectives, we address a series of questions: What is productivity, and how can the concept be applied to higher education? What limitations and complexities are confronted in attempting to do so? Why is the measurement of productivity important to education policy? Who should care about measuring productivity? And how can the measurement of productivity be improved?
These questions are not new. Indeed, 2010 marked the 100th anniversary of the Carnegie Foundation Report (Cooke, 1910), which developed a time-use accounting formula to estimate the costs and outputs of higher education for both teaching and research. Essentially, the Carnegie Foundation Report sought “to measure the efficiency and productivity of educational institutions in a manner similar to that of industrial factories” (Barrow, 1990:67). One goal of this earlier effort was to create a method for measuring productivity so that higher education would be subject to and benefit from competitive market pressures akin to those in private industry. To accomplish this, the Carnegie Foundation Report created a key unit of measure called the student hour, defined as “one hour of lectures, of lab work, or recitation room work, for a single pupil” (Barrow, 1990:70). The motivation behind the initiative was to facilitate calculation of relative faculty workloads, the cost of instruction per student hour, and, ultimately, the rate of educational efficiency for individual professors, fields, departments, and universities (Shedd, 2003). These are essentially the same things we want to know
today and which this report again addresses. Additionally, the difficult measurement issues limiting completeness of the analysis 100 years ago are still very much in play, as we detail in Chapter 3.
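The Carnegie-era accounting can be illustrated with a brief sketch. All enrollment, contact-hour, and salary figures below are hypothetical, chosen only to show how student hours aggregate into a cost-of-instruction rate; they are not drawn from the 1910 report.

```python
# Illustrative (hypothetical) Carnegie-style cost accounting.
# A "student hour" is one hour of lecture, lab, or recitation
# work for a single student.

def student_hours(enrollment: int, contact_hours_per_week: int,
                  weeks_per_term: int) -> int:
    """Total student hours generated by one course in one term."""
    return enrollment * contact_hours_per_week * weeks_per_term

# A professor teaching two courses in a 15-week term:
course_a = student_hours(enrollment=40, contact_hours_per_week=3, weeks_per_term=15)
course_b = student_hours(enrollment=25, contact_hours_per_week=3, weeks_per_term=15)
total = course_a + course_b          # 1,800 + 1,125 = 2,925 student hours

term_salary = 30_000                 # hypothetical instructional cost for the term
cost_per_student_hour = term_salary / total

print(f"{total} student hours at ${cost_per_student_hour:.2f} per student hour")
```

Dividing instructional cost by student hours in this way is what allowed the Carnegie analysts to compare "efficiency" across professors, fields, and institutions, for better or worse.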
While productivity measurement in many service sectors is fraught with conceptual and data difficulties, nowhere are the challenges—such as accounting for input differences, wide quality variation of outputs, and opaque or regulated pricing—more imposing than for higher education. Compounding the challenge is that many members of the panel (and many readers of the report) are being asked to apply to their own industry the same measurement tools they would use in analyzing any other. And, from up close, the complexities are much more apparent than when dissecting productivity from a distance.
One lesson drawn from this effort is that we may be too sanguine about the accuracy or relevance of measures of productivity in other sectors, having seen how daunting they can be in a setting with which we are more intimately familiar. The conceptual and practical problems surrounding this effort raise additional concerns because it is known that measurements create incentives, incentives change practices, and those practices have the potential to affect people and institutions we care deeply about. Yet the current higher education environment is not without incentives, many of which have flaws that are at least as profound and distorting as those associated with economic measurement, and are sometimes much worse. Readers of the report will have to make up their own minds whether the potential disadvantages of this approach, as well as the costs of implementing the specific recommendations, are worth the potential benefit. While we understand how some might come to a different conclusion, we believe the advantages outweigh the disadvantages.
Not everything that counts can be counted, and not everything that can be counted counts.
—William Bruce Cameron
While this observation is broadly profound, it seems exceptionally applicable to the case of higher education. At the same time, a better understanding of the workings and nature of the sector is necessary, given its prominent role in the economy and impact on the future of our society. Higher education is part of the essential fabric of American experience, one in which many citizens spend a significant fraction of their adult lives. For many individuals, higher education is the largest or second-largest consumer decision.
On an aggregate level, colleges and universities employ around 3.6 million individuals, 2.6 million of those in professional positions.1 The sector accounts
(directly) for about 3.3 percent of gross domestic product (Soete, Guy, and Praest Knudsen, 2009), which makes it larger than a number of industries for which productivity data are routinely collected. It also accounts for about 10 percent of state budgets in recent fiscal years (National Association of State Budget Officers State Expenditure Report, 2011).
Beyond the production of credentialed citizens, academic institutions also perform much of the nation’s research and development. In 2008, colleges and universities spent $52 billion on research and development, with 60 percent of this funding derived from the federal government. Academic institutions performed 55 percent of basic research and 31 percent of total research (basic plus applied) in the United States (National Science Board, 2010:5-4). Although nonacademic organizations conduct research in select functional fields such as health, defense, space, energy, and agriculture, the general prominence of academic research and the established funding patterns reflect a post–World War II political consensus that federally funded basic research is most effectively performed in academic institutions. This contrasts with patterns observed elsewhere in the world, where there is greater reliance on government-operated laboratories, other forms of public research organizations, or industry to conduct research.
In the current global economic and fiscal climate, the attention being paid by policy makers to the competitiveness and general state of higher education in the United States continues to intensify. Recent research (e.g., Carnevale, Smith, and Strohl, 2010) indicates that the economy’s expanding sectors and industries rely disproportionately on workers with higher education credentials. During the current recession, characterized by high and persistent unemployment, analyses of evidence such as online job postings and real-time jobs data reveal a mismatch between job openings and the educational credentials of the workforce. Higher education institutions themselves have become increasingly concerned about improving their own performance, competing with peer institutions on cost and quality, and providing a degree of public accountability.
In this environment of strong policy maker and institutional interest in the performance of higher education, stakeholders have used whatever data and measures are available in an attempt to understand trends and perceived problems; for better or worse, some version of productivity will be measured. Therefore, it is crucial to develop coherent measurement tools that make the best possible use of available and potentially available data. Failure to do so will keep the door open for an ever-expanding profusion of measures, many of them unnecessarily distortive, and endless debates about measurement as opposed to productivity itself.
In current policy debates, administration discussions, and media coverage, attention tends to focus on the soaring sticker price of college (even though overall costs have remained more or less in line with general inflation). Cost per degree, graduation rates, and retention metrics have been used as though they measured efficiency or overall productivity. What is often ignored in these discussions is the quality of higher education instruction. When attention is overwhelmingly focused on
completions or similar metrics, the risk is heightened that the stated goal will be pursued at the expense of quality.2 If the aim is to know whether increased spending is resulting in commensurate returns, the quantity and quality of the sector’s inputs and outputs must be reliably tracked, which, for the latter, requires developing assessment tools for quantifying the outcomes of higher education.
Used without a solid understanding of their meaning in divergent contexts, simple metrics such as graduation rates and costs per degree can distort and confuse as much as they inform. In the absence of more rigorous alternatives, however, they will continue to be used—and, at times, misused. In this report, we take a closer look at some of these unidimensional performance metrics to understand better what exactly they reveal. We then develop a more appropriate approach to productivity measurement—one that can serve as a key component in the set of information on which to base resource and other policy decisions. However, even the productivity measure developed in this report, which expresses outputs in terms of quantities of credits and degrees, cannot explicitly take account of quality variation and change. As detailed in Chapter 4, an effect will be captured by the proposed measure to the extent that higher-quality inputs, such as better teachers, lead to higher percentages of students completing degrees; but this effect is indirect. Thus, a single metric—even a well-conceived productivity measure—will rarely be sufficient, on its own, to serve as a comprehensive assessment of institutional, system, or even sector-wide performance. Other factors—most notably the quality dimension—must be monitored through parallel tracking of information that will often have to be processed independently from the productivity metric.
Finally, there are aspects of human and, more narrowly, productive enterprise that create social value but that statistical measures do not, and indeed do not presume to, capture. From a societal perspective, investment in citizens’ work careers is not the only motivation for supporting and participating in higher education. Nonpecuniary components of the sector’s output associated with instruction, research, and other public goods are also important. Like the dedication of a policeman who brings extraordinary passion to the protection of fellow citizens, the vision of a technology entrepreneur that ultimately changes the way people live, or the work of an artist appreciated long after its creation, the passion and dynamism of a master teacher who is truly interested in a student who, in turn, is truly interested in learning cannot be richly portrayed in a number. In this context, some very real elements of the value of experiencing life-changing learning cannot be fully quantified within a (still very important) statistical infrastructure.
2Similar tendencies to focus on the easily quantifiable hamper discussions of medical care. The increase in costs is known; the value gained from these expenditures, in terms of health benefits to the population, frequently is not.
The statement of task for this project—co-developed by the Lumina Foundation for Education and the National Research Council’s Committee on National Statistics at a planning meeting held February 20, 2009—reads as follows:
The Panel on Improving the Measurement of Productivity in Higher Education will develop a conceptual framework for measuring higher education productivity and describe the data needs for that framework. The framework will address productivity at different levels of aggregation, including the institution, system, and sector levels.
An overarching goal of the study is to catalogue the complexities of measuring productivity and monitoring accountability in higher education. In particular, the study will take into account the great variety of types and missions of higher education institutions in the United States, ranging from open admission colleges to major research universities that compete on an international scale. The study will also address the necessity to consider quality issues when attempting to measure productivity. Since the quality of inputs to and outputs from higher education varies greatly across institution types and, indeed, within them, the study will highlight the pitfalls of using simplistic metrics based on insufficient data for evaluating the performance of higher education.
One objective of the study will be to provide guidance to institutions and policy makers about practical measures that can be developed for the purposes of institutional improvement and accountability. However, to the extent that the differences in inputs, outputs, and institution types within higher education (along with inadequate data) make the development of comprehensive productivity measures impossible, the panel will assess the strengths and weaknesses of the various alternatives in providing evidence on different aspects of the input-output relationship.
At the conclusion of its study, the panel will issue a report with findings and recommendations for developing the conceptual framework and data infrastructure and that provides an assessment of the strengths and limitations of alternative approaches to productivity measurement in higher education. The report will be written for a broad audience that includes national and state policy makers, system and institution administrators, and higher education faculty.
An important aspect of this report is to highlight the complexity of measuring productivity in higher education. A deeper understanding of this complexity reduces the chances that decision makers will misuse measures—for example, by incentivizing “diploma mills” through overemphasis of graduation rate or time-to-degree statistics in accountability policies. While attempting to provide novel insights into productivity measurement, we are cognizant that it is easy to find fault with approaches that policy makers, administrators, and other practitioners have relied upon to do their jobs. It is much more difficult to envision and
implement new methods that could become broadly endorsed. Recognizing that funding and personnel decisions, as well as plans to improve resource allocation, are sometimes based at least in part on these measures, our intent is to encourage those attempting to improve and apply them in real policy settings.
Due to the sheer breadth of activities associated with higher education in the United States, this report cannot be exhaustive. The scope of the study and the recommendations herein reflect policy information needs as well as feasibility-of-measurement constraints. The report’s purview includes all types of higher education institutions (public, private, for-profit), but not all missions. Our measurement prescriptions focus on instruction, which includes all taught programs, regardless of level (e.g., associate, bachelor’s, taught terminal master’s).3 Joint production of instruction, research, and public service is discussed in detail, though it is recognized that measurement of the latter two is largely beyond the scope of the panel’s charge. Other missions, such as health care and athletics, which sometimes are budgeted separately, are also excluded from our measurement proposals, which means that any synergies that exist between these activities and conventional resident instruction programs are missed. To include them at this point in the development of productivity measurement for the sector would hopelessly complicate the task.
In developing a model of productivity (Chapter 4), the panel recognizes that this is only a starting point for what promises to be a long-term research agenda. It is worth pointing out that no industry is without its complexities, and no productivity measure currently in use is permanently fixed. The extensive and impressive research by the Bureau of Labor Statistics (BLS) into the concepts and techniques of productivity measurement is indicative of the ongoing process and continuing progress but also of the fact that measurement and conceptual barriers remain.4 Additionally, as described in the next chapter, more than one paradigm exists for constructing productivity models.5 It is especially worth distinguishing between aggregate models of the kind developed here, which are designed to measure long-term trends, and structural models aimed more specifically at operational improvement and accountability concerns. Aggregate and sector-level productivity models have proved to be important for economic and policy analysis. In higher education, for example, they reveal whether resource usage per unit of output in particular institutional segments
3Application of the model developed in Chapter 4 uses IPEDS data that do not exclude Ph.D. and research degrees (though they clearly have a quite different teaching production function). Due to the way universities categorize instructional expenses, it is not possible to subdivide these activities on the input side and, therefore, these degrees are not excluded from the output side either (they are also included in the student credit-hour production figures). However, it is doubtful that the small number of degrees and enrollments involved will have much effect on the actual productivity statistics.
4For details of the BLS work, see http://www.bls.gov/lpc/lprarch.htm#Concepts_and_Techniques_of_Productivity [June 2012].
5OECD (2001) provides a thorough overview of aggregate and industry-level productivity measures.
has been increasing or declining. The model may not reveal why this is so, but at the very least it pushes us to ask additional, useful questions. However, these kinds of models are not typically intended to be used for accountability or incentivizing purposes—especially for applications such as higher education where output prices do not necessarily reflect quality. In contrast, the structural models involve a fairly detailed representation of an entity’s internal structure, and thus require more granular data. Such models also generally focus on marginal revenues and marginal costs, as opposed to the average revenues and costs considered in the aggregate models. As noted above, the panel was not charged with developing a structural model and has not attempted to do so.
At a conceptual level, this report dedicates considerable attention to productivity measurement at different levels of aggregation, including the institution, system, and sector levels. For most purposes, it is necessary to segment the sector by institution type to avoid inappropriate comparisons. However, the measure developed in Chapter 4 is focused on productivity of the sort typically applied to aggregate economic sectors (e.g., autos, steel, higher education), which rests on the methodology used by the BLS. While one can imagine aggregating institution-level data to produce a macro productivity measure, such an approach is not practical at the present time for the higher education sector. As a technical matter, there is nothing to prevent the model developed here from being applied at the level of a state, system, or individual institution, but this opens the way for it to be exploited for performance measurement without the proper support of additional quality measures. The panel generally believes that this risk associated with pushing forward with productivity measurement is worth taking, and that to maintain the “know-nothing” status quo would perpetuate dysfunctional behavior.
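The general form of such an aggregate measure can be sketched briefly: an output quantity index divided by an input quantity index. The sketch below is a stylized illustration of this BLS-style form, not the panel’s Chapter 4 model; all quantities and weights are hypothetical.

```python
# Minimal sketch of an aggregate (multifactor) productivity index:
# growth in an output quantity index divided by growth in an input
# quantity index. All figures and weights are hypothetical.

def quantity_index(base: dict, current: dict, weights: dict) -> float:
    """Weighted quantity growth from the base to the current period."""
    return sum(weights[k] * current[k] / base[k] for k in base)

# Outputs: credit hours and completed degrees (weights sum to 1).
out_base    = {"credit_hours": 1_000_000, "degrees": 20_000}
out_current = {"credit_hours": 1_050_000, "degrees": 21_500}
out_weights = {"credit_hours": 0.6, "degrees": 0.4}

# Inputs: labor hours and deflated non-labor expenditures.
in_base    = {"labor_hours": 2_000_000, "other_inputs": 500_000}
in_current = {"labor_hours": 2_040_000, "other_inputs": 515_000}
in_weights = {"labor_hours": 0.7, "other_inputs": 0.3}

output_index = quantity_index(out_base, out_current, out_weights)
input_index  = quantity_index(in_base, in_current, in_weights)
productivity_growth = output_index / input_index - 1.0

print(f"Productivity change: {productivity_growth:+.2%}")
```

Because outputs here are counted as credits and degrees, the index registers quality only indirectly, which is precisely why the report insists on parallel quality tracking.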
It is noteworthy that the panel was not charged with recommending processes to improve productivity, for example, through innovative new methods for designing courses or through online education. Similarly, the panel was not asked to develop models for monitoring departmental, institutional, or system activity; these are applications. One stumbling block to productivity measurement—and indeed, to productivity improvement—has been the widely held view that, because learning is a service and its production is labor-intensive, colleges and universities suffer from a condition known as Baumol’s cost disease. The underlying theory, which breaks from the notion in classical economics that wage changes are closely tied to labor productivity changes, is that labor costs in some sectors of the economy are affected by productivity gains in other unrelated sectors. Those productivity gains drive an increase in wages across the entire economy. Sectors without productivity gains are nonetheless faced with a higher wage bill, making them appear less efficient.6 Archibald and Feldman (2011) subscribe to
6In their landmark book, Performing Arts: The Economic Dilemma, Baumol and Bowen (1966) use as an example a Mozart string quintet composed in 1787. More than two centuries later, it still requires five musicians and the same amount of time to perform the piece.
this view, noting that the production processes for colleges and universities rely on human interaction (at least traditionally), nearly fixed amounts of time inputs from faculty and students, and a key role for highly educated, highly compensated employees.
Even when steps can be taken to increase throughput, questions rightfully arise about the effect of the changes on quality. Archibald and Feldman write (2011:40):
An institution can increase class size to raise measured output (students taught per faculty per year) or it can use an increasing number of less expensive adjunct teachers to deliver the service, but these examples of productivity gain are likely to be perceived as decreases in quality, both in the quality rankings and in the minds of the students.
However, the evidence on the potential of higher education to benefit from new models of production, such as online courses, is not conclusive. Harris and Goldrick-Rab (2011) argue that “researchers and institutions themselves have rarely paid much attention to whether policies and practices are cost-effective. How would you know whether you’re spending money effectively if you’ve never even asked the question?” They conclude that colleges “can conceivably become more productive by leveraging technology, reallocating resources, and searching for cost-effective policies that promote student success.” Indeed, many industries that formerly were believed to be stagnant have been able to improve productivity dramatically. Even in the quintessential example of Baumol’s cost disease (noted above), string quartets have improved “productivity” dramatically through the capability to simulcast a performance to theaters or, more obviously, by recording their music and earning money while the musicians sleep (Massy, 2010:39). Other examples can be found in medical care, legal services, and elsewhere.
Work by the National Center for Academic Transformation (NCAT) on course redesign provides a contemporary example of what can be accomplished in this area (see Chapter 2 for a description of some of this work; see also Appendix B on NCAT’s methods). The organization’s clients analyze measures to determine new ways to combine inputs so as to produce student credit hours of the same or better quality than with traditional methods. Quality assurance also enters the process. Indeed, the changes that have been made following such analyses are the classic ones used in essentially all industries: shifts from high-cost to lower-cost labor, more intensive use of, and better, technology, and elimination of waste in resource utilization.
The idea that instructional productivity may potentially be increased by altering the way inputs in the production function are combined highlights why improved measurement is so important. Potential improvement in productivity also justifies requirements that colleges and universities systematize collection of data
on expenditures and the volume and quality of inputs and outputs. Routine generation and collection of such data is a prerequisite for wider efforts to improve productivity and enable external stakeholders to hold institutions accountable.
In the face of the observations laid out above, we take the following premises as the starting point for our assertion that improved information regarding the functioning of higher education is needed: (1) Those who fund higher education have a legitimate interest in meaningfully measuring productivity, both in order to make the best possible allocations and spending decisions within the sector, and to assess the value of higher education against other compelling demands on scarce resources; (2) Institutions, individuals, and communities whose economic well-being is most directly at stake when funding decisions are made have a legitimate interest in ensuring that measurements of productivity are accurate and appropriate. The analysis and recommendations in this study attempt to balance these interests.
This report has been written for a broad audience including national and state policy makers, system and institution administrators, higher education faculty, and the general public.
- State and federal legislators: Policy makers benefit from discussion that identifies important questions, explains the need for particular data programs, and clarifies the meaning of different performance metrics.
- College and university administrators: These decision makers are under increasing pressure to address accountability and productivity concerns. This report may provide authoritative backing for resisting pressure to impose inadequate assessment systems merely to be seen to be doing something. These groups may also benefit from guidance about what data to collect to support proposed evaluations of programs.
- Faculty: College and university professors need to understand the interaction between their own interests and society’s interests in the education enterprise. They need to be informed about innovative approaches to increasing mission efficiency through use of technology and other means. And they need quality information to guide them in the context of shared governance that prevails in most colleges and universities.
- General public: We hope that this report will promote a greater understanding of societal interests in higher education and of how the interests of stakeholders (students, faculty, administrators, trustees, parents, taxpayers) fit into that broader picture. The arguments herein may also promote a fuller understanding of the complexity of colleges and universities and how they benefit the economy and society.
The remainder of the report is organized as follows: In Chapter 2, we define productivity and then characterize the activities of higher education in terms of inputs or outputs. We pay particular attention to the heterogeneity of the sector, including the great range of its products and the changes and variation in the quality of its inputs and outputs. Accounting for all outputs of higher education is particularly daunting, as they range from research findings and production of credentialed citizens to community services and entertainment. Although the panel’s recommendations focus on degree output, research and other scholarly and creative activities must be acknowledged because they are part of the joint product generated by universities, and because they may affect the quality and quantity of teaching. We also contrast productivity with other measurements that have been used as proxies for it and discuss the merits and limitations of proxies currently in use.
In Chapter 3, we articulate why measurement of higher education productivity is uniquely difficult. Colleges and universities produce a variety of services simultaneously. Additionally, the inputs and outputs of higher education production processes are heterogeneous, mix market prices and intangibles, vary in quality, and change over time. Measurement is further impeded by conceptual uncertainties and data gaps. While none of these difficulties is unique to higher education, their severity and number may be. We detail the complexities—not to argue that productivity measurement in higher education is impossible, but rather to indicate the problems that must be overcome or mitigated to make accurate measurements.
This report will be instructive to the extent that it charts a way forward for productivity measurement. Toward this end, in Chapter 4, we provide a prototype productivity measure intended to advance the conceptual framework. The objective here is not to claim a fully specified, ideal measure of productivity, for such does not exist. Rather, we aim to provide a starting point to which wrinkles and qualifications can be added to reflect the complexity of the task, and to suggest a set of factors for analysts and policy makers to consider when using productivity measures or other metrics to inform policy.
In Chapter 5, we offer practical recommendations designed to advance measurement tools and the dialogue surrounding their use. We provide guidance for developing the basic productivity measure proposed in Chapter 4, targeting specific recommendations for the measurement of inputs and outputs of higher education, and discuss how changes in the quality of the range of variables could be better detected. A major requirement for improved measurement is better data. Thus, identifying data needs demanded by the conceptual framework, with due attention to what is practical, is a key part of the panel’s charge. This is addressed in Chapter 6. In some cases, the most useful measures would require data that do not now exist but that could feasibly be collected.