2
Defining Productivity for Higher Education
The importance of productivity growth to an economy is widely recognized
because the extent to which living standards can be improved over time depends
almost entirely on the ability to raise the output of its workers.1 From the perspec-
tives of individual industries and enterprises, gains in productivity are a primary
means of offsetting increases in the costs of inputs, such as hourly wages or raw
materials. Likewise, in higher education, productivity improvement is seen as
the most promising strategy for containing costs in the continuing effort to keep
college education as affordable as possible. Without technology-driven and other
production process improvements in the delivery of service, either the price of
a college degree will be beyond the reach of a growing proportion of potential
students or the quality of education will erode under pressures to reduce costs.
In this environment, such concepts as productivity, efficiency, and account-
ability are central to discussions of the sustainability, costs, and quality of higher
education. The discussion should begin with a clear understanding of productivity
measures and their appropriate application, while recognizing that other related
concepts (such as unit cost, efficiency measures, and the like) are also important
and inform key policy questions.
At the most basic level, productivity is defined as the quantity of outputs
delivered per unit of input utilized (labor, capital services, and purchased inputs).
1 Average annual GDP growth for the United States, 1995-2005, was 3.3 percent. Estimates of the
contributions of the various components of this growth (Jorgenson and Vu, 2009) are as follows:
Labor quantity, 0.63; labor quality, 0.19; non-ICT (information and communications technology) capital, 1.37;
ICT capital, 0.48; total factor productivity (TFP) growth, 0.63. These figures indicate the importance
of input quality and technology in per capita productivity gains.
20 IMPROVING MEASUREMENT OF PRODUCTIVITY IN HIGHER EDUCATION
The number of pints of blueberries picked or boxes assembled using an hour of
labor are simple examples. Productivity, used as a physical concept, inherently
adjusts for differences in prices of inputs and outputs across space and over time.
While productivity measures are cast in terms of physical units that vary over
time and across situations, efficiency connotes maximizing outputs for a given
set of fixed resources.2 Maximizing efficiency should be the same as maximizing
productivity if prices are set by the market (which is not the case for all aspects
of higher education). Accountability is a managerial or political term addressing
the need for responsibility and transparency to stakeholders, constituents, or to
the public generally.
Application of a productivity metric to a specific industry or enterprise can
be complex, particularly for education and certain other service sectors of the
economy. Applied to higher education, a productivity metric might track the vari-
ous kinds of worker-hours that go into producing a student credit hour or degree.
The limitation of this approach is that, because higher education uses a wide
variety of operational approaches, which in turn depend on an even wider variety
of inputs (many of them not routinely measured), it may not be practical to build
a model based explicitly and exclusively on physical quantities. Of even greater
significance is the fact that the quality of inputs (students, teachers, facilities) and
outputs (degrees) varies greatly across contexts.
A primary objective of industries, enterprises, or institutions is to optimize
the efficiency of production processes: that is, to maximize the amount of output
that is physically achievable with a fixed amount of inputs. Productivity improve-
ments are frequently identified with technological change, but may also be associ-
ated with a movement toward best practice or the elimination of inefficiencies.
The measurement of productivity presumes an ability to construct reliable and
valid measures of the volume of an industry's (or firm's) output and the different
inputs. Though productivity improvements have a close affinity to cost savings,
the concepts are not the same. Cost savings can occur as a result of reduction in
input prices, so that the same physical quantity of inputs can be purchased at a
lower total cost; they are also attainable by reducing the quantity or quality of
output produced. But, by focusing on output and input volumes alone, it becomes
difficult to distinguish efficiency gains from quality changes. To illustrate, con-
sider homework and studying. Babcock and Marks (2011) report that college
students currently study less than previously. Assuming studying is an input
to learning, does this mean that students have become more productive or now
2Kokkelenberg et al. (2008:2) write that: "Economists describe efficiency to have three aspects;
allocative efficiency which means the use of inputs in the correct proportions reflecting their marginal
costs; scale efficiency which considers the optimal size of the establishment to minimize long-run
costs; and technical efficiency which means that given the establishment size and the proper mix of
inputs, the maximal output for given inputs under the current technology is achieved." It should be
noted that the productivity index approach, on its own, is unlikely to say much about optimal
size and scale efficiency.
shirk more? Arum and Roksa (2010) argue that college students are learning less,
implying the latter. But, without robust time series data on test results to verify
student learning, the question remains unanswered.
2.1. BASIC CONCEPTS
Several different productivity measures are used to evaluate the performance
or efficiency of an industry, firm, or institution. These can be classified as single-
factor productivity measures, such as labor productivity (the ratio of output per
labor-hour), or multi-factor productivity, which relates output to a bundle of in-
puts (e.g., labor, capital, and purchased materials). In addition, productivity can
be evaluated on the basis of either gross output or value added. Gross output is
closest to the concept of total revenue and is the simplest to calculate because it
does not attempt to adjust for purchased inputs. Value added subtracts the pur-
chased inputs to focus on the roles of labor, capital, and technology within the
entity itself.
For most goods, labor is the single largest factor of production as measured
by relative expenditures. Labor productivity is thus a commonly used measure.
Labor productivity, however, is a partial productivity measure that does not
distinguish between improvements in technology and the contributions of other
productive factors. Thus, a measure of labor productivity based on gross output
might rise due to the outsourcing of some activities or the improvement of capital
used in the production process. In this instance, labor productivity would rise at
the expense of additional purchased services or other inputs.
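
The outsourcing case can be illustrated with a small numerical sketch. All figures below are hypothetical and none come from the report. Inputs are expressed in constant base-period dollars so that they can be summed, which is a simplifying assumption rather than the method used in later chapters. Shifting work from employees to purchased services raises output per labor hour, while output per unit of the combined input bundle stays the same.

```python
# Hypothetical figures for illustration only; none come from the report.
BASE_WAGE = 20.0  # assumed constant-dollar cost of one labor hour


def labor_productivity(output, labor_hours):
    """Single-factor measure: gross output per labor hour."""
    return output / labor_hours


def multifactor_productivity(output, labor_hours, purchased_services):
    """Gross output per constant dollar of the combined input bundle."""
    input_bundle = labor_hours * BASE_WAGE + purchased_services
    return output / input_bundle


# Before outsourcing: 100 labor hours plus 500 of purchased services.
before = {"output": 1000.0, "labor_hours": 100.0, "purchased_services": 500.0}
# After outsourcing: 20 hours of work move to an outside contractor.
after = {"output": 1000.0, "labor_hours": 80.0, "purchased_services": 900.0}

lp_before = labor_productivity(before["output"], before["labor_hours"])
lp_after = labor_productivity(after["output"], after["labor_hours"])
mfp_before = multifactor_productivity(**before)
mfp_after = multifactor_productivity(**after)

print(f"Labor productivity: {lp_before:.2f} -> {lp_after:.2f}")
print(f"Multi-factor productivity: {mfp_before:.3f} -> {mfp_after:.3f}")
```

In this sketch the apparent gain in labor productivity reflects only the substitution of purchased services for in-house labor, which is why a partial measure can mislead when the input mix changes.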
Conceptually, a multi-factor productivity measure based on gross output and
changes in the volumes of all individual inputs provides greater insight into the
drivers of output growth. It shows how much of an industry's or firm's output
growth can be explained by the combined changes in all its inputs. Relative to
labor productivity, construction of a multi-factor productivity measure imposes
substantially greater requirements on data and estimation methods.
The construction of productivity measures requires quantitative estimates
of the volume of outputs and inputs, excluding the effects of pure price changes
while capturing improvements in quality. As a simple illustration, total revenues
provide a measure of the value to consumers of an industry's production, and the
revenues of the individual types of goods or services produced by the industry are
deflated by price indexes and weighted together by their shares in total revenues
to construct an index of output volume. Similar indexes are constructed for the
volumes of the inputs. The volume indexes of the inputs are combined using as
weights their shares in total income or costs. Alternatively, when feasible, quanti-
ties of outputs and inputs may be estimated without deflating expenditure totals
when the physical units can be counted directly. The productivity measure is
then obtained by dividing the index of output by the composite index of inputs.
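
The steps in this paragraph can be sketched in a few lines. The example is a minimal illustration under stated assumptions: the product names, revenues, costs, and price relatives are invented, and the use of base-period shares as weights (a Laspeyres-style choice) is an assumption, since the text does not say which period's shares are used.

```python
# Hypothetical two-period data; all names and numbers are illustrative.

def volume_relatives(values_0, values_1, price_relatives):
    """Deflate period-1 values by price change, then compare with period 0."""
    return {
        name: (values_1[name] / price_relatives[name]) / values_0[name]
        for name in values_0
    }


def share_weighted_index(relatives, base_values):
    """Combine volume relatives using base-period value shares as weights."""
    total = sum(base_values.values())
    return sum(base_values[name] / total * relatives[name] for name in relatives)


# Outputs: revenues by product and the price change of each product.
revenues_0 = {"product_a": 600.0, "product_b": 400.0}
revenues_1 = {"product_a": 693.0, "product_b": 420.0}
output_prices = {"product_a": 1.05, "product_b": 1.00}

# Inputs: costs by input type and the price change of each input.
costs_0 = {"labor": 700.0, "capital": 300.0}
costs_1 = {"labor": 735.0, "capital": 330.0}
input_prices = {"labor": 1.05, "capital": 1.00}

output_index = share_weighted_index(
    volume_relatives(revenues_0, revenues_1, output_prices), revenues_0
)
input_index = share_weighted_index(
    volume_relatives(costs_0, costs_1, input_prices), costs_0
)
productivity_index = output_index / input_index

print(f"Output volume index:    {output_index:.4f}")
print(f"Composite input index:  {input_index:.4f}")
print(f"Productivity index:     {productivity_index:.4f}")
```

Because the price relatives strip out pure price change, the 5 percent rise in the labor price leaves the labor volume unchanged; only the increase in capital volume moves the input index.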
After decades of discussion, research and debate, the concepts and methods
used to compute productivity within the market-based private economy have
achieved widespread agreement and acceptance among economists, policy ana-
lysts, and industry specialists. However, comparable progress has not been made
with respect to the measurement of productivity in education, and higher educa-
tion in particular. Progress also has been slow--although perhaps not quite as
slow--in a few other service sector industries, such as finance and health care,
where outputs are also difficult to define and measure (Triplett and Bosworth,
2002). It is possible to count and assign value to goods such as cars and carrots
because they are tangible and sold in markets; it is harder to tabulate abstractions
like knowledge and health because they are neither tangible nor sold in markets.
Standard methods for measuring productivity were developed for profit- or
shareholder value-maximizing firms engaged in the production of tangible goods.
These methods may not be applicable, valid, or accurate for higher education,
which is a very different enterprise. Traditional private and public colleges and
universities are not motivated by or rewarded by a profit margin. Neither their
output nor their prices are determined within a fully competitive market, and thus
their revenues or prices (essentially tuition) are not indicative of the value of the
industry's output to society.3 The inputs to education are substantially similar to
those of other productive sectors: labor, capital, and purchased inputs. Higher
education is distinct, however, in the nature of its outputs and their prices. The
student arrives at a university with some knowledge and capacities that are en-
hanced on the way to graduation. In this instance, the consumer collaborates in
producing the product.
Second, institutions of higher education are typically multi-product firms,
producing a mixture of instructional programs and research as well as entertain-
ment, medical care, community services, and so on. For market-based enterprises,
the production of multiple products raises manageable estimation problems. Out-
puts are combined on the basis of their relative contributions to revenue shares.
This is a common feature of productivity analysis.
However, because research and classroom instruction are both nonmarket
activities, there is no equivalent concept of revenue shares to combine the two
functions. This greatly complicates the analysis and the possibility of deriving an
overall valuation of an institution's output. We have chosen to separate instruc-
tion from research (and other outputs), acknowledging the practical reality that
an institution's allocation of resources among its multiple functions is in part the
result of forces and influences that are quite complex. For example, the value of
research universities cannot be fully measured by their instructional contribution
alone. Important interactions exist, both positive and negative, between research
activities and the productivity of undergraduate instruction. On the positive side,
3Discounts for financial aid also complicate an institution's value function. The interaction between
institutional and consumer value upsets the association of price with value to consumers. Other price
distortions are discussed throughout the report.
there is the opportunity for promising undergraduates to work alongside expe-
rienced faculty. On the negative side, there is the possibility that the growth of
graduate programs detracts from commitments to undergraduate education. While
these difficulties in allocating some inputs and outputs by function are very real,
and clearly warrant investigation, a separate analysis of instruction and research
seems most practical for now and is the approach pursued in Chapter 4.
In Chapter 3, we explore in more detail this and other complexities of mea-
suring productivity in higher education.
2.1.1. Outputs
Estimating productivity presumes an ability to define and measure an indus-
try's (or institution's, or nation's) output. Most systems of output measurement are
estimated by deflating the revenues of individual product categories by indexes of
price change. As noted above, if the products are sold in open competitive mar-
kets, producers will expand output to the point where the marginal revenues of
individual products are roughly equal to their marginal costs. Thus, their revenue
shares can be used as measures of relative consumer value or weights to combine
the various product categories to yield an index of overall output. By focusing
on the price trends for identical models, or by making adjustments to account for
changing characteristics of products, price indexes can differentiate the price and
quality change components of observed changes in overall prices.4
In some cases, the output of an industry might be based on physical or vol-
ume indicators such as ton miles moved by trucks or, in the case of education,
the number of students in a course. Physical measures of output are difficult to
aggregate if they differ in their basic measurement units, but methods have been
devised to mitigate this problem.5 A greater challenge is that physical indicators
generally miss quality change. While explicit quality adjustments can be included
in the construction of a physical output index, it is difficult to know the weight
to place on changes in quantities versus changes in quality. The role of quality
change and other complications in defining and measuring output of higher edu-
cation are addressed in detail in Chapters 3 and 4.
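
Footnote 5 notes that the Törnqvist index handles differing measurement units by working with percentage changes. The sketch below shows the standard form of that index: a weighted average of log quantity changes, with weights equal to each output's average value share across the two periods. The output categories, quantities, and value weights are hypothetical, and the weighting scheme actually used in Chapter 4 may differ.

```python
import math

# Hypothetical data; the weights are illustrative stand-ins for value shares.

def tornqvist_index(q0, q1, v0, v1):
    """Törnqvist quantity index between period 0 and period 1.

    q0, q1: quantities by category (units may differ across categories)
    v0, v1: values used to form share weights in each period
    """
    total_0 = sum(v0.values())
    total_1 = sum(v1.values())
    log_change = 0.0
    for name in q0:
        avg_share = 0.5 * (v0[name] / total_0 + v1[name] / total_1)
        log_change += avg_share * math.log(q1[name] / q0[name])
    return math.exp(log_change)


# Credit hours and degrees are counted in different units, but only their
# proportional changes enter the index.
quantities_0 = {"credit_hours": 30000.0, "degrees": 1000.0}
quantities_1 = {"credit_hours": 31500.0, "degrees": 1100.0}
values_0 = {"credit_hours": 6.0e6, "degrees": 4.0e6}
values_1 = {"credit_hours": 6.3e6, "degrees": 4.4e6}

output_growth = tornqvist_index(quantities_0, quantities_1, values_0, values_1)
print(f"Törnqvist output index: {output_growth:.4f}")

# Sanity check: if every quantity grows 10 percent, so does the index.
uniform_1 = {name: 1.10 * q for name, q in quantities_0.items()}
uniform_growth = tornqvist_index(quantities_0, uniform_1, values_0, values_1)
```

The result always lies between the smallest and largest individual growth rates, so the open question raised in the text, how much weight to give quality relative to quantity, is a question about the weights rather than about the formula.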
Higher education qualifies graduates for jobs or additional training as well
as increasing their knowledge and analytic capacities. These benefits of under-
graduate, graduate and professional education manifest as direct income effects,
increased social mobility, and health and other indirect effects. Measures have
been created to monitor changes in these outputs, narrowly defined: numbers
4The complete separation of price and quality change continues to be a major challenge for the
creation of price and output indexes. It is difficult to incorporate new products into the price indexes
in a timely fashion, and price and quality changes are often intermingled in the introduction of new
models. See National Research Council (2002).
5The Törnqvist index used in Chapter 4 uses percentage changes, which takes care of the
dimensionality problem.
of degrees, time to degree, degree mix, and the like. Attempts have also been
made to estimate the benefits of education using broader concepts such as the
accumulation of human capital. For estimating the economic returns to educa-
tion, a starting point is to examine income differentials across educational at-
tainment categories and institution types, attempting to correct for other student
characteristics. Researchers since at least Griliches (1977), Griliches and Mason
(1972), and Weisbrod and Karpoff (1968) have estimated the returns to educa-
tion, controlling for students' cognitive ability by including test score variables
in their wage regressions.
Researchers have also examined the impact of socioeconomic status (SES)
variables on the returns to education, but the results are somewhat ambiguous.
Carneiro, Heckman, and Vytlacil (2010) show that marginal returns to more
college degrees are lower than average returns due to selection bias. That is, re-
turns are higher for individuals with characteristics making them more likely to
attend college than for those for whom the decision is less clear or predictable.
This carries obvious implications for policies designed to increase the number
of college degrees produced by a system, region, or nation. On the other hand,
Brand and Xie (2010) predict that the marginal earnings increase attributable to
holding a degree is actually higher for those born into low socioeconomic status
(relative to the higher SES group more likely to select into college) because of
their lower initial earnings potential with or without a degree. Dale and Krueger
(2002:1491) also found that the "payoff to attending an elite college appears to be
greater for students from more disadvantaged family backgrounds." Davies and
Guppy (1997) found that socioeconomic factors do not affect chances of entry
into lucrative fields net of other background factors, but SES predicts entry into
selective colleges and lucrative fields within selective colleges. Establishing the
values of degrees generally or of degrees in specific fields--as done by Carnevale,
Smith, and Strohl (2010) and Trent and Medsker (1968)--involves estimating the
discounted career cost (controlling for selection effects) of not attending college
at all. To some extent this line of research has been stunted by the characteristics
of available data; many cohort studies have been flawed in not properly including
aging effects, not asking about attainment, or not extending for a long enough
time period. Such features are important for estimating returns.6 As a result, the
evidence for evaluating the magnitude of differences in outcomes of those who
attain higher education and those who do not is surprisingly mixed.
One limitation of the above-described approaches is that the rate of return
on various degrees, and college in general, varies over time with labor market
6For example, the National Center for Education Statistics (NCES) "High School and Beyond"
study looked at a large number of students (more than 30,000 sophomores and 28,000 seniors enrolled
in 1,015 public and private high schools across the country participated in the base year survey), but
did not follow individuals long enough. The NCES National Education Longitudinal Study (NELS)
repeated the error by not asking about final degree attainment; the Education Longitudinal Study
(ELS) is still following cohorts and may offer a very useful data source.
conditions, independent of the quality of the degree or credits earned. Adding to
the supply of graduates tends to lead to a reduction in the wage gap between more
and less educated workers, an effect that may be strengthened if the expansion
causes educational quality to fall. Another modeling consideration is that, due to
market pressures, students may enroll in majors with high projected returns. An
increased supply of graduates in those fields should lead to downward pressure on
wages. The ebb and flow in the demand for and starting wages paid to nurses is a
good example.7 Nonetheless, wages do provide at least one quantifiable measure,
but one that needs regular updating.
Even when research on wages relative to educational attainment is conducted
properly, it cannot tell the whole story. The overall returns to instruction (learn-
ing) and production of degrees are broader than just the pecuniary benefits that
accrue to the degreed individuals. It is a mistake to view the purpose of higher
education as solely to increase gross domestic product (GDP) and individual in-
comes.8 Some of the nation's most important social needs (e.g., teaching, nursing)
are in fields that are relatively low paying. When the focus is on incomes after
graduation, a system or institution that produces more credentialed individuals
in these socially important but low-paying fields will appear less productive than
an institution that produces many highly paid business majors. This would be a
false conclusion. Moreover, using lifetime earnings as a measure of productivity
and then tying public support for institutions to this measure in effect restricts
the educational and career choices of individuals who, capable of entering either,
knowingly choose lower paying over higher paying occupations. In Chapter 3,
we examine the implications of looking more broadly at the benefits--private
and public, market and nonmarket, productive and consumption--produced by
higher education.
2.1.2. Inputs
Having established that productivity relates the quantity of output to the
inputs required to produce it, it is evident that correct measurement requires iden-
tifying all inputs and outputs in the production process. Economists frequently
categorize inputs into the factors of production:
• Labor (e.g., professors, administrators)
• Physical and financial capital (e.g., university buildings, endowments)
• Energy (utilities)
• Materials (e.g., paper, pens, computers if not capitalized)
7Data on graduates' wages do allow students to make informed decisions, so would be useful to
students as a resource, and to administrators for resource allocation.
8That said, one of the most carefully studied "externalities" of higher education is its role in
economic growth--see Card (1999) and Hanushek and Kimko (2000).
• Service inputs (e.g., use of outside payroll, accounting, or information
technology [IT] firms)
Here, we review the role of each of these inputs.
Labor Inputs
In most simple measures of labor productivity, the quantity of labor is de-
fined by the number of hours or full-time equivalent workers. Left at this, the
measure suffers from the assumption that all workers have the same skills and
are paid equivalent wages. This is clearly not true, and can only be maintained
in situations where changes and variation in the skill level of the workforce are
known to be small.
One means of adjusting for quality is to disaggregate the workforce by vari-
ous characteristics, such as age or experience, education, occupation, and gender.
In competitive labor markets, it is assumed that workers of each skill charac-
teristic will be hired up to the point where their wage equals their contribution
to marginal revenue. The price of labor is measured by compensation per hour;
hence, labor inputs of different quality are aggregated using as weights their rela-
tive wage rates or, alternatively, using the share of each type of labor in total labor
compensation. In this respect, the aggregation of the labor input is comparable to
the aggregation of individual product lines to arrive at an estimate of total output.
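
A short sketch shows how wage weighting changes the measured labor input. The two worker categories, their hours, and their wages are hypothetical, and holding wages fixed at base-period levels is a simplifying assumption. The ratio of the weighted index to the raw hours index is often read as a labor quality (composition) effect.

```python
# Hypothetical workforce data; categories and figures are illustrative.

hours_0 = {"senior": 100.0, "junior": 100.0}
hours_1 = {"senior": 120.0, "junior": 90.0}
base_wages = {"senior": 60.0, "junior": 30.0}  # compensation per hour


def raw_hours_index(h0, h1):
    """Growth in total hours, treating every hour as identical."""
    return sum(h1.values()) / sum(h0.values())


def compensation_weighted_index(h0, h1, wages):
    """Growth in hours with each type weighted by its compensation share."""
    total_comp = sum(wages[k] * h0[k] for k in h0)
    return sum((wages[k] * h0[k] / total_comp) * (h1[k] / h0[k]) for k in h0)


unadjusted = raw_hours_index(hours_0, hours_1)
adjusted = compensation_weighted_index(hours_0, hours_1, base_wages)
composition_effect = adjusted / unadjusted

print(f"Raw hours index:            {unadjusted:.4f}")
print(f"Compensation-weighted index: {adjusted:.4f}")
print(f"Composition effect:          {composition_effect:.4f}")
```

Here total hours rise 5 percent, but because the mix shifts toward the higher-paid category the weighted labor input rises 10 percent. A productivity measure based on raw hours would attribute that difference to productivity growth.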
Relative to other sectors, the problem of measuring labor inputs differs only
marginally for higher education since, even if higher education is largely a non-
market activity, its workforce must be drawn from a competitive market in which
faculty and other employees have a range of alternatives. Some faculty members
are protected by tenure; however, similar issues of seniority and job protection
arise in other industries and the differences are generally ones of degree.9 Despite
these similarities, however, it may be desirable to differentiate among the labor
categories of teachers as discussed in Chapters 3 and 4.
Another complication arises at research-based institutions. For these institu-
tions, the time and cost of faculty and administrative personnel must be divided
between research and instruction.10 One approach might rely on time-use studies
to develop general guidelines on the number of instructional hours that accom-
pany an hour of classroom time, although time required for student consultations
and grading may vary with class size. Furthermore, there are so many different
kinds of classes and teaching methods that it is not practical to associate hours for
9The role of tenure in education is often comparable to various union pressures for seniority and
other forms of job protection.
10Some problems of using wage rates to adjust for the quality of faculty teaching may arise in
research-based institutions where the primary criteria for promotion and tenure reflect research rather
than teaching skills.
BOX 2.1
A Note on Student Time Inputs
A fully specified production function for higher education might include stu-
dent time as an input. Given the panel's charge, this area of measurement is not
a high priority; however, the student time input, if defined as the number of hours
spent in school-related activities multiplied by an opportunity cost wage rate,
would be substantial (see National Research Council, 2005, for a discussion
of how to deal with nonmarket time valuations within an economic accounting
framework). It can be difficult to establish opportunity cost wages when students
are subsidized. For example, during periods or in places characterized by high
unemployment, a federal Pell grant is a good substitute for a job.
For our purposes, we acknowledge that unpaid student time is a relevant
input to the production function (though Babcock and Marks, 2011, find students
are studying less). Nonetheless, little would be gained for policy purposes by in-
cluding it in productivity measures. For applications where this kind of information
is important, researchers can turn to the Bureau of Labor Statistics' American
Time Use Survey, which includes data on study time by students.
specific labor categories with credit hours, even if variation in class size could be
handled. As discussed below, we therefore believe the best approach is to allocate
inputs among output categories using a more aggregate approach.
Finally, the student's own time and effort is a significant input to the educa-
tional process (see Box 2.1). While there has been debate about whether student
effort should be treated as an input or an output, the emergent field of service sci-
ence moots the question by recognizing that the process of consuming any service
(including education) requires the recipient to interact with the provider during
the production process and not only after the process has been completed as in
the production of goods.11 This phenomenon is called coproduction. As applied
to higher education, it means student effort is both an input and an output. This
is consistent with the view that a primary objective of a university is to encour-
age strong engagement of students in their own education. Equally fundamental,
institutions of higher education service a highly diverse student population, and
many institutions and programs within those institutions have devoted great effort
to sorting students by ability. In the absence of information about the aptitude lev-
els of incoming students, comparing outcomes across institutions and programs
may not provide a useful indication of performance.
11See, for example, Sampson in Maglio, Kieliszewski, and Spohrer (2010:112).
Capital Inputs
The major feature of capital is that it is durable and generates a stream or
flow of services over an extended period. Thus, the contribution of capital to
production is best measured as a service or rental flow (the cost of using it for
one period) and not by its purchase price. Because many forms of capital cannot
be rented for a single production period, the rental or service price must be im-
puted. This is done by assuming that a unit of capital must earn enough to cover
its depreciation and a real rate of return comparable to similar investments. The
depreciation rate is inversely proportional to the asset's expected useful life,
and the rate of return is normally constant across different types of capital.12
Short-lived capital assets can be expected to have a higher rental or service price
because their cost must be recovered in a shorter period. These rental rates are
comparable to a wage rate and can be used in the same way to aggregate across
different types of capital services and as a measure of capital income in aggregat-
ing the various inputs to production.
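
The imputation described above, which footnote 12 summarizes as replacement cost times the sum of the real rate of return and the depreciation rate, can be sketched as follows. The asset examples and the 4 percent real return are hypothetical, and setting the depreciation rate to one over the useful life is an assumption chosen to match the statement that depreciation varies inversely with useful life.

```python
# Hypothetical assets and rates; none of these figures come from the report.

def rental_price(replacement_cost, real_return, useful_life_years):
    """Imputed one-period service price of a unit of capital.

    Assumes depreciation rate = 1 / useful life (a simplifying assumption).
    """
    depreciation_rate = 1.0 / useful_life_years
    return replacement_cost * (real_return + depreciation_rate)


REAL_RETURN = 0.04  # assumed constant across asset types

# Same replacement cost, very different useful lives.
computer_rental = rental_price(1000.0, REAL_RETURN, useful_life_years=4)
building_rental = rental_price(1000.0, REAL_RETURN, useful_life_years=40)

print(f"Computer (4-year life):  {computer_rental:.2f} per year")
print(f"Building (40-year life): {building_rental:.2f} per year")
```

With identical replacement costs, the short-lived asset carries the higher service price because its cost must be recovered over fewer periods, which is the pattern the paragraph describes.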
The role of capital in the measurement of productivity in higher education
is virtually identical to that for a profit-making enterprise. Assets are either pur-
chased in markets or valued in a fashion similar to that in the for-profit sector.
Thus, the standard measurement of capital services should be appropriate for
higher education. The education sector may exhibit a particular emphasis on
information and communications capital because of the potential to use such
tools to redesign the education process and by doing so to achieve significant
productivity gains. The more significant problem at the industry level is that there
is very little information on the purchases and use of capital in higher education.
The sector is exempt from the economic census of the U.S. Census Bureau, which
is the primary source of information for other industries. However, the Internal
Revenue Service Form 990 returns filed by nonprofit organizations do contain
substantial financial information for these organizations, including data on capital
expenditures and depreciation.
Energy, Materials, and Other Purchased Inputs
Productivity measures require information on intermediate inputs either as
one of the inputs to the calculation of multi-factor productivity or as a building
block in the measurement of value added. In some measures, energy, materials,
and services are identified separately. Such a disaggregation is particularly use-
ful in the calculation of meaningful price indexes for purchased materials. In the
past, the lack of significant information on the composition of intermediate inputs
was a significant barrier to the calculation of productivity measures for many
12The rental rate is measured as a proportion of the replacement cost of a unit of capital, or
$P_k (r + d)$, where $P_k$ is the replacement cost, $r$ is the real rate of return, and $d$ is the depreciation rate.
service industries. Lack of relevant information on purchased inputs continues to
be a major shortfall for estimating productivity in higher education. This kind of
data is particularly important for analyses attempting to control for the effects of
the outsourcing of some service activities. As with capital, the primary problem
in measuring the role of purchased inputs in higher education is the lack of a
consistent reporting system. The information is known at the level of individual
institutions, but there is no system for collecting and aggregating the data at the
national level for the purpose of establishing performance norms.
2.1.3. Instructional and Noninstructional Elements of the
Higher Education Production Function
For the purposes of this report, it is essential to distinguish inputs and out-
puts along functional lines. In particular, an effort should be made to identify the
inputs that go into each of the multiple outputs produced by the sector. These
inputs can be designated:
• Instructional, including regular faculty, adjunct faculty, and graduate
student instructors.
• Noninstructional and indirect costs including, for example, administration,
athletics, entertainment, student amenities, services, hospital operation,
research and development, student housing, transportation, etc.13
Some of these are budgeted separately.
• Mixed, including other capital such as instructional facilities, laboratory
space and equipment, and IT. The best way to distribute the cost of such
inputs across instructional, administrative, and research categories is not
often clear.
In the model presented in Chapter 4, we attempt to identify all the inputs
associated with the instruction function, while recognizing the difficulty of sepa-
rating instructional and noninstructional costs or inputs. The main concern is to
distinguish inputs associated with instruction from those designated for research.
As faculty are involved in a range of activities, it is difficult to assign their wages
to one category or another.
Instructional costs can also vary greatly. On the faculty side, per unit
(e.g., course taught) instructional costs vary by field, institution, and type of
instructor. On the student side, per-unit instructional costs vary by student
level--undergraduate, taught postgraduate, and research students; mode of at-
tendance--full- versus part-time students (the cost of student services varies by
13See Webber and Ehrenberg (2010).
mode of attendance even if the teaching cost per credit hour does not14); and
field of study, with business and the humanities costing less than science and
engineering, which in turn cost less than medicine. At the institutional level,
costs can be subject to large-scale activity-based costing studies. Costs can also
be disaggregated to the department level. Because the panel's interests and charge
focus primarily on groups of institutions of different types and within different
states, our recommendations do not emphasize detailed breakdowns of costs at
the student level. Nevertheless, some way of controlling for these variations will
be essential to ameliorate significant distortions and criticisms.
For administrative and other purposes, universities typically track inputs
along other dimensions, such as by revenue source. For our purposes, the only
reason for classifying inputs according to revenue source is to separate the inputs
associated with organized research and public service as described in Chapter
4. University accounting systems assign costs to funds. This practice tends to
differentiate among payers, but obfuscates productivity unless specific outputs
also are assigned to the fund. Differentiating inputs among payers departs from
the idea of productivity as an engineering concept relating physical inputs and
outputs. Further, not all revenues are fungible; they cannot all be used to increase
production of undergraduate degrees (Nerlove, 1972).
Higher education costs may also be identified and categorized according to
their source:
• institutional funds such as gifts and endowments;
• public-sector appropriations, including state, local, and federal government
subsidies and financial aid;
• tuition and fees from students and their families (note that some factors
affect costs to specific payers but not overall cost; the cost to the
university may also differ from total cost); and
• sponsored research.
For some policy purposes it is important to distinguish between trends in tuition
and trends in cost per full-time equivalent (FTE) student. Some analyses dispute
the common notion that the cost of higher education is rising faster than consumer
prices broadly; rather, the composition of who pays is changing. Even when the
total cost of a college education is relatively stable, shifts occur in the proportions
paid by different players and what activities the revenues support.
McPherson and Shulenburger (2010) highlight the important difference be-
tween cost and price. In simple economics terms, the cost, or supply schedule, is
based on an underlying production function. Productivity improvements shift the
14 Mode of attendance may affect the relationship between transcript and catalog cost measures. For
example, part-time students may take more courses or repeat courses because of scheduling problems
or less efficient sequencing and thus learning (Nerlove, 1972).
cost schedule downward, with (other things being equal) attendant reductions in
price and increases in quantity demanded for a given demand schedule. The full
price of undergraduate education (determined by both the demand and supply
functions) is the sum of tuition charges, campus subsidies, and state subsidies.
Affordability and access thus depend on state appropriations as much as they
depend on changes in productivity. For example, if an increase in productivity
occurs simultaneously with a reduction in state appropriations, price to student
(tuition) may not fall; it may even rise depending on relative magnitudes.
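The interaction between productivity and appropriations can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that tuition is whatever remains of the full price per student after state and campus subsidies are subtracted; the function name and all dollar figures are hypothetical and are not drawn from the studies cited.

```python
def tuition_per_student(full_price, state_subsidy, campus_subsidy):
    """Tuition implied by: full price = tuition + campus subsidy + state subsidy."""
    return full_price - state_subsidy - campus_subsidy

# Hypothetical baseline: $20,000 full price, $9,000 state and $3,000 campus subsidy.
tuition_before = tuition_per_student(20_000, 9_000, 3_000)

# Assume a productivity gain lowers the full price to $19,000 per student,
# while the state cuts its appropriation by $2,000 per student.
tuition_after = tuition_per_student(19_000, 7_000, 3_000)

print(tuition_before, tuition_after)
```

In this hypothetical case the cut in state support outweighs the productivity gain, so tuition rises even though the cost of producing the education has fallen.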
In the same vein, it is important to highlight differences between public and
private higher education, as has been done by the Delta Cost Project (2009). Tu-
ition increases in private higher education invariably are associated with increased
expenditures per student.15 In marked contrast, tuition increases in public higher
education often are associated with decreases in expenditures per student as the
tuition increases often only partially offset cutbacks in state support.
2.2. PRODUCTIVITY CONTRASTED WITH
OTHER MEASUREMENT OBJECTIVES
Dozens of metrics have been created to serve as proxies for productivity or as
indicators to inform accountability programs and to track costs and outcomes.16
Beyond productivity as defined above, measures of efficiency and cost are other
performance metrics with policy value. While there are certainly appropriate uses
for a variety of measures, there are also dangers of misuse, such as the creation
of perverse incentives. For example, if degrees granted per freshman enrolled were
used to track performance, then institutions could enroll large numbers of transfer
students to improve their standing. Our review of various measures below informs
our recommendations for developing new measures and for modifying existing
ones. New, improved, and properly applied performance measures will begin
filling information gaps and allow multiple stakeholders to better understand
performance trends in higher education.
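The perverse incentive in the degrees-per-freshman example above can be seen numerically. The following is a minimal sketch with two hypothetical institutions and invented enrollment figures; it assumes that transfer students add to degrees granted but are not counted among entering freshmen.

```python
def degrees_per_freshman(degrees_granted, freshmen_enrolled):
    """Naive ratio: all degrees granted divided by entering freshmen only."""
    return degrees_granted / freshmen_enrolled

# Hypothetical institution A: 1,000 freshmen, 500 of whom graduate; no transfers.
ratio_a = degrees_per_freshman(500, 1_000)

# Hypothetical institution B: identical freshman outcomes, plus 300 transfer
# students who graduate but never appear in the denominator.
ratio_b = degrees_per_freshman(500 + 300, 1_000)

print(ratio_a, ratio_b)
```

Institution B appears to perform far better although its freshmen fare exactly as well as those at institution A, which is the distortion the text warns about.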
2.2.1. Productivity and Cost
An alternative approach to measuring productivity--one typically used in
cost studies--is to estimate the expenditures incurred for instructional activity
15Pell grants, state need-based scholarships, and other sources of student aid can, in principle,
offset tuition hikes.
16See Measuring Quality in Higher Education (http://applications.airweb.org/surveys/Organization.aspx
[February 2012]), a database developed by the National Institute for Learning Outcomes
Assessment, which describes four categories: assessment instruments; software tools and platforms;
benchmarking systems and other extant data resources; and assessment initiatives, collaborations, and
custom services. The database can be searched by unit of analysis and aggregation level. These are
categorized not too differently from our matrix (i.e., student, course, institution, and state or system).
(including allocated overheads), then divide by a volume measure of output to
produce a ratio such as cost per degree. Under tightly specified conditions, this
would produce the same result as a productivity measure. These conditions,
however, are rarely if ever realized. The problem is that simple ratios like cost
per student or degree do not take into consideration quality and the multiple
outputs produced by higher education institutions. Hence, this approach conveys
too little information to be able to attribute productivity differences to differences
(over time or between institutions) in price and quality.
Efficiency is improved when cheaper inputs are substituted for more expen-
sive ones without damaging quality proportionately. For example, it has become
a common trend for institutions to substitute adjunct instructors for tenure-track
faculty. Whether this move toward lower-priced inputs has a proportionately
negative impact on output quantity and quality (e.g., numbers of degrees and
amount learned) is not yet fully known, and surely varies from situation to situa-
tion (e.g., introduction and survey classes versus advanced seminars). In review-
ing evidence from the emerging literature, Ehrenberg (2012:200-201) concludes
that, in a wide variety of circumstances, the substitution of adjuncts and full-time
nontenure-track faculty for tenure-track faculty has resulted in a decline in per-
sistence and graduation rates.
Without data tying changes in faculty composition to student outcomes,
efforts to implement accountability systems will be made with only partial in-
formation and will lead to problematic policy conclusions. For example, in 2010
the office of the chancellor of Texas A&M University published what amounted
to "a profit-and-loss statement for each faculty member, weighing annual salary
against students taught, tuition generated, and research grants obtained ...
the number of classes that they teach, the tuition that they bring in and research
grants that they generate" (Wall Street Journal, October 22, 2010). When a met-
ric as simple as faculty salary divided by the number of students taught is used,
many relevant factors are omitted. An instructor teaching large survey courses
will always come out ahead of instructors who must teach small upper-level
courses or who are using a year to establish a laboratory and apply for grants,
as is the case in many scientific disciplines.17 These metrics do not account for
systematic and sometimes necessary variations in the way courses at different
levels and in different disciplines are taught; and they certainly do not account
for differences in the educational experience across faculty members and across
different course designs.
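The weakness of a salary-per-student metric is easy to demonstrate. The sketch below uses two hypothetical instructors with identical salaries; the function name and all figures are invented for illustration and do not reproduce the Texas A&M calculation.

```python
def salary_cost_per_student(annual_salary, students_taught):
    """Simple ratio of faculty salary to number of students taught."""
    return annual_salary / students_taught

# Hypothetical instructor teaching large survey courses: 600 students per year.
survey_cost = salary_cost_per_student(90_000, 600)

# Hypothetical instructor teaching small upper-level seminars: 60 students per year.
seminar_cost = salary_cost_per_student(90_000, 60)

print(survey_cost, seminar_cost)
```

The seminar instructor looks ten times more expensive per student, yet the ratio says nothing about course level, discipline, or what students learned.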
The value of productivity and efficiency analysis for planning purposes is
that it keeps a focus on both the input and output sides of the process in a way
17In recognition of these limitations, administrators did pull the report from a public website to
review the data, and the university president promised faculty that the data would not be used to
"assess the overall productivity" of individual faculty members (see
http://online.wsj.com/article/SB10001424052748703735804575536322093520994.html [June 2012]).
that potentially creates a more thorough and balanced accounting framework. If
costs were the only concern, the obvious solution would be to substitute cheap
teachers for expensive ones, to increase class sizes, and to eliminate departments
that serve small numbers of students unless they offset their boutique major with
a substantial grant-generating enterprise.18 Valid productivity and efficiency mea-
sures needed for accountability require integration of additional information--for
example, the extent to which use of nontenure track faculty affects learning, pass
rates, and preparation for later courses relative to the use of expensive tenured
professors. The implication is that analysts should be concerned about quality
when analyzing statistics that purport to measure productivity and efficiency.
Different input-output ratios and unit costs at differing quality levels simply are
not comparable.
Finally, it is important to remember that even valid measures of cost and
productivity are designed to answer different questions. A productivity metric, for
example, is needed to assess whether changes in production methods are enabling
more quality-adjusted output to be generated per quality-adjusted unit of input.
That this is an important question can be seen by asking whether higher educa-
tion is indeed subject to Baumol's cost disease (see Chapter 1)--the question
of whether, in the long run, it is a "stagnant industry" where new technologies
cannot be substituted for increasingly expensive labor inputs to gain efficiencies.
Unit cost data cannot answer this question directly, but they are needed for other
purposes, such as when legislatures attempt to invest incremental resources in
different types of institutions to get the most return in terms of numbers of de-
grees or graduation rates. This kind of resource-based short-run decision making
responds to funding issues and institutional accountability, but addresses produc-
tivity only indirectly and inadequately.
A critical asymmetry also exists in the way productivity and cost-based
measures are constructed. Current period price data can be combined with the
physical (quantity) data to calculate unit costs, but it is impossible to unpack the
unit cost data to obtain productivity measures. The fact that most measurement
effort in higher education is aimed at the generation of unit cost data has inhibited
the sector's ability to assess and improve its productivity.
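The asymmetry can be made concrete with a small numerical sketch. It assumes a single input (faculty hours) and a single output (credit hours) for two hypothetical institutions; the function names and figures are illustrative only.

```python
def unit_cost(prices, quantities, output):
    """Total spending on inputs (price times quantity) divided by output."""
    total_spending = sum(p * q for p, q in zip(prices, quantities))
    return total_spending / output

def output_per_input_unit(quantities, output):
    """Physical productivity: output per unit of physical input."""
    return output / sum(quantities)

# Hypothetical institution A: 1,000 faculty hours at $60 produce 2,000 credit hours.
cost_a = unit_cost([60.0], [1_000], 2_000)
prod_a = output_per_input_unit([1_000], 2_000)

# Hypothetical institution B: 1,500 faculty hours at $40 produce 2,000 credit hours.
cost_b = unit_cost([40.0], [1_500], 2_000)
prod_b = output_per_input_unit([1_500], 2_000)

print(cost_a, cost_b, prod_a, prod_b)
```

Both institutions report the same unit cost, but institution A produces more credit hours per faculty hour. Unit costs follow directly from prices and quantities, whereas the unit cost figure alone cannot reveal which institution is more productive.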
2.2.2. Other Performance Metrics
Many other performance measures have been proposed for higher education.
The most prominent of these are graduation rates, completion and enrollment
ratios, time to degree, costs per credit or degree, and student-faculty ratios. These
kinds of metrics are undeniably useful for certain purposes and if applied cor-
rectly. For example, Turner (2004) uses time-to-degree data to demonstrate the
18To the credit of Texas A&M University, it did not respond to the findings of its faculty assessment
in any of the above-mentioned ways.
relative impact on student outcomes of changing incoming student credentials
versus effectiveness in the allocation of resources within public higher education.
She finds that the former has a smaller impact than the latter. Similarly, studies
have usefully shown how tuition and aid policies affect student performance
as measured partially by these statistics. The range of performance metrics,
including a discussion of the meaning of graduation rates as calculated by the
Integrated Postsecondary Education Data System (IPEDS), is described in detail
in Appendix A.
While their role is accepted, the measures identified above should not be
confused with productivity as defined in this report. Used as accountability tools,
one-dimensional measures such as graduation rates and time-to-degree statistics
can be abused to support misleading conclusions (e.g., in making comparisons
between institutions with very different missions). Also, because graduation rates
are strongly affected by incoming student ability, using them in a high-stakes
context may induce institutions to abandon an assigned and appropriate mis-
sion of broad access. Use of these kinds of ratio measures may similarly induce
institutions to enroll large numbers of transfer students who are much closer to
earning a degree than are students entering college for the first time, whether that
is the supposed mission or not.
To illustrate the ambiguity created by various metrics, student-faculty ratio
levels can be linked to any combination of the following outcomes:
Low Student-Faculty Ratio      High Student-Faculty Ratio
low productivity               high productivity
high quality                   low quality
high research                  low research
resource diversion             unsustainable workload
The ability to distinguish among these outcomes is crucial both for interpret-
ing student-faculty ratios and for policy making (both inside and outside an
institution).
Time to degree, graduation rate, and similar statistics can be improved and
their misuse reduced when institutional heterogeneity--the mix of full- and
part-time students, the numbers of students who enter at times other than the
fall semester, and the proportion of transfer students--is taken into account.
Additional refinements involve things like adjusting for systemic time-frame
differences among classes of institutions or students. A ratio measure such as a
graduation rate that is lagged (allowing for longer time periods to completion)
is an example. To avoid the kinds of overly simple comparisons that lead to
misguided conclusions--or, worse, actions--performance metrics
(including productivity if it is used for such purposes) should at the very least
be used only to compare outcomes among like types of institutions or a given
institution's actual performance with expected levels. The institutional segmenta-
tion approach has been used by College Results Online,19 a Web site that allows
users to view graduation rates for peer institutions with similar characteristics
and student profiles. The second method is exemplified by Oklahoma's "Brain
Gain"20 performance funding approach that rewards institutions for exceeding
expected graduation rates. These existing measures and programs with good track
records could serve as models or pilots for other institutions, systems, or states.
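Comparing actual with expected performance can be sketched in a few lines. The example below is not a description of the Oklahoma formula; it assumes that an expected graduation rate (in percentage points) has already been estimated for each institution from student characteristics, and the institution names and rates are hypothetical.

```python
# Hypothetical graduation rates in whole percentage points.
institutions = {
    "Open-access college": {"actual": 45, "expected": 38},
    "Selective university": {"actual": 80, "expected": 85},
}

def performance_gap(actual, expected):
    """Actual minus expected graduation rate; positive means above expectation."""
    return actual - expected

gaps = {name: performance_gap(**rates) for name, rates in institutions.items()}

# List institutions from largest to smallest gap.
ranked = sorted(gaps, key=gaps.get, reverse=True)
print(gaps, ranked)
```

Judged on raw graduation rates, the selective university ranks first; judged against expectations, the open-access college does, which illustrates why the basis of comparison matters.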
While the panel is in no way attempting to design an accountability system,
it is still important to think about incentives that measures create. Because in-
stitutional behavior is dynamic and directly related to the incentives embedded
within the measurement system, it is important to (1) ensure that the incentives
in the measurement system genuinely support the behaviors that society wants
from higher education institutions, and (2) attempt to maximize the likelihood
that measured performance is the result of authentic success rather than manipu-
lative behaviors.
The evidence of distortionary and productive roles of school accountability
is fairly extensive in K-12 education research, and there may be parallel les-
sons for higher education.21 Numerous studies have found that the incentives
introduced by the No Child Left Behind Act of 2001 (P.L. 107-110) lead to
substantial gains in at least some subjects (Ballou and Springer, 2008; Ladd and
Lauen, 2010; Reback, Rockoff, and Schwartz, 2011; Wong, Cook, and Steiner,
2010), and others have found that accountability systems implemented by states
and localities also improve average student test performance (Chakrabarti, 2007;
Chiang, 2009; Figlio and Rouse, 2006; Hanushek and Raymond, 2004; Neal and
Schanzenbach, 2010; Rockoff and Turner, 2010; Rouse et al., 2007). However,
these findings have been treated with some skepticism because, while Rouse
and colleagues (2007) show that schools respond to accountability pressures in
productive ways, there is also evidence that schools respond in ways that do not
lead to generalized improvements. For example, many quantitative and qualitative
studies indicate that schools respond to accountability systems by differentially
allocating resources to the subjects and students most central to their account-
ability ratings. These authors (e.g., Booher-Jennings, 2005; Hamilton et al., 2007;
Haney, 2000; Krieg, 2008; Neal and Schanzenbach, 2010; Ozek, 2010; Reback,
Rockoff, and Schwartz, 2011; White and Rosenbaum, 2008) indicate that schools
under accountability pressure focus their attention more on high-stakes subjects,
teach skills that are valuable for the high-stakes test but less so for other assess-
ments, and concentrate their attention on students most likely to help them satisfy
the accountability requirements.
Schools may attempt to artificially boost standardized test scores (Figlio and
Winicki, 2005) or even manipulate test scores through outright cheating (Jacob
19See http://www.collegeresults.org/ [June 2012].
20See http://www.okhighered.org/studies-reports/brain-gain/ [June 2012].
21The panel thanks an anonymous reviewer for the following discussion of incentive effects associ-
ated with accountability initiatives in the K-12 context.
and Levitt, 2003). These types of behaviors may be the reason that the recent
National Research Council (2011) panel on school accountability expressed a
skeptical view about accountability while recognizing the positive gains associ-
ated with these policies.
One potential solution emerging from the K-12 literature is that "value
added" measures of outcomes tend to be less manipulable than are measures
based on average levels of performance or proficiency counts. The rationale
is that when schools are evaluated based on their gains from year to year, any
behaviors generating artificial improvements would need to be accelerated in
order for the school to continue to show gains the next year. In higher education,
however, this year's post-test is not next year's pre-test, so there remains the
very real possibility that institutions could manipulate their outcomes (or their
inputs) in order to look better according to the accountability system; and while
value added measures might allow for more apples-to-apples comparisons among
institutions, they will not reduce the strategic behavior problem by as much as
they might in K-12 education.
One example of how higher education institutions respond strategically to the
incentives embedded within an evaluation system is observable in relation to the
U.S. News & World Report rankings. Grewal, Dearden, and Lilien (2008) docu-
ment ways in which universities strategically deploy resources in an attempt to
maximize their rankings. Avery, Fairbanks, and Zeckhauser (2003) and Ehrenberg
and Monks (1999) find that the ranking system distorts university admissions and
financial aid decisions.
If institutions make failure more difficult by implementing systems of sup-
port to help struggling students improve, this is a desired outcome of the ac-
countability system. If instead they act in ways that dilute a curriculum, or select
students who are likely to help improve the institution's ranking, this could be
a counterproductive consequence of the system. The more background charac-
teristics are used to predict graduation rates, the harder this manipulation would
become, but on the other hand, only a small number of background factors are
currently available on a large scale.
To sum up, many proxy measures of productivity have been constructed over
the years. They have some utility in comparing institutions and programs, if used
cautiously and with knowledge of their drawbacks. But experience has shown that
they can result in major misunderstandings and the creation of perverse incentives
if applied indiscriminately. As with productivity measurement itself, these proxies
are significantly affected by context. Among the most important contextual vari-
ables that must be controlled for are institutional selectivity, program mix, size,
and student demographics. The model outlined in Chapter 4 suggests approaches
for dealing with some of the shortcomings of traditionally used performance mea-
sures. Part-time students are treated as partial FTEs; semester of entry does not
create distortions; and successful transfers are accounted for through awarding
bonus points analogous to the sheepskin effect for bachelor's degrees.