5
Why Redesign the CE?
Today, all household surveys, including the Consumer Expenditure
Surveys (CE), face well-known challenges. These challenges include
maintaining adequate response from increasingly busy and reluctant
respondents. In addition, more and more households are non-English speaking,
and a growing number of higher-income households have controlled-access
residences. Call screening and cell-phone-only households have made
telephone contacts on surveys more difficult. Today’s household surveys
face confidentiality and privacy concerns, a public growing more suspicious
of its government, and competition from an increasing number of private as
well as government surveys vying for the public’s attention (Groves, 2006;
Groves and Couper, 1998).
In the midst of these challenges for household surveys, the CE surveys
stand out as particularly long and arduous. In the Interview survey, recall of
many types of expenditures is likely to be imperfect. A typical respondent
lacks records or at least the motivation to use them in answering the CE
questions. The level of detail that is required in describing each purchase is
daunting. In the Diary survey, respondents are asked to record the details of
many small purchases in a complicated booklet. These demands can easily
result in limited compliance and the omission of expenditures.
Further exacerbating the problem, the CE faces the additional challenge
that consumer spending has changed dramatically over the past 30 years,
and it continues to change (Fox and Sethuraman, 2006; Kaufman, 2007;
Sampson, 2008). When the CE was designed in the 1970s, there was no
online shopping or options for electronic banking and bill paying. Over that
time, shopping patterns have shifted from individual purchases at a variety
of neighborhood stores to collective purchasing at “big box” stores such as
Walmart, Target, and Costco that sell everything from meat to shirts, furniture,
and motor oil under one roof. The CE surveys are cognitively designed
to collect spending information based on the 1970s world of purchasing
behaviors, and today’s consumers are unlikely to relate to that.
Underreporting of expenditures is a long-standing problem with the CE
as evidenced by a growing deviation from other data sources and by the results
of several studies. This underreporting appears to differ sharply across
commodities, raising the possibility of differential biases in the Consumer
Price Index (CPI) and the picture of the composition of household spending.
This is the biggest concern with the CE program. The Panel on Redesigning
the BLS Consumer Expenditure Surveys believes that there are a number
of issues with the current design and implementation of the CE, and that
collectively these problems lead to the underreporting of expenditures. This
chapter documents this underreporting and then discusses the issues and
concerns that the panel identified in its study of the CE.
With that said, the panel understands that no survey is perfect. In fact
all surveys are compromises between the need for specific data, the quality
with which those data can be collected, and the burden and costs required
to do so. The CE is no exception. It is the panel’s expectation that by examining
the issues with the current CE along with some alternative designs,
a new and better balance can be found between data requirements, data
quality, and burden.
EVIDENCE OF UNDERREPORTING IN THE CE
In many federal surveys, one can assess the quality of data by comparisons
with other sources of information. One of the difficulties in evaluating
the quality of CE data is that there is no “gold standard” with which to
compare the estimates. However, several sources provide insight into data
quality. The National Research Council, in its review of the conceptual and
statistical issues with the CPI, expressed concern about potential bias in the
expenditure estimates from the CE. That report recommended comparison
of the CE estimates with those from the Bureau of Economic Analysis’s
Personal Consumption Expenditures (PCE):
The panel’s foremost concern is with the extent of bias in the CEX [Consumer
Expenditure Surveys] which, in turn affects the accuracy of CPI
expenditure category budget shares. A starting point for evaluating household
expenditure allocations estimated by the CEX is to compare them
against budget shares generated by other sources. The Bureau of Economic
Analysis (BEA) produces the most obvious alternative, the per-capita and
product accounts (NIPA). (National Research Council, 2002, p. 253)
70 MEASURING WHAT WE SPEND
Comparisons Between the CE and PCE
Compatibility
A long literature has focused on the discrepancy between the CE and PCE
data from the National Income and Product Accounts (NIPA) (Attanasio,
Battistin, and Leicester, 2006; Branch, 1994; Garner, McClelland, and
Passero, 2009; Garner et al., 2006; Gieseman, 1987; Meyer and Sullivan,
2011; Slesnick, 1992). However, in comparing the CE to the PCE data, it
is important to recognize conceptual incompatibilities between these data
sources. Slesnick (1992, p. 22), when comparing CE and PCE data from
1960 through 1989, concluded that “approximately one-half of the difference
between aggregate expenditures reported in the CEX [CE] surveys and
the NIPA can be accounted for through definitional differences.” Similarly,
the General Accounting Office (1996, p. 15), now the U.S. Government
Accountability Office, in a summary of a Bureau of Economic Analysis
comparison of the differences in 1992, reported that “more than half was
traceable to definitional differences.”
Thus, a key conceptual difference between the CE and PCE is “what is
measured.” The CE measures out-of-pocket spending by households, while
the PCE definition is wider, including purchases made on behalf of households
by institutions. The CE is not intended to capture purchases by households
abroad such as those on military bases, whereas the PCE includes these
purchases. These differences are important and growing over time. Imputations
including those for owner-occupied housing and financial services,
but excluding purchases by nonprofit institutions serving households and
employer contributions for group health insurance, now account for over
10 percent of the PCE. In-kind social benefits account for nearly another 10
percent. Employer contributions for group health insurance and workers’
compensation account for over 6 percent, while life insurance and pension
fund expenses and final consumption expenditures of nonprofits represent
almost 4 percent. McCully (2011) reported that in 2009 nearly 30 percent of
the PCE was out-of-scope for the CE, up from just over 7 percent in 1959.
Another important conceptual difference between the CE and PCE
is the underlying data and how the estimates are constructed. Chapter 3
of this report describes the CE surveys in some detail. In comparison, the
PCE aggregates come from data on the production of goods and services,
rather than consumption or expenditures by households. The PCE depends
on multiple sources, primarily from business records reported on the economic
censuses and other Census Bureau surveys. The PCE numbers are
the product of substantial estimation and imputation processes that have
their own error profiles. Estimates from these business surveys are adjusted
using input-output tables to add imports and subtract sales that do not go
to domestic households. These totals are then balanced to control totals for
income earned, retail sales, and other benchmark data (Bureau of Economic
Analysis, 2010, 2011a,b).
One indicator of the potential error in the PCE is the magnitude of
the revisions that are made from time to time (Gieseman, 1987; Slesnick,
1992). A recent example is the 2009 PCE revisions, which substantially
revised past estimates of several categories. Food at home, one of the largest
categories, decreased by over 5 percent after the 2009 revision.1
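The size of this revision can be checked from the dollar figures reported in footnote 1. The short Python sketch below is illustrative only: the function and variable names are assumptions introduced here, and the only data used are the three values quoted in that footnote. It contrasts the unadjusted drop with the drop computed on a consistent definition (pet food excluded from both figures).

```python
def percent_drop(before: float, after: float) -> float:
    """Return the percentage decline from `before` to `after`."""
    return (before - after) / before * 100.0


# Food-at-home PCE for 2008, in millions of dollars, as quoted in footnote 1.
PRE_REVISION = 741_189              # published before the 2009 revision
PRE_REVISION_NO_PET_FOOD = 707_553  # pre-revision value with pet food excluded
POST_REVISION = 669_441             # published after the 2009 revision

# Comparing the two published values mixes a real revision with a definition change.
unadjusted = percent_drop(PRE_REVISION, POST_REVISION)

# Holding the definition fixed isolates the revision itself.
comparable = percent_drop(PRE_REVISION_NO_PET_FOOD, POST_REVISION)

print(f"Unadjusted drop: {unadjusted:.1f} percent")
print(f"Definition-consistent drop: {comparable:.1f} percent")
```

The definition-consistent figure is the one the text describes as a decrease of over 5 percent; the unadjusted figure is larger because it also reflects the removal of pet food from the category.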
Some authors have argued that despite the incompatibilities between
the CE and PCE, the differences between the series should be expected to
be relatively constant (Attanasio et al., 2006). While a plausible conclusion,
a gradual widening of the difference between the sources could still be expected
given their growing incompatibility, as reported in McCully (2011)
and Moran and McCully (2001).
Comparisons
Gieseman (1987) conducted one of the first evaluations of the current
CE, comparing the CE to the PCE for 1980–1984.2 He found that
the CE reports were close to the PCE for rent, fuel and utilities, telephone
services, furniture, transportation, and personal care services. On the other
hand, substantially lower reporting in the CE for food, household furnishings,
alcohol, tobacco, clothing, and entertainment was apparent back in
1980–1984.
The current patterns have strong similarities to those from 30 years
ago. Garner et al. (2006) reported a long historical series of comparisons
for the integrated data that begins in 1984 and goes up through 2002. Some
categories compare well. Rent, utilities, and fuels and related items are
reported at high and stable levels relative to the PCE. Telephone services,
vehicle purchases, and gasoline and motor oil are reported at high levels
(compared to the PCE) but have declined somewhat over time. Food at
home relative to the PCE is about 0.70, but has remained stable over time.
The many remaining categories of expenditures are reported at low levels
relative to the PCE, though some small categories such as footwear and
vehicle rentals show relative increases.
1 The 2008 value for food at home was $741,189 (in millions of dollars) prior to revision
and $669,441 after, but the new definition excludes pet food. A comparable pre-revision
number excluding pet food is $707,553. The drop from $707,553 to $669,441 is 5.4 percent.
Appreciation is given to Clinton McCully (BEA) for clarifying this revision.
2 Comparisons of consumer expenditure survey data to national income account data go
back at least to Houthakker and Taylor (1970). The issues were also addressed in a long series
of articles comparing the CPI to the PCE deflators by Bunn and Triplett (1983) and Triplett
and Merchant (1973).
Garner et al. (2006) ultimately argued that this comparison should
focus on expenditure categories whose definitions are the most comparable
between the CE and PCE, noting “a more detailed description of the categories
of items from the CE and the PCE is utilized than was used when
the historical comparison methodology was developed. Consequently, more
comparable product categories are constructed and are included in the
final aggregates and ratios used in the new comparison of the two sets of
estimates” (Garner et al., 2006, p. 22). The new series provides comparisons
every five years from 1992 to 2002 (Garner et al., 2006), and was
updated and extended annually through 2007 in Garner, McClelland, and
Passero (2009).
When using comparable categories and when the PCE aggregates are
adjusted to reflect differences in population coverage between the two
sources, the ratio of total expenditures on the CE to PCE is fairly high but
still decreases over time. The ratio for 1992 and 1997 was 0.88, while in
2002 it was 0.84 and by 2007 had fallen to 0.81 (Garner, McClelland,
and Passero, 2009). Figure 5-1 shows the time pattern for the ratio of CE
to PCE spending for comparable categories over 2003–2009. The above
discussion highlights that it is easy to overstate the discrepancy between
the CE and the PCE by comparing all categories, rather than restricting
the comparison to categories with comparable definitions (Passero, 2011).
Separate Comparison of the Interview Survey Estimates
and the Diary Survey Estimates with the PCE
It is also important to look at comparability with the PCE of estimates
from the Interview survey and Diary survey separately. Gieseman (1987)
reported separate comparisons of the Interview survey and Diary survey
estimates to PCE estimates for food because these were the only estimates
available from both surveys.3 He found that Interview food at home exceeded
Diary food at home by 10 to 20 percentage points, but was still
below the PCE. For what was then a much smaller category, food away
from home, the Diary aggregate exceeded the Interview aggregate by about
20 percentage points. Again, the CE numbers were considerably lower than
the PCE ones.
It is not surprising that the Interview and Diary surveys yield different
estimates, given the different approaches to data collection, including a
different form of interaction with the respondent household. These differences
make it likely that estimates from the two surveys, as currently configured,
will differ, as discussed in more detail later in this chapter.
3 In these early years, BLS published separate tables for Interview and Diary data. In recent
years, tables have been published with only integrated data. Consequently, subsequent comparisons
of CE to PCE almost exclusively rely on the integrated data that combine Interview
survey and Diary survey data. In cases where the expenditure category is available in both
surveys, the BLS selects the source for the integrated data that is viewed as most reliable. See
Creech and Steinberg (2011) and Steinberg et al. (2010).
FIGURE 5-1 Coverage of comparable spending between the CE and PCE.
[Line chart: CE to PCE ratio (vertical axis, 0.5 to 1) by year (horizontal axis, 2003 to 2009), with series for Total, Durables, Nondurables, and Services.]
NOTE: CE = Consumer Expenditure Surveys, PCE = Personal Consumption
Expenditures.
SOURCE: Passero (2011).
Bee, Meyer, and Sullivan (2012) looked further at comparing the estimates
from both surveys separately to the PCE. The authors examined
estimates for 46 expenditure categories for the period 1986–2010 that are
comparable to the PCE for one or both of the CE surveys. Table 5-1 shows
the 10 largest expenditure categories for which these separate comparisons
can be made, showing ratios of the CE to PCE for these categories. Among
these categories, six (imputed rent on owner-occupied nonfarm housing,
rent and utilities, food at home, gasoline and other energy goods, communication,
and new motor vehicles) are reported on the CE Interview survey at
a high rate (relative to the PCE) and have been roughly constant over time.
These six are all among the eight overall largest expenditure categories.
In 2010, the ratio of CE to PCE exceeded 0.94 for imputed rent, rent and
utilities, and new motor vehicles. It exceeded 0.80 for food at home and
communication and is just below this number for gasoline and other energy
goods. In contrast, no large category of expenditures was reported at a high
rate (relative to the PCE) in the Diary survey that was also higher than the
equivalent rate calculated from the Interview survey. Reporting of rent and
utilities is about 15 percentage points higher in the Interview survey than
the Diary survey. Food at home is about 20 percentage points higher in the
Interview survey.4 Gasoline and other energy goods are about 5 percentage
points higher in the Interview survey and communication is about 10 percentage
points higher. The 2010 ratios for food away from home and furniture
and furnishings are close to a half for both the Interview and Diary
surveys. For clothing and alcohol, the Diary survey ratios are below 0.50,
but the Interview survey ratios are even below those for the Diary survey.
TABLE 5-1 CE/PCE Comparisons for the 10 Largest Comparable Categories, 2010
                                                                                 Ratios
PCE Category                                           PCE ($ millions)   Diary to PCE   Interview to PCE
Imputed rental of owner-occupied nonfarm housing              1,203,053                             1.065
Rent and utilities                                              668,759          0.797              0.946
Food at home                                                    659,382          0.656              0.862
Food away from home                                             545,579          0.519              0.506
Gasoline and other energy goods                                 354,117          0.725              0.779
Clothing                                                        256,672          0.487              0.317
Communication                                                   223,385          0.686              0.800
New motor vehicles                                              178,464                             0.961
Furniture and furnishings                                       140,960          0.433              0.439
Alcoholic beverages purchased for off-premises consumption      106,649          0.253              0.220
NOTE: CE = Consumer Expenditure Surveys, PCE = Personal Consumption Expenditures.
SOURCE: Bee, Meyer, and Sullivan (2012).
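The percentage-point gaps cited in this paragraph follow directly from the ratios in Table 5-1. As an illustration only, the Python sketch below recomputes them for the eight categories that have both a Diary and an Interview ratio. The dictionary layout and the function name are assumptions made for this example; the ratio values are copied from the table.

```python
# 2010 CE-to-PCE ratios from Table 5-1, as (Diary, Interview) pairs.
# Categories with only one reported ratio are omitted.
RATIOS_2010 = {
    "Rent and utilities": (0.797, 0.946),
    "Food at home": (0.656, 0.862),
    "Food away from home": (0.519, 0.506),
    "Gasoline and other energy goods": (0.725, 0.779),
    "Clothing": (0.487, 0.317),
    "Communication": (0.686, 0.800),
    "Furniture and furnishings": (0.433, 0.439),
    "Alcoholic beverages (off-premises)": (0.253, 0.220),
}


def interview_minus_diary(ratios):
    """Return the Interview-minus-Diary gap per category, in percentage points."""
    return {
        category: round((interview - diary) * 100.0, 1)
        for category, (diary, interview) in ratios.items()
    }


gaps = interview_minus_diary(RATIOS_2010)

# List categories from the largest Interview advantage to the largest Diary advantage.
for category, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    higher = "Interview" if gap > 0 else "Diary"
    print(f"{category:<36} {gap:+6.1f} points ({higher} higher)")
```

A positive gap means the Interview survey ratio is higher; a negative gap means the Diary survey ratio is higher, as is the case for clothing and alcohol.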
The panel next looked at smaller expenditure categories that are comparable
between the PCE and the CE. Of the 36 such categories, only six
in the Interview and five in the Diary have a ratio of at least 0.80 in 2010.
In the Diary survey household cleaning products and cable and satellite
television and radio services were reported with a high rate (comparable
to the PCE). Household cleaning products had a ratio (relative to the PCE)
of 1.15 in 2010 in the Diary survey; the ratio has not declined appreciably
in the past 20 years. The largest of these categories reported with a high
rate (comparable to the PCE) in the Interview survey were motor vehicle
accessories and parts, household maintenance, and cable and satellite television
and radio services. The remaining categories were reported at low
rates (compared to the PCE) in both surveys with ratios below one-half.
These include glassware, tableware, and household utensils and sporting
equipment. Gambling and alcohol had especially low ratios, below 0.20
and 0.33, respectively, in both surveys in most years.
4 There is some disagreement about how to interpret the fact that food at home from the
CE Interview survey compares more favorably to PCE numbers than does food at home from
the CE Diary survey. Some have argued that the CE Interview survey numbers may include
nonfood items purchased at a grocery store. Battistin (2003) argued that the higher reporting
of food at home for the recall questions in the Interview component is due to overreporting,
but Browning, Crossley, and Weber (2003) stated that this is an open question.
Summary of Comparisons with the PCE
The overall pattern indicates that the estimates for larger items from
the CE are closer to their comparable estimates from the PCE. The current
Interview survey estimates these larger items more closely to the PCE than
does the current Diary survey. For the 36 smaller categories, neither the
Interview survey nor the Diary survey consistently produces estimates that
have a high ratio compared to the PCE. The categories of expenditures
that had a low rate (compared to the PCE) tended to be those that involve
many small and irregular purchases, categories of goods for specific family
members (clothing), and categories for which individuals might want to
underestimate purchases (alcohol, tobacco). Large salient purchases (like
automobiles), and regular purchases (like rent and utilities) for which the
Interview survey was originally designed, seem to be well reported. These
patterns have been largely evident since the 1980s or even earlier. However,
over the past three decades, there has been a slow decline in the level of
reporting of many of the mostly smaller categories of expenditures in both
the Interview survey and the Diary survey.
Similar results are reported from Canada. Statistics Canada’s consumption
survey was redesigned with both a recall survey and diary, with
partial implementation in 2009. The level of expenditures from the diary
was found to be 14 percent less than the recall interview for less frequent
expenses and 9 percent less for frequent expenditures. Incomplete diaries
contributed to the underestimation, given that 20 percent of diary days
were “nonresponded” days (Dubreuil et al., 2011).
The panel reiterates that there are many differences between the CE and
the PCE, and it does not consider the PCE to be truth. Nevertheless, the most
extensive benchmarking of the CE is to the PCE, so these results are informative.
Furthermore, when separate comparisons of the Interview survey and
the Diary survey to the PCE are available, the comparisons provide an indication
of the possible degree of relative underreporting in the two surveys.
Comparisons Between the CE and Other Sources
There have been comparisons of the CE to a number of other sources.
Most are summarized on the BLS Comparisons Web page.5 These comparisons
include, but are not limited to: utilities compared to the Residential
Energy Consumption Survey (RECS); food at home compared to trade publications
Supermarket Business and Progressive Grocer; and health expenditures
compared to the National Health Expenditure Accounts (NHEA)
and the Medical Expenditure Panel Survey (MEPS). Some of the findings
are presented below.
5 See http://www.bls.gov/cex/cecomparison.htm.
The CE’s estimates for utilities are compared to those generated by the
RECS. The populations of households from these two surveys are not identical,
but fairly consistent. The RECS collects most information on utilities
directly from utility companies after obtaining permission from the sampled
households. Between 2001 and 2005, the CE estimates of total expenditures
for residential energy were between 7 and 9 percent higher than from the
RECS. When the energy source was broken down, the CE was higher for
electricity and natural gas, while lower for the smaller category of fuel oil
and LP gas.
In 2007, the CE’s estimate for total health expenditures was 67 percent
of the total out-of-pocket health expenditures estimated from the NHEA.
The NHEA is based on a broader population definition than is the CE, and
the differences between its estimates and the CE may be affected by the
population differences plus the concepts, context, and scope of data collection.
When compared to the MEPS, the CE estimates were lower for total
health expenditures, with comparison ratios similar to those for the NHEA.
Comparisons were made between total food at home from the CE and
grocery trade association data from Supermarket Business and Progressive
Grocer. During the 1990s, the CE estimate was consistently between 10
percent and 20 percent higher than the trade association data.
Summary of Comparisons with Other Sources
The panel was not charged with evaluating the error structure of the
PCE or other relevant sources of administrative data. However, the above
analysis provides important background for making decisions about the
CE redesign. It indicates that the concerns about underreporting of expenditures
in both the CE Diary and CE Interview surveys are warranted. For
many uses of the CE, any underreporting is problematic. However, for the
use in calculating CPI budget shares, the differential underreporting that is
strongly indicated by these results, and discussed in more detail on p. 105
of Chapter 5, “Disproportionate Nonresponse,” is especially problematic.
In principle, an attentive, motivated respondent could report a particular
expenditure—a pound of tomatoes for a certain price—concurrently with
better accuracy than in a recall survey. This potential is not evident from
the estimates of aggregate spending obtained from the current designs of
the CE Interview and Diary surveys. The above analysis indicates that there
are issues with both the CE Diary and CE Interview surveys, leading to the
need for them to be assessed and redesigned. As a result, the panel reached
this conclusion:
Conclusion 5-1: Underreporting of expenditures is a major quality
problem with the current CE, both for the Diary survey and the Interview
survey. Small and irregular purchases, categories of goods for
specific family members, and items that may be considered socially
undesirable (alcohol and tobacco) appear to suffer from a greater percentage
of underreporting than do larger and more regular purchases.
The Interview survey, originally designed for these larger categories,
appears to suffer less from underreporting than does the Diary survey
in the current design of these surveys.
MEASUREMENT DIFFERENCES BETWEEN
THE INTERVIEW AND DIARY
Before examining potential sources of response errors in the Interview
survey and Diary survey separately, this section considers whether these two
independent surveys, as currently designed, are inherently comparable in
the information that each collects. In the section above, the panel raised its
concern about basic comparability of expenditure categories when comparing
to the PCE. Here, the report explores another aspect of comparability.
It is important to remember the purposes for which the two surveys
were originally designed. The Diary was designed to gather information
on the myriad of frequent, small purchases made on a daily basis. These
items include food for home consumption and other grocery items such as
household cleaning and paper products. The Diary also is the source of
expenditures for some clothing purchases, small appliances, and relatively
inexpensive household furnishings, as well as the source of estimates on
food away from home. The Interview, on the other hand, was designed to
produce estimates for regular monthly expenditures like rent and utilities.
It was designed to capture major expenditures, including those for large appliances,
vehicles, major auto repair, furniture, and more expensive clothing
items. Given the very different purposes of the two surveys, it is not surprising
that they have entirely different designs and, hence, different problems.
Differences in Questions, Context, and Mode
A broad base of literature in survey research has identified many factors
that can independently affect the accuracy of answers to survey questions.
Some of the most important include the following:
• Different question wording is likely to produce different responses
(Groves et al., 2004).
• The context in which questions are asked (for example, the purpose
of the survey as it is explained to the respondent and the
order in which questions are asked) influences what respondents
will report (Tourangeau and Smith, 1996; Tourangeau, Rips, and
Rasinski, 2000).
• Survey mode influences answers. For example, the literature demonstrates
that in-person interviews are more likely to produce socially
desirable answers that put the respondent in a more favorable
light (Dillman, Smyth, and Christian, 2009).
• For self-administered diaries, the visual layout and design can have
a dramatic effect on respondent answers (Christian and Dillman,
2004; Tourangeau, Couper, and Conrad, 2007).
These influences are realized as respondents go through the well-established
cognitive process of comprehending the question and concluding
what they are being requested to do, retrieving relevant information
for formulating an answer, deciding which information is appropriate and
adequate, and reporting. It is well documented that errors may occur at
each of these stages (Tourangeau, Rips, and Rasinski, 2000).
As noted earlier, Bee, Meyer, and Sullivan (2012) found that the levels
of reported expenditures for certain purchases are consistently different in
the Interview survey and the Diary survey. Although the Interview survey
generally yields larger expenditures, these differences are not consistently
in the same direction. For example, food purchased away from home, payments
for clothing and shoes, and purchases for alcoholic beverages are
greater from the Diary. Expenditures for rent and utilities, food at home,
and gasoline and other energy goods are larger from the Interview. Some
argue that larger is simply more accurate, but that may not be the case. The
panel has not said that either approach or type of question is inherently
better or worse. However, it is appropriate to illuminate these differences
more closely.
Different questions are asked in the Interview and the Diary surveys,
and these different questions are also asked in different survey contexts.
To illustrate this, consider the category of food and drink at home to see
how each survey collects this information. This is one of the categories for
which the Diary was designed.
The Interview survey asks the following questions:
• What has been your or your household usual WEEKLY expense
for grocery shopping? (Include grocery home-delivery service fees
and drinking water delivery fees.)
[Pages 79–97 are not included in this excerpt.]
General Instructions
Fill out this diary for an entire week, writing down EVERYTHING you and the
people on your list spend money on each day – the products you buy, the
services you use, the household expenses you have during the week – no matter
how large or small they are.
We recommend that you record your expenses each day.
Think about where you went and what you’ve done.
Talk to the people on your list every day to find out how they spent their money.
Include payments by:
Cash Credit/Debit Card Automatic Withdrawal/Payroll Deduction
Check Money Order Store Charge Card
Food Stamps WIC Voucher Grocery Certificate
Keep receipts and other records so that you will remember to record what you
bought or paid for. Use the pocket at the back of the diary to store them.
Some record types include:
Receipts Bank Statements Catalog/Internet Order Invoices
Utility Bills Telephone Bills Credit Card Statements
Pay Stubs
Include items that you bought for people who are not on your list, such as gifts.
Refer to the flap attached to the front cover for Examples of Expenses.
Refer to the flap attached to the back cover for answers to Frequently Asked Questions.
Do NOT record:
♦ Expenses of people on your list while they were away from home overnight.
♦ Business or farm operating expenses
♦ Sales tax for:
Part 2. Food and Drinks for Home Consumption
Part 3. Clothing, Shoes, Jewelry, and Accessories
Part 4. All Other Products, Services, and Expenses
2 FORM CE-801 (7-1-2005)
FIGURE 5-5 Diary booklet, page 2.
SOURCE: Bureau of Labor Statistics.
Recording Expenditures May Be Problematic
The diary instructions focus on recording expenditures each day during
the diary week, separately for different categories. A total of 28 pages of
the booklet are laid out by “day” and consist of labeled tables for recording
household expenditures made in each of four categories:
1. Food and Drinks Away from Home
2. Food and Drinks for Home Consumption
3. Clothing, Shoes, Jewelry, and Accessories
4. All Other Products, Services, and Expenses
Nine additional pages are included for reporting any information that will
not fit on the individual day pages.
The diary-keeper is to record each expenditure on the correct page by
day and expenditure category. For each item recorded, the form asks for a
description of the item, the cost, plus additional information that differs for
each of the four expenditure categories. For example, the tables for Clothing,
Shoes, Jewelry, and Accessories ask for additional information on the
individual for whom the item was purchased: gender, age, and whether the
person was a member of the household. The tables for Food and Drinks
Away from Home also ask where the item was purchased, whether the
item included alcoholic beverages, and a breakout of the cost of any such
alcohol.
Research has shown that respondents are influenced by much more
than words on how to complete questionnaires; a mostly linear path with
guidance from numbers, graphics, and symbols also helps to instruct re-
spondents on how a questionnaire (or diary) is designed to be completed
(Dillman, Smyth, and Christian, 2009). The current in-person delivery and
retrieval process is designed to compensate for some of these problems.
The field representative may look at the receipts collected by the household
respondent and ask other questions to find out whether all expenditures
have been recorded. Knowing from the initial visit that the interviewer will
call mid-week and return to pick up the diary at the end of the week would
seem to encourage respondents to think about their daily expenditures and
to be able to recall and report them at the end-of-week contact.
However, this mitigation process is not always followed.
Other general problems can occur with the diary. Some shopping trips
require complex reporting of many and varied items on the diary forms.
For example, a major grocery-shopping trip may take considerable time and
effort to record. Each item purchased must be itemized separately with a
description and cost. (The respondent has to figure out whether to record
before or after the food is put away.) Receipts are often limited to
abbreviations and codes that may not be understandable to the person who made
the purchase and/or is completing the diary.
Part of the diary placement visit is for the field representative to “size
up” the respondent as to whether he or she understands how to complete
the diary and seems committed to do so. If the respondent does not appear
to understand the instructions and the use of the daily recording forms,
some field representatives will revert to an alternative approach and ask
such a respondent to merely keep all of the household receipts for the
week’s expenditures in a pocket of the inside back cover. On the next visit
a week (or two weeks) later, the field representative and the respondent
will go through the receipts and fill in the diary forms together. The panel
learned that this approach is likely used for a significant number of house-
holds in which the respondent finds keeping the diary too difficult to do.
Conclusion 5-13: It is likely that the current organization of recording
expense items by “day of the week” makes it more difficult for some
respondents to review their diary entries and assess whether an expen-
diture has been missed.
Reporting Period for the Diary Survey
The Diary survey has a one-week reporting period, followed imme-
diately by a second wave also consisting of a one-week reporting period.
The Diary survey was conceived as a vehicle for collecting smaller and fre-
quently purchased items that were unlikely to be reported accurately over
a three-month recall period. However, in practice, the Diary collects a wide
variety of expenditure items. Since many types of expenditures are made
infrequently, and others are not purchased in the same amount each week,
Diary expenditure estimates for these variables are likely to be more vari-
able than those from the Interview survey with its three-month reporting
period. For example, Bee, Meyer, and Sullivan (2012) found that for 2010
the weighted average coefficient of variation of spending reports on 35 cat-
egories of expenditures common to both the Interview and Diary was nearly
60 percent higher for a typical Diary response than for a typical Interview
response.9 One reason is that, in 2010, close to 10 percent of weekly diaries
that were considered as valid observations reported no in-scope spending at
all. (About 75 percent of these reports were out-of-scope because the fam-
ily was on a trip for the week.) Consequently, a larger number of weekly
diaries is required to equal the precision of the quarterly interviews.
9 Their comparisons adjusted for the different sample sizes of the Diary and
Interview surveys. The coefficient of variation is the standard error of a
mean or other statistic expressed as a percentage of the statistic.
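The definition of the coefficient of variation, and what a higher
coefficient implies for the number of weekly diaries needed, can be sketched
in a few lines of Python. This is an illustration only. It rests on a
simplifying assumption that is not made in the source: responses are treated
as independent draws, so the standard error of a mean shrinks with the square
root of the number of responses. The function names are hypothetical.

```python
def coefficient_of_variation(standard_error, estimate):
    """Standard error expressed as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined when the estimate is zero")
    return 100.0 * standard_error / estimate


def relative_sample_size(cv_ratio):
    """Factor by which the number of responses must grow to offset a
    per-response CV that is `cv_ratio` times larger.

    Assumption (not from the source): independent observations, so the
    CV of a mean falls with the square root of the number of responses.
    """
    return cv_ratio ** 2


# A per-response CV nearly 60 percent higher for the Diary than for the
# Interview corresponds to a ratio of at most about 1.6.
diary_to_interview_cv_ratio = 1.6
factor = relative_sample_size(diary_to_interview_cv_ratio)
print(f"Diary responses needed per Interview response: about {factor:.2f}")
```

Under that assumption, somewhat more than two weekly diaries would be needed
to match the precision of one quarterly interview for a typical category.
The actual figure depends on the sample design and on the category.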
A related conceptual issue is that the one-week reference period in the
current Diary survey may be too short to accurately measure an individual’s
normal spending pattern. While these errors may average out in the
calculation of means, several important uses of the CE require the
measurement of the distribution of spending.
Proxy Reporting in the Diary Survey
Respondents are asked to consult with other members of the household
during the week and to report expenditures for all members. The field rep-
resentative lists the names of the members on the inside cover of the diary.
These instructions are aimed at encouraging communication between the
person who agrees to complete the diary and others in order to facilitate
accurate reporting. As stated earlier, the diary mode does provide more
opportunity to confer with other household members over the week than
there is within a rushed recall interview. However, there are still issues with
proxy reporting in today’s households.
The changing structure of U.S. households, in which the adults in that
household are more likely to have separate incomes and expenditure pat-
terns, means that unless deliberate communication occurs expenditures may
be underreported. In addition, household members are not always open
with each other about what they have purchased or how much it cost. If
a member of the household does not want the person most responsible for
completing the diary to know of the expenditure or its cost (e.g., a teenager
downloading a new video game, the cost of an anniversary gift, or payment
of a parking ticket), the diary will probably miss the expense.
Conclusion 5-14: Although the diary protocol encourages respondents
to obtain information and record expenditures by other household
members during the two weeks, it is unclear how much of this happens.
NONRESPONSE
Comparison of Response Rates
In calculating response rates on the CE Interview survey, BLS uses out-
come information from each household for each wave (waves two through
five) as independent observations in the survey. For the Diary survey, BLS
counts each week of the two weeks of diary reporting by a household as
an independent observation. The “CE program defines the response rate
as the percent of eligible households that actually are interviewed for each
survey” (Johnson-Herring and Krieger, 2008, p. 21). These calculations
exclude cases where the household is ineligible.
BLS sometimes refers to the rates described by Johnson-Herring and
Krieger as the “collection rates” since they are computed by the Census
Bureau immediately following data collection. In post-collection processing,
BLS removes some data records because they are found to have limited data
entries and expenditure dollar amounts. BLS then recalculates the response
rates, and sometimes refers to these adjusted rates as “estimation rates.”
Johnson-Herring and Krieger do not discuss these adjustments in their
description of methodology. In comparing the “collection rates” with the
“estimation rates” one sees that the adjustments affect the Diary rates more
than the Interview rates. BLS generally provides the “estimation rates” as
the response rates to its microdata users. After some consultation with
BLS, the panel has concluded that the adjusted “estimation rates” more
closely describe the usable response, and has decided to use those as the
response rates for the purpose of this report.
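The distinction between the two rates can be made concrete with a short
Python sketch. It is not the BLS or Census Bureau procedure. The outcome
codes, the "usable" flag, and the choice to keep the same eligible
denominator for both rates are assumptions made for illustration, and the
counts are invented.

```python
def response_rates(observations):
    """Return (collection_rate, estimation_rate) in percent.

    Each observation is one household-wave (Interview) or one
    household-week (Diary), given as a dict with hypothetical keys:
      "outcome": "interviewed", "nonresponse", or "ineligible"
      "usable":  False if post-collection processing dropped the record
    Ineligible cases are excluded from the denominator of both rates.
    """
    eligible = [o for o in observations if o["outcome"] != "ineligible"]
    if not eligible:
        raise ValueError("no eligible observations")
    interviewed = [o for o in eligible if o["outcome"] == "interviewed"]
    usable = [o for o in interviewed if o.get("usable", True)]
    collection_rate = 100.0 * len(interviewed) / len(eligible)
    estimation_rate = 100.0 * len(usable) / len(eligible)
    return collection_rate, estimation_rate


# Invented outcomes for illustration; these are not CE data.
sample = (
    [{"outcome": "interviewed", "usable": True}] * 72
    + [{"outcome": "interviewed", "usable": False}] * 3
    + [{"outcome": "nonresponse"}] * 25
    + [{"outcome": "ineligible"}] * 10
)
collection, estimation = response_rates(sample)
print(f"collection rate: {collection:.1f}%, estimation rate: {estimation:.1f}%")
```

Because records removed in post-collection processing count as nonresponse
in this sketch, the estimation rate can never exceed the collection rate.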
Thus, as reported in Chapter 3, the CE Interview survey had a response
rate (estimation rate) in 2010 of 73 percent, slightly ahead of the Diary
survey, which had a response rate of 72 percent.
Both surveys have experienced declines in response rates over time,
a problem that has plagued most government household surveys. The re-
sponse rates (estimation rates) for the Diary survey have been slightly lower
than those for the Interview survey (see Figure 5-6). The CE is a burden-
some survey, and the overall response rate is lower than several well-known
but less burdensome surveys such as the Current Population Survey (92%)
and the CPS Annual Demographic Survey (80% to 82%).
FIGURE 5-6 Response rates (estimation rates) for Consumer Expenditure Interview
and Diary surveys, 2005 to 2010 (vertical axis: percent; horizontal axis: year).
SOURCE: Bureau of Labor Statistics, data table provided to panel.
However, the CE’s response rate is comparable to consumption surveys
in other countries, which have experienced similar declining response rates
during the period between 1990 and 2010. Figure 5-7 depicts the response
rate to the CE (United States) compared to response rates for compa-
rable consumption surveys in Australia, Canada, and the United Kingdom
(Barrett, Levell, and Milligan, 2011). The CE response rate is somewhat
higher than the others.
Panel Attrition
Given that the CE Interview survey uses a rotating panel design with
five waves of data collection per panel, an additional response concern
relates to attrition over the life of a panel (Lepkowski and Couper, 2002).
King et al. (2009) studied the pattern of participation in the CE. In this
study, they looked at a single cohort first interviewed in April–June 2005.
Among the households that completed the first wave interview, 78.6 per-
cent were classified as complete responders (data for all five waves were
captured), 14.1 percent were classified as attritors (completed one or more
interviews before dropping out), and 7.3 percent were classified as
intermittent responders (completed at least one but not all waves of data
collection). It is not clear whether the response rates reported above are for
all cohorts in a particular quarter (i.e., averaging across different panels),
but the drop-off after the first wave of data collection raises both concerns
and opportunities.
FIGURE 5-7 Response rates for consumption surveys across four western countries.
SOURCE: Barrett, Levell, and Milligan (2012).
A key concern is that the decision to attrite may be influenced by the
experience of the first wave, and that experience may be different depending
on the number and types of expenditures reported in that wave. If this is
the case, the attrition can potentially affect the estimation of expenditures
during later waves. In other words, to what extent is later wave participa-
tion affected by reported expenditures (or expenditure patterns) in the first
interview, holding constant the usual demographic variables that are used
in weighting adjustments? The panel is not aware of research exploring this
potential source of bias.
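One way to begin exploring this question is to classify households by their
pattern of participation and compare their first-wave reports. The Python
sketch below does this under stated assumptions. The classification rules
are one reading of the three categories described above, the field names
are hypothetical, and the data are invented. A real analysis would also
hold constant the demographic variables used in weighting, which this
sketch does not attempt.

```python
from collections import defaultdict
from statistics import mean


def classify_responder(completed):
    """Classify a household from five booleans (waves 1 through 5).

    Assumed reading of the categories:
      complete     - every wave completed
      attritor     - an unbroken run of completed waves starting at
                     wave 1, followed by permanent dropout
      intermittent - any other pattern with at least one completed wave
    """
    if len(completed) != 5:
        raise ValueError("expected one flag per wave (five waves)")
    if not any(completed):
        return "nonrespondent"
    if all(completed):
        return "complete"
    n_done = sum(completed)
    if all(completed[:n_done]):
        return "attritor"
    return "intermittent"


def mean_first_wave_spending(households):
    """Mean wave-1 spending by responder type, for wave-1 completers."""
    groups = defaultdict(list)
    for h in households:
        if h["completed"][0]:
            groups[classify_responder(h["completed"])].append(h["wave1_spending"])
    return {kind: mean(values) for kind, values in groups.items()}


# Invented households for illustration; these are not CE data.
households = [
    {"completed": [True] * 5, "wave1_spending": 9000.0},
    {"completed": [True, True, False, False, False], "wave1_spending": 12000.0},
    {"completed": [True, False, True, True, False], "wave1_spending": 10000.0},
    {"completed": [True] * 5, "wave1_spending": 11000.0},
]
print(mean_first_wave_spending(households))
```

A systematic gap between the first-wave spending of attritors and that of
complete responders, after controlling for the weighting variables, would
point to the kind of bias the panel describes.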
A key opportunity for the future: BLS can use households that provide
partial data (i.e., attritors or intermittent responders), along with their level
and pattern of expenditures, in the adjustment for missing waves. It is the
panel’s understanding that the nonresponse adjustments employed by BLS
use post-stratification adjustment to population controls in a cross-sectional
adjustment and do not make use of the household’s expenditure data from
other interviews. While this may be hard to do given the need to produce
quarterly estimates, the panel nature of CE interview data provides a rich
source of information to better understand patterns of nonresponse and
potential nonresponse bias. Such information can also be used to target ef-
forts to minimize nonresponse error in responsive design strategies (Olson,
2011).
The Diary survey also incorporates a panel design that interacts with
the issue of attrition. Each selected household is asked to complete two
waves of data collection, each wave being a one-week diary. Even though
the waves are adjacent weeks from the same households, the estimation
process considers each wave as an independent collection. A general is-
sue with diary surveys is that compliance tends to deteriorate as the diary
period progresses, that is, there is high compliance at the beginning of the
period and less toward the end. The panel has not seen specific research
on this issue for the CE Diary, but it is likely that there is less compliance
during the second wave than in the first wave. This may be particularly true
in households in which both diaries are placed at the same time without an
intervening visit from the field representative. Without adjustment during
the estimation process, it is possible that the lower reported expenditures
in wave 2 will bring down the overall level of expenditures reported from
the Diary survey.
Disproportionate Nonresponse
Having a high proportion of the initially sampled individuals respond
to the survey may lead to higher quality data; however, it is neither a suf-
ficient nor necessary condition. King et al. (2009) reported on four studies
to examine potential nonresponse bias in the CE. One of these studies was
discussed above. Even though the studies showed that the nonresponse was
not missing completely at random (African Americans are underrepresented
among the respondents, and those over 65 years old are overrepresented),
they did not find measurable bias in estimating total expenditures due to
the nonresponse.
A more recent study suggests a potential bias related to underrepre-
sentation of the highest income households (Sabelhaus et al., 2011). The
authors began by comparing CE estimates of income and the distribution
of income with other relevant data sources such as the Current Population
Survey, the Survey of Consumer Finances, and tax return–based datasets
from the Statistics of Income. These comparisons show that the CE has
relatively fewer completed surveys from households with income $100,000
or greater. The authors also showed that the average income estimated per
household for this highest income group is substantially below the esti-
mated average from these other sources.10
The authors demonstrated by comparing the CE sample to geocoded
tax data that higher income units are less likely to respond on the CE and
are underrepresented even after weighting, while units at lower levels of
income are adequately represented. While there is not a large difference in
the total population counts, the underrepresentation of the upper income
groups could lead to an undercount of income in the higher income levels
and consequently also to understating the aggregate level of spending. The
authors are concerned because these high-income households may have a different
mix of expenses when compared to other households. This differential
could affect the relative budget shares calculated for the CPI. The authors
speculate that a significant portion of the difference between CE aggregate
spending and PCE spending might be accounted for by the nonresponse of
higher income consumer units. They concluded that:
Only the very highest income households seem to be under-represented in
the Consumer Expenditure Survey (CE), and the mystery of overall under-
reported spending in the CE is not fully explained by that shortcoming.
At least some of the shortfall in aggregate CE spending seems attributable
by under-reported spending by at least some CE respondents. (Sabelhaus
et al., 2012, p. 21)
10 Greenlees, Reece, and Zieschang (1982) carefully analyzed nonignorable
income nonresponse (nonresponse related to the variable being imputed) using
matched microdata from the CPS and IRS. They demonstrated clearly the problem
that nonignorable nonresponse imposes.
Conclusion 5-15: Nonresponse is a continuing issue for the CE as it
is for most household surveys. The panel nature of the CE is not suf-
ficiently exploited for evaluating and correcting either for nonresponse
bias in patterns of expenditure or for lower compliance in the second
wave of the Diary survey. Nonresponse in the highest income group
may be a major contributing factor to underestimates of expenditures.
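The mechanism by which disproportionate nonresponse in the highest income
group can lower an expenditure estimate is illustrated by the following
Python sketch. All population counts, response rates, and spending levels
are invented for illustration and are not CE estimates. The sketch also
assumes that respondents report their spending accurately, so that the only
source of error is who responds.

```python
def population_mean(groups):
    """Mean spending if every household in every group responded."""
    households = sum(g["population"] for g in groups)
    spending = sum(g["population"] * g["mean_spending"] for g in groups)
    return spending / households


def respondent_mean(groups):
    """Unadjusted mean spending among the households that respond."""
    responders = sum(g["population"] * g["response_rate"] for g in groups)
    spending = sum(
        g["population"] * g["response_rate"] * g["mean_spending"]
        for g in groups
    )
    return spending / responders


# Invented groups for illustration; these are not CE data.
groups = [
    {"name": "income under $100,000", "population": 80,
     "response_rate": 0.75, "mean_spending": 40000.0},
    {"name": "income $100,000 or greater", "population": 20,
     "response_rate": 0.50, "mean_spending": 100000.0},
]
print(f"population mean: {population_mean(groups):,.0f}")
print(f"respondent mean: {respondent_mean(groups):,.0f}")
```

With these invented inputs the respondent mean falls below the population
mean because the higher-spending group responds at a lower rate. Weighting
respondents by income group would remove the gap in this sketch, but only
because income group is assumed to be known for every household.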
ISSUES REGARDING NONEXPENDITURE DATA
The use of the CE data for economic research provides the impetus for
collecting data on demographics, income, investment, and savings at the
household level in the CE. The panel identified several issues in the current
CE with these types of data that make the research process more difficult.
Reporting Periods for Income, Employment, and Expenditures
Ideally, researchers would like to have expenditures, income, and em-
ployment for responding households collected over the same reporting
periods. The current Interview survey collects expenditure information for
each of four quarters during a year. Income and employment information
is collected for the previous 12 months, but only during the second and
fifth interview. The current Diary survey collects expenditure data for two
weeks, but income and employment data for the previous 12 months. The
inconsistency of the collection periods for these different types of data can
make it difficult to reconcile large differences in expenditure and income at
the household level when these differences occur. While researchers have
expressed the importance of having all data (including expenditures and
income) collected over the same reference period, some panel members
have expressed the opinion that it is also important to allow respondents
to report for a period for which they can do so most accurately.
Demographics and Life Events
Examining the impact of a variety of stimuli on household spending
is important in the economic research done using the CE data. When
changes occur, it is difficult to reconcile changes in expenditure and income
without information about whether an individual household has undergone
a major life event (e.g., marriage, divorce, or change in employment status)
sometime during the year. The current CE collects relatively limited
information on these major life events. For example, the CE Interview survey
does not collect changes in job status (and the reason for those changes)
between survey waves.
Linking with Administrative Data Sources
The ability to link CE data to relevant administrative data sources (such
as IRS data or data on program participation) could provide additional
richness for economic research as well as providing potential avenues to in-
vestigate the impact of nonresponse on the survey results. Data confidential-
ity procedures have presented barriers to such linkage. Some success in this
area has been achieved by some other federal surveys that ask respondents’
permission to match their survey responses with administrative data. Some
surveys have experimented with an “opt out” version, where respondents
can say that they do not want the matching to occur (Pascale, 2011). These
strategies would be useful to try for the CE.
Conclusion 5-16: For economic analyses, data on income, saving, and
employment status are important to be collected on the CE along with
expenditure data. Aligning these data over time periods, and collecting
information on major life events of the household, will help researchers
understand changes in income and expenditures of a household over
time. Linkage of the CE data to relevant administrative data (such as
the IRS and program participation) would provide additional richness,
and possibly provide avenues to investigate the effect of nonresponse.
SUMMARY OF REASONS TO REDESIGN THE CE
This chapter specifically addresses the issues upon which the panel
bases its recommendations in Chapter 6 to redesign the CE. The CE surveys
appear to suffer from underreporting of expenditures. This conclusion is
based on comparison of the CE estimates to several sources, but primar-
ily to the PCE. The panel does not consider the PCE as “truth,” but does
consider the results informative. The comparisons were made considering
categories of expenditures with comparable definitions between the CE and
PCE. The overall pattern indicates that the estimates for larger items from
the CE are closer to their comparable estimates from the PCE. The current
Interview survey estimates these larger items more closely to the PCE than
does the current Diary survey. For 36 smaller categories, neither the Inter-
view survey nor the Diary survey consistently produces estimates that have
a high ratio compared to the PCE. Thus, the panel concluded that there are
underreporting issues with both the CE Diary and CE Interview surveys and
proceeded to review response and nonresponse issues that could contribute
to this underreporting.
Before examining sources of potential response errors in the Interview
survey and Diary survey separately, the panel observed that the mode,
questions, and context used in the Interview survey and Diary survey are
quite different. Therefore, one ought to expect differences, both in issues
that need to be addressed and in the estimates obtained. It is therefore not
surprising that different amounts are reported in the Interview and Diary
in this situation, and that these differences are not always in the same
direction.
The panel examined potential sources of response error in both surveys.
They concluded that both the Interview and Diary surveys have issues with:
• motivation of respondents to report accurately,
• structure of data collection instruments that leads to reporting
problems,
• recall or reporting period, and
• proxy reporting.
Additionally, the panel expressed concern about the infrequent use of records
in the Interview survey, an issue that is less relevant to a concurrent mode
of collection such as the Diary.
The Interview and Diary surveys have similar response rates, 73 and 72
percent, respectively. These rates are lower than for some important federal surveys, but
appear to be better than consumer expenditure surveys in some other west-
ern countries. There is concern about attrition within the panel designs for
both surveys, as well as concern about the effect of disproportionate non-
response from the segment of the population in the highest income groups.
In sum, there are response and nonresponse issues with both the concur-
rent (Diary) and recall (Interview) collection of data in the CE as currently
implemented. The panel does not conclude that one method is intrinsically
better or worse than the other. However, it does believe that different ap-
proaches to these methods have the potential to mitigate these problems.