The Consumer Expenditure Surveys (CE) are the only source of information on the complete range of consumers’ expenditures and incomes in the United States, as well as the characteristics of those consumers. The CE consists of two separate surveys—a national sample of households interviewed five times, at three-month intervals; and a separate national sample of households that complete two consecutive one-week expenditure diaries. For more than 40 years, these surveys, the responsibility of the Bureau of Labor Statistics (BLS), have been the principal source of knowledge about changing patterns of consumer spending in the U.S. population.
In February 2009, BLS initiated the Gemini Project, the aim of which is to redesign the CE surveys to improve data quality through a verifiable reduction in measurement error with a particular focus on underreporting. The Gemini Project initiated a series of information-gathering meetings, conference sessions, forums, and workshops to identify appropriate strategies for improving CE data quality. As part of this effort, BLS requested the National Academies’ Committee on National Statistics (CNSTAT) to convene an expert panel to build upon the Gemini Project by conducting further investigations and proposing redesign options for the CE surveys.
The charge to the Panel on Redesigning the BLS Consumer Expenditure Surveys includes reviewing the output of a Gemini-convened Data User Needs Forum and Survey Methods Workshop and convening its own Household Survey Producers Workshop to obtain further input. In addition, the panel was requested to commission options from contractors for consideration in recommending possible redesigns. The panel was further
asked by BLS to create potential redesigns that would put a greater emphasis on proactive data collection to improve measurement of consumer expenditures. This report summarizes the deliberations and activities of the panel. As summarized below and described more fully in its report, the panel drew four conclusions about the uses of the CE and 16 conclusions about why a redesign is needed. The panel also made 12 recommendations about future directions.
PURPOSES OF THE CE SURVEYS
The CE serves several important purposes. The most visible is for calculating the Consumer Price Index (CPI), one of the most widely used statistics in the United States. Calculating the CPI involves multiple data sources. The CE data provide budget shares (weights) for detailed expenditure categories. Much of this detail is not available elsewhere.
Another important use is to provide data critical for administering certain federal and state government programs. For the continuing administration of many of these programs, the CE is the only continuing source of data with sufficient information on households’ demographic characteristics, spending, and income.
In addition, the completeness of the CE in measuring household demographics and consumer expenditures, in combination with repeated measurement over a year for the same households, makes it a cornerstone for policy analysis and economic research. Understanding the differential effects of policies and events on consumer expenditures of all types, and the consequences for people of different ages, races, and ethnicities, sizes of households, and regions, relies upon the CE.
WHY THE CE INCLUDES TWO SURVEYS
The modern version of the CE, with its two independent surveys, was first fielded in 1972–1973. It has been conducted annually since 1980 with the same underlying design concept: different methods of collection for different kinds of data.
The Interview survey was designed to collect expenditures that could be recalled over a three-month period. The focus was on large expenditures, such as property, automobiles, and major appliances, as well as regular expenditures, such as rent, utility bills, and insurance premiums. The Diary survey, on the other hand, was designed to obtain expenditures for smaller, frequently purchased items.
Over time, however, the Interview survey began to collect information on small, frequently purchased items, while the Diary now collects information on many larger items. Thus, the Interview and Diary now collect
information that allows estimates for certain expenditures to be made from either source, using different question wordings and time periods. BLS uses only data from a single source in its published estimates, selecting the “best” source for each item.
Approximately 7,100 interviews, each of which averages about 60 minutes, are conducted each quarter in the Interview survey, with five interviews for each household. Although most data are collected in household visits, an increasing proportion of the later interviews rely completely or partly upon telephone interviews. One-fifth of the sample is new each quarter, with a corresponding one-fifth of households completing the five-interview sequence. The Diary survey collects usable data from 7,100 households per year, each keeping two one-week diaries. Diary placements occur during 52 weeks of the year, with approximately 273 diaries being completed each week.
Most of the cost is associated with the Interview survey, which produces about 36,000 completed interviews per year, compared to about 14,000 one-week diaries. The total data collection cost for the CE surveys in 2010 was $21.2 million, with the Interview survey costing $17 million, or about 80 percent of the total.
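The diary volumes quoted above are internally consistent; the following minimal sketch (using only the figures given in this summary, purely as an arithmetic check) reproduces the weekly diary count:

```python
# Consistency check of the CE Diary survey volumes quoted above.
# All input figures come from this summary; the check itself is illustrative.

diary_households_per_year = 7_100   # households providing usable Diary data per year
diaries_per_household = 2           # two consecutive one-week diaries each

diaries_per_year = diary_households_per_year * diaries_per_household
diaries_per_week = diaries_per_year / 52   # placements occur during all 52 weeks

print(f"One-week diaries per year: {diaries_per_year:,}")     # 14,200 (~14,000)
print(f"Diaries completed per week: {diaries_per_week:.0f}")  # ~273
```

The roughly 14,200 diaries per year match the "about 14,000 one-week diaries" figure, and spreading them over 52 weeks yields the approximately 273 diaries completed per week.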
THE PANEL’S INVESTIGATION
The panel received input from a wide variety of sources. Investigations conducted by the Gemini Project provided critical background, and several panel members themselves use CE microdata. The panel also reviewed published research, held a session at the 2011 CE Microdata Users’ Conference, and studied the complexities of the CPI program and how the CE supports it.
Based on these investigations, the panel makes the following conclusions about the use of the CE. (The numbers represent the location of the conclusion in the full report; thus, more background on the conclusions below is in Chapter 4.)
Conclusion 4-1: The CPI is a critical program for BLS and the nation. This program requires an extensive amount of detail on expenditures, at both the geographic and product level, in order to create its various indices. The CPI is the current driver for the CE program with regard to the level of detail it collects. The CPI uses over 800 different expenditure items to create budget shares. The current CE supplies data for many of these budget shares. However, even with the level of detail that it currently collects, the CE cannot supply all of the budget shares used by the CPI. There are other data sources from which the CPI currently generates budget shares.
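To make concrete how budget shares enter an index of this kind, the sketch below shows a simple share-weighted (Laspeyres-style) aggregation of price relatives. The categories, shares, and price relatives are invented for illustration only; the actual CPI uses over 800 item categories and a considerably more elaborate methodology.

```python
# Illustrative only: how expenditure budget shares (weights) combine with
# price relatives in a Laspeyres-style index. All categories and numbers
# are invented and do not reflect actual CPI weights or methodology.

budget_shares = {"food": 0.15, "housing": 0.40, "transport": 0.20, "other": 0.25}
price_relatives = {"food": 1.03, "housing": 1.02, "transport": 1.05, "other": 1.01}

# Budget shares must sum to 1 to form a valid weighting scheme.
assert abs(sum(budget_shares.values()) - 1.0) < 1e-9

# Aggregate index: share-weighted average of category price relatives.
index = sum(budget_shares[c] * price_relatives[c] for c in budget_shares)
print(f"Aggregate index (base period = 1.0): {index:.4f}")
```

The point of the sketch is the dependence of the aggregate on the shares: the CE's role is to supply those weights at a fine level of category and geographic detail.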
Conclusion 4-2: The CPI does not utilize the panel nature of the current CE. Instead, the national and regional CE estimates that the CPI employs assume independence of households between quarters on the Interview survey, and independence between weeks on the Diary survey.
Conclusion 4-3: The administration of some federal programs depends on specific details collected from the CE. There are currently no other available sources of consistent data across years for some of these programs.
Conclusion 4-4: Economic researchers and policy analysts generally do not use CE expenditure data at the same level of detail required by the CPI. More aggregate measures of expenditures suffice for much of their work. However, many do make use of two current features of the CE microdata: an overall picture of expenditures, income, and household demographics at the individual household level; and a panel component with data collection at two or more points in time.
Most panel members experienced the CE Interview, the CE Diary, or both as respondents. These “field” experiences provided broad understanding when connected with insight from top methodological researchers through the Gemini Project’s CE Methods Workshop (December 2010). In addition, the panel studied findings from periodic debriefings of field representatives on how respondents formulate answers (e.g., use of records vs. no records) and the challenges respondents face in answering accurately.
The panel’s Household Survey Producers Workshop (June 2011) was organized around several critical topics, including consumer expenditure surveys in other countries and survey design experiences on other topics and issues. The workshop brought together U.S. and international presenters; university, private-sector, and government-sector experiences; and data collection experiences on a myriad of topics.
The next step was to commission two groups of researchers to develop potential redesigns for the CE surveys. Their proposals encouraged outside-the-box thinking on new collection strategies, technologies, and procedures. The two proposals were presented at a Redesign Options Workshop organized by the panel in October 2011.
Thus, the panel was challenged to bring together the diverse experiences of data users: from those who use it for the CPI to those who study consumer behavior. It was further challenged by the work of statisticians and survey methodologists who design sampling strategies and survey questionnaires to improve data quality in varied situations. In addition, it was challenged by the practical requirements of data collection and new ideas to improve data quality.
The CE surveys are long and arduous. In the Interview survey, the typical respondent answers without being able to consult other members of the household and only infrequently refers to records. The level of detail exceeds what a person can recall for a three-month period. In the Diary survey, respondents are asked to remember to record details of many small purchases and to list each expenditure separately in a complicated booklet.
In addition, consumer spending has changed dramatically over the past 30 years through such things as online shopping, electronic banking, payroll deductions, and greater use of debit and credit cards. Shopping in “big box” stores that sell a huge variety of items challenges people’s ability to recall the amount spent on specific categories of expenditures.
Comparisons show that reported expenditures in both the Interview and Diary surveys tend to be lower than the amounts suggested by the Personal Consumption Expenditures (PCE) data from the Bureau of Economic Analysis. Although there are important conceptual differences between the CE and PCE, the gaps suggest that both CE surveys underreport consumer expenditures.
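Comparisons of this kind are often summarized as coverage ratios: the CE aggregate for a category divided by the corresponding PCE benchmark, with ratios well below 1.0 taken as evidence of underreporting (subject to the conceptual differences between the two sources). A minimal sketch, with entirely invented figures:

```python
# Illustrative coverage-ratio comparison: CE aggregate / PCE benchmark.
# A ratio well below 1.0 suggests underreporting, subject to the conceptual
# differences between the two data sources. All figures are invented and
# are not actual CE or PCE estimates.

estimates = {                       # category: (CE estimate, PCE benchmark), $B
    "rent": (310, 330),
    "food at home": (560, 700),
    "alcohol": (40, 100),
}

ratios = {category: ce / pce for category, (ce, pce) in estimates.items()}
for category, ratio in ratios.items():
    print(f"{category:>14}: coverage ratio {ratio:.2f}")
```

In published research of this type, regular committed expenses (such as rent) typically show higher coverage than small, irregular, or socially sensitive purchases, which is the pattern the invented numbers above mimic.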
Conclusion 5-1: Underreporting of expenditures is a major quality problem with the current CE, both for the Diary survey and the Interview survey. Small and irregular purchases, categories of goods for specific family members, and items that may be considered socially undesirable (alcohol and tobacco) appear to suffer from a greater percentage of underreporting than do larger and more regular purchases. The Interview survey, originally designed for these larger categories, appears to suffer less from underreporting than does the Diary survey in the current design of these surveys.
Estimates derived from the Interview and Diary differ significantly for many expenditure categories in part because many questions are posed in quite different ways. For example, the Interview asks for an estimate of the household’s weekly expense for grocery shopping and then for portions spent for nongrocery items. In contrast, the Diary asks for a listing of individual food items purchased for home consumption during a specific week of the year.
Conclusion 5-2: Differences exist between the current Interview and Diary reports of expenditures. Differences in questions, context, and mode are likely to contribute to these differences. The error structures for the two surveys, and for different types of questions in the Interview survey, may be different. Because of these differences we cannot
conclude whether a recall interview or a diary is inherently a better mode for obtaining the most accurate expenditure data across a wide range of items. Both have real drawbacks, and a new design will need to draw from the best (or least problematic) aspects of both methods.
Sources of Underreporting in the Interview
The panel’s review suggests underreporting of expenditures may stem from a number of considerations, rather than a single cause. Asking respondents to spend more than five hours over the course of a year answering detailed questions about their expenditures is a substantial burden. The field representative, concerned about the respondent’s willingness to agree to additional interviews, may be hesitant to press too hard for accurate recall or the use of records. Under these conditions, it seems likely that the field representative and respondent both benefit from keeping the interview as short and pleasant as possible.
Conclusion 5-3: Motivational factors of both the respondent and field representative appear to negatively influence the quality of the CE Interview data. This leads the panel to the judgment that a changed incentive and support structure for both respondents and field representatives will be needed for a future CE redesign to motivate high-quality reporting and reduce fatigue.
It becomes apparent to Interview respondents that answering “Yes” to a particular question (e.g., “Did you purchase any pants, jeans, or shorts?”) leads to being asked a number of detailed, follow-up questions. The respondent is then asked whether they purchased other “pants, jeans, or shorts,” and the cycle begins again.
Conclusion 5-4: The current structure of the Interview questionnaire cycles down through global screening questions, and asks multiple additional questions when the respondent answers “Yes” to a screening question. As this cycle repeats itself, a respondent “learns” and may be tempted not to report an expenditure in order to avoid further questions.
Recall of specific detailed expenditures is further complicated because the item may be only one of several items in a single purchase. The diverse ways of purchasing, paying, or authorizing payment and the challenge of connecting specific expenditures to any payment records seem likely to encourage estimation rather than exact reporting.
Some questions on the Interview survey are particularly difficult, such as asking respondents to report exact amounts of savings or value of assets now compared to one year earlier and exact dates of purchases for specific items. Even respondents who keep meticulous records may find that their records are not organized to allow accurate answers to some questions.
Conclusion 5-6: Some questions on the current CE Interview questionnaire are very difficult to answer accurately, even with records.
It is not possible for many people to recall exact dates and amounts of expenditures over three months. Whereas certain items (e.g., house payments) may not vary and thus can be remembered, others (e.g., clothing and food away from home) may vary dramatically.
Conclusion 5-7: Three months is a long time for accurate recall of many items on the CE Interview survey. This situation is exacerbated by the ancillary details that are collected about each recalled expense. Errors of omission are likely to occur, and are a contributing factor to the underreporting of expenditures on this survey. Short recall periods, however, may produce more variability in the estimates and provide difficulties for economic research.
Field representatives report the use of records in the interview varies greatly. However, the proportion of respondents who never or only sometimes use records far exceeds the proportion that always or almost always does. Records are used even less when the interview is conducted by telephone.
Conclusion 5-8: The use of records is extremely important to reporting expenditures and income accurately. The use of records on the current CE is far less than optimal and varies across the population. A redesigned CE would need to include features that maximize the use of records where at all feasible and that work to maximize accuracy of recall when records are unavailable.
Field representatives attempt to interview the person most knowledgeable about expenditures. Most interviews do not involve others, and the respondent may not know certain expenditures made by other adult or teenage household members.
About one-third of the CE interviews, especially the later ones, are completed by telephone. These interviews result in fewer positive answers to screener questions and do not benefit from an information booklet designed to encourage recall when the field representative visits the household.
Conclusion 5-10: Telephone interviews appear to obtain a lower quality of responses than the face-to-face interviews on the CE, but a substantial part of the CE data are collected over the telephone.
Sources of Underreporting in the Diary
The Diary survey uses a proactive process that involves instructing respondents to report expenditures by all members of the household, asking that they record expenditures daily, and providing detailed instructions on how to complete the diary. However, evidence that important categories of expenditures are less well reported in the Diary than the Interview suggests the full potential of the Diary is not being realized. As with the Interview, several factors may affect the accuracy of Diary reporting.
Diary reporting asks for expenditures in four categories, with each entry asking for multiple pieces of information, placing considerable burden on respondents. Many find it time-consuming and difficult to partition receipts into the requested categories. Motivation to complete the diary appears to decline over the two-week period.
Conclusion 5-11: A major concern with the Diary survey is that respondents appear to suffer diary fatigue and lack motivation to report expenditures throughout the two-week data collection period, and especially to go through the process of recording all items in a large shopping trip.
Field representatives report that some respondents see the 44-page diary as too difficult to complete. These respondents are asked by the field representative to collect receipts instead, which are recorded in the diary during the second household visit.
Conclusion 5-12: A lot of information is conveyed to the diary respondent in a short amount of time. The organization of the diary booklet may result in considerable frustration among some individuals, who feel they cannot master the instructions. They choose instead to collect receipts and leave them for the field representative to enter during the follow-up visit.
The request to record expenditures by day and into broad categories requires respondents to flip pages back and forth as they move between instruction and recording pages. The diary lacks a clear navigational path, and the visual layout makes completing the diary difficult.
Conclusion 5-13: It is likely that the current organization of recording expense items by “day of the week” makes it more difficult for some respondents to review their diary entries and assess whether an expenditure has been missed.
The Diary survey has a short reporting period, which creates concerns regarding the collection of larger and less frequent expense items. It is also difficult to get a picture of an individual household’s normal spending pattern in only two weeks.
It is not known to what extent respondents seek or are able to obtain expenditures from other household members. Even if others are willing to provide such information, they may not provide it to the respondent in a timely manner.
Conclusion 5-14: Although the diary protocol encourages respondents to obtain information and record expenditures by other household members during the two weeks, it is unclear how much of this happens.
Response Rates Have Declined
Response rates in 2010 for the Interview survey were 73 percent and for the Diary survey, 72 percent. These rates have declined over time, as have response rates to most household surveys. Low response from high-income groups is a concern for both surveys.
Conclusion 5-15: Nonresponse is a continuing issue for the CE as it is for most household surveys. The panel nature of the CE is not sufficiently exploited for evaluating and correcting either for nonresponse bias in patterns of expenditure or for lower compliance in the second wave of the Diary survey. Nonresponse in the highest income group may be a major contributing factor to underestimates of expenditures.
In assessing both response quality and nonresponse, concerns exist about both the Interview and Diary modes. The panel did not conclude that one mode is intrinsically better or worse. However, it believes that different approaches to the use of both methods have the potential to mitigate these problems.
The ability to link CE data to relevant administrative data sources (such
as IRS data or data on program participation) could provide additional richness for economic research as well as providing potential avenues to investigate the impact of nonresponse on the survey results.
Conclusion 5-16: For economic analyses, it is important that the CE collect data on income, saving, and employment status along with expenditure data. Aligning these data over time periods, and collecting information on major life events of the household, will help researchers understand changes in income and expenditures of a household over time. Linkage of the CE data to relevant administrative data (such as IRS data and program participation records) would provide additional richness, and possibly provide avenues to investigate the effect of nonresponse.
PATHWAY TO A NEW SURVEY
The current detail and requirements imposed by the multiple and divergent CE data uses are difficult to satisfy efficiently within a single design, and the panel believes that tradeoffs must be made. The panel recommends a major redesign of the CE, with the first step to determine priorities among the data requirements of the many uses of the CE so tradeoffs can be made in a planned and transparent manner. Such prioritization is the responsibility of BLS and is beyond what would be appropriate or realistic for the panel to undertake.
Recommendation 6-1: It is critical that BLS prioritize the many uses of the CE data so that it can make appropriate tradeoffs as it considers redesign options. Improved data quality for data users and a reduction in burden for data providers should be very high on its priority list.
Recommendation 6-2: The panel recommends that BLS implement a major redesign of the CE. The cognitive and motivational issues associated with the current Diary and Interview surveys cannot be fixed through a series of minor changes.
The panel’s most effective course of action (prior to BLS’ priority-setting) is to suggest alternative designs to achieve different prioritized objectives. The panel developed three distinct prototype designs:
- Design A focuses on obtaining expenditure data at a detailed level through a “supported journal,” a diary-type self-administered data collection with tools that reduce recordkeeping while encouraging the entry of expenditures when memory is fresh and receipts available. Design A also has a self-administered recall survey to collect larger and recurring expenses. It collects a complete picture of household expenses over six months, with reporting periods varying by expense group.
- Design B uses a recall interview coupled with a short supported journal. It provides data for 96 expenditure categories (rather than the more detailed expenses provided by Design A) and collects complete expenditures over an 18-month period in three waves. It builds a dataset particularly useful for economic and policy analysis. This design also involves a small follow-on survey used to help understand measurement errors in the main survey.
- Design C incorporates elements of both Designs A and B. It collects the detail of expense items as in Design A while providing a household profile for six months. To do both, it uses a more complex sample design and employs modeling, collecting different information from different households.
The panel wishes to state clearly that evidence on how well each of the proposed prototypes would work is missing. The process of selecting a prototype or components of a prototype should be based not only on BLS’ prioritization of goals, but also on empirical evidence that the proposed procedures can meet those goals.
Recommendation 6-3: After a preliminary prioritization of goals of the new CE, the panel recommends that BLS fund two or three major feasibility studies to thoroughly investigate the performance of key aspects of the proposed designs. These studies will help provide the empirical basis for final decision making.
The panel offers the following recommendation that should be viewed in the context of BLS’ prioritization of the CE goals.
Recommendation 6-4: A broader set of nonexpenditure items on the CE that are synchronized with expenditures will greatly improve the quality of data for research purposes as well as the range of important issues that can be investigated with the data. The BLS should pay close attention to these issues in the redesign of the survey.
All three designs feature tablet computers with wireless phone cards as an essential ingredient. The report offers guidelines on the development and use of tablets in data collection, but stresses the untested assumptions that must be addressed before proceeding with using this tool. The panel also recognizes some households will need paper instruments. These instruments need to be redesigned to align with the tablets for multimode collection.
Recommendation 6-5: A tablet computer should be utilized as a tool in supported self-administration. However, a paper option should continue to be available for respondents who cannot or will not use a tablet computer. Visual design principles should be applied to redesigning the paper instrument in a way that improves the ease of self-administration and is aligned with the tablet modules.
The panel presents a general roadmap for BLS to follow to complete the redesign of the CE. First, it recommends BLS develop a targeted and tightly focused plan to achieve a redesign within the next five years, a roadmap that should be completed and made public within six months. The Gemini Project is in place to do this.
Recommendation 6-6: BLS should develop a preliminary roadmap for redesign of the CE within six months. This preliminary roadmap would include a prioritization of the uses of the CE, an articulation of the basic CE design alternative that is envisioned with the redesign, and a listing of decision points and highest priority research efforts that would inform those decisions.
Another key element of the prototypes is the use of incentives to motivate respondents to complete data collection and provide accurate data. The panel recommends an appropriate incentive program be a fundamental part of the future CE program. The report provides guidelines for developing an incentive structure, but the details can only be determined with appropriate CE-specific research.
Recommendation 6-7: A critical element of any CE redesign should be the use of incentives. The incentive structure should be developed, and tested, based on careful consideration of the form, value, and frequency of incentives. Serious consideration should be given to the use of differential incentives based on different levels of burden and/or differential response propensities.
The panel had numerous discussions about alternative data sources as a replacement for, or adjunct to, collecting survey data. Although the use of such information at the aggregate or the micro (respondent/household) level holds great promise, the panel also recognized that such use is accompanied by risk, particularly from a cost/quality tradeoff perspective. A serious concern is the continued availability of outside sources over time. The panel decided not to recommend specific external datasets in its three prototypes. However, it encourages BLS to continue to explore such alternative data sources.
Recommendation 6-8: BLS should pursue a long-term research agenda that integrates new technology and administrative data sources as part of a continuous process improvement. The introduction of these elements should create reductions in data collection and processing costs, measurement error, and/or the statistical variance and complexity of the CPI estimate. The agenda should address the robustness of new technology and a cost/quality/risk trade-off of using external data.
The panel points to the value of a strong internal BLS research staff. It recommends further development and expansion of their research capabilities in order to respond to the rapidly changing contextual landscape for conducting national surveys.
Recommendation 6-9: BLS should increase the size and capability of its research staff to be able to effectively respond to changes in the contextual landscape for conducting national surveys and maintain (or improve) the quality of survey data and estimates. Of particular importance is to facilitate ongoing development of novel survey and statistical methods, to build the capacity for newer model-assisted and model-based estimation strategies required for today’s more complex survey designs and nonsampling error problems, and to build better bridges between researchers, operations staff, and experts in other organizations that face similar problems.
Facing the demands of the immediate redesign of the CE and use of tablet computers, the panel recommends BLS find additional expertise through outside experts and organizations.
Recommendation 6-10: BLS should seek to engage outside experts and organizations with experience in combining the development of tablet computer applications along with appropriate survey methods in developing such applications.
Finally, as described above, all three prototypes propose procedures and techniques that have not been researched, designed, and tested. The prototypes are contingent upon new research undertakings. Much relevant background theory and research exist, for which the BLS research program and Gemini Project deserve praise. However, they do not provide enough specific answers for these new options. Considerable investment
must be made in researching elements of the proposed designs, to find specific procedures that are not only workable, but also most effective. These prototypes are not operationally ready—much targeted research needs to be done.
Recommendation 6-11: BLS should engage in a program of targeted research on the topics listed in this report that will inform the specific redesign of the CE.
Recommendation 6-12: BLS should fund a “methods panel” (a sample of at least 500 households) as part of the CE base, which can be used for continued testing of methods and technologies. Thus the CE would never again be in the position of maintaining a static design with evidence of decreasing quality for 40 years.
In summary, the CE plays an extremely important role in helping to understand the consumption patterns of American households and in targeting critical policies and programs more appropriately. The current CE design has been in place for four decades, and change is needed. The change should begin with BLS prioritizing the many uses of the CE so a new design can most efficiently and effectively target those priorities. The panel offers three prototype designs and considerable guidance in moving toward that ultimate redesign.