5 Why Redesign the CE?

Today, all household surveys, including the Consumer Expenditure Surveys (CE), face well-known challenges. These challenges include maintaining adequate response from increasingly busy and reluctant respondents. In addition, more and more households are non-English speaking, and a growing number of higher-income households have controlled-access residences. Call screening and cell-phone-only households have made telephone contacts on surveys more difficult. Today’s household surveys face confidentiality and privacy concerns, a public growing more suspicious of its government, and competition from an increasing number of private as well as government surveys vying for the public’s attention (Groves, 2006; Groves and Couper, 1998).

In the midst of these challenges for household surveys, the CE surveys stand out as particularly long and arduous. In the Interview survey, recall of many types of expenditures is likely to be imperfect. A typical respondent lacks records or at least the motivation to use them in answering the CE questions. The level of detail that is required in describing each purchase is daunting. In the Diary survey, respondents are asked to record the details of many small purchases in a complicated booklet. These demands can easily result in limited compliance and the omission of expenditures.

Further exacerbating the problem, the CE faces the additional challenge that consumer spending has changed dramatically over the past 30 years, and it continues to change (Fox and Sethuraman, 2006; Kaufman, 2007; Sampson, 2008). When the CE was designed in the 1970s, there was no online shopping or options for electronic banking and bill paying. Over that time, shopping patterns have shifted from individual purchases at a variety of neighborhood stores to collective purchasing at “big box” stores such as Walmart, Target, and Costco that sell everything from meat to shirts, furniture, and motor oil under one roof. The CE surveys are cognitively designed to collect spending information based on the 1970s world of purchasing behaviors, and today’s consumers are unlikely to relate to that.

Underreporting of expenditures is a long-standing problem with the CE as evidenced by a growing deviation from other data sources and by the results of several studies. This underreporting appears to differ sharply across commodities, raising the possibility of differential biases in the Consumer Price Index (CPI) and the picture of the composition of household spending. This is the biggest concern with the CE program.

The Panel on Redesigning the BLS Consumer Expenditure Surveys believes that there are a number of issues with the current design and implementation of the CE, and that collectively these problems lead to the underreporting of expenditures. This chapter documents this underreporting and then discusses the issues and concerns that the panel identified in its study of the CE.

With that said, the panel understands that no survey is perfect. In fact all surveys are compromises between the need for specific data, the quality with which those data can be collected, and the burden and costs required to do so. The CE is no exception. It is the panel’s expectation that by examining the issues with the current CE along with some alternative designs, a new and better balance can be found between data requirements, data quality, and burden.

EVIDENCE OF UNDERREPORTING IN THE CE

In many federal surveys, one can assess the quality of data by comparisons with other sources of information. One of the difficulties in evaluating the quality of CE data is that there is no “gold standard” with which to compare the estimates. However, several sources provide insight into data quality. The National Research Council, in its review of the conceptual and statistical issues with the CPI, expressed concern about potential bias in the expenditure estimates from the CE. That report recommended comparison of the CE estimates with those from the Bureau of Economic Analysis’s Personal Consumption Expenditures (PCE):

The panel’s foremost concern is with the extent of bias in the CEX [Consumer Expenditure Surveys] which, in turn affects the accuracy of CPI expenditure category budget shares. A starting point for evaluating household expenditure allocations estimated by the CEX is to compare them against budget shares generated by other sources. The Bureau of Economic Analysis (BEA) produces the most obvious alternative, the per-capita and product accounts (NIPA). (National Research Council, 2002, p. 253)
MEASURING WHAT WE SPEND

Comparisons Between the CE and PCE

Compatibility

A long literature has focused on the discrepancy between the CE and PCE data from the National Income and Product Accounts (NIPA) (Attanasio, Battistin, and Leicester, 2006; Branch, 1994; Garner, McClelland, and Passero, 2009; Garner et al., 2006; Gieseman, 1987; Meyer and Sullivan, 2011; Slesnick, 1992). However, in comparing the CE to the PCE data, it is important to recognize conceptual incompatibilities between these data sources. Slesnick (1992, p. 22), when comparing CE and PCE data from 1960 through 1989, concluded that “approximately one-half of the difference between aggregate expenditures reported in the CEX [CE] surveys and the NIPA can be accounted for through definitional differences.” Similarly, the General Accounting Office (1996, p. 15), now the U.S. Government Accountability Office, in a summary of a Bureau of Economic Analysis comparison of the differences in 1992, reported that “more than half was traceable to definitional differences.”

Thus, a key conceptual difference between the CE and PCE is “what is measured.” The CE measures out-of-pocket spending by households, while the PCE definition is wider, including purchases made on behalf of households by institutions. The CE is not intended to capture purchases by households abroad such as those on military bases, whereas the PCE includes these purchases. These differences are important and growing over time. Imputations including those for owner-occupied housing and financial services, but excluding purchases by nonprofit institutions serving households and employer contributions for group health insurance, now account for over 10 percent of the PCE. In-kind social benefits account for nearly another 10 percent. Employer contributions for group health insurance and workers’ compensation account for over 6 percent, while life insurance and pension fund expenses and final consumption expenditures of nonprofits represent almost 4 percent. McCully (2011) reported that in 2009 nearly 30 percent of the PCE was out-of-scope for the CE, up from just over 7 percent in 1959.

Another important conceptual difference between the CE and PCE is the underlying data and how the estimates are constructed. Chapter 3 of this report describes the CE surveys in some detail. In comparison, the PCE aggregates come from data on the production of goods and services, rather than consumption or expenditures by households. The PCE depends on multiple sources, primarily from business records reported on the economic censuses and other Census Bureau surveys. The PCE numbers are the product of substantial estimation and imputation processes that have their own error profiles. Estimates from these business surveys are adjusted using input-output tables to add imports and subtract sales that do not go to domestic households. These totals are then balanced to control totals for income earned, retail sales, and other benchmark data (Bureau of Economic Analysis, 2010, 2011a,b).

One indicator of the potential error in the PCE is the magnitude of the revisions that are made from time to time (Gieseman, 1987; Slesnick, 1992). A recent example is the 2009 PCE revisions, which substantially revised past estimates of several categories. Food at home, one of the largest categories, decreased by over 5 percent after the 2009 revision.[1]

[1] The 2008 value for food at home was $741,189 (in millions of dollars) prior to revision and $669,441 after, but the new definition excludes pet food. A comparable pre-revision number excluding pet food is $707,553. The drop from $707,553 to $669,441 is 5.4 percent. Appreciation is given to Clinton McCully (BEA) for clarifying this revision.

Some authors have argued that despite the incompatibilities between the CE and PCE, the differences between the series should be expected to be relatively constant (Attanasio et al., 2006). While a plausible conclusion, a gradual widening of the difference between the sources could still be expected given their growing incompatibility, as reported in McCully (2011) and Moran and McCully (2001).

Comparisons

Gieseman (1987) conducted one of the first evaluations of the current CE, comparing the CE to the PCE for 1980–1984.[2] He found that the CE reports were close to the PCE for rent, fuel and utilities, telephone services, furniture, transportation, and personal care services. On the other hand, substantially lower reporting in the CE for food, household furnishings, alcohol, tobacco, clothing, and entertainment was apparent back in 1980–1984. The current patterns have strong similarities to those from 30 years ago.

[2] Comparisons of consumer expenditure survey data to national income account data go back at least to Houthakker and Taylor (1970). The issues were also addressed in a long series of articles comparing the CPI to the PCE deflators by Bunn and Triplett (1983) and Triplett and Merchant (1973).

Garner et al. (2006) reported a long historical series of comparisons for the integrated data that begins in 1984 and goes up through 2002. Some categories compare well. Rent, utilities, and fuels and related items are reported at high and stable levels relative to the PCE. Telephone services, vehicle purchases, and gasoline and motor oil are reported at high levels (compared to the PCE) but have declined somewhat over time. Food at home relative to the PCE is about 0.70, but has remained stable over time. The many remaining categories of expenditures are reported at low levels relative to the PCE, though some small categories such as footwear and vehicle rentals show relative increases.
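The 5.4 percent figure in the footnote on the 2009 PCE revision of food at home can be checked with a short calculation. The sketch below is illustrative only: the dollar values (in millions) are copied from that footnote, while the function and variable names are assumptions introduced here and do not come from BEA or BLS.

```python
def percent_drop(before: float, after: float) -> float:
    """Percentage decrease from `before` to `after`."""
    return (before - after) / before * 100.0

# 2008 food at home, in millions of dollars, as quoted in the footnote.
PRE_REVISION_PUBLISHED = 741_189    # includes pet food
PRE_REVISION_COMPARABLE = 707_553   # pet food removed to match the new definition
POST_REVISION = 669_441

# Like-for-like comparison: both values exclude pet food.
like_for_like = percent_drop(PRE_REVISION_COMPARABLE, POST_REVISION)

# Unadjusted comparison: mixes the old and new definitions.
unadjusted = percent_drop(PRE_REVISION_PUBLISHED, POST_REVISION)

print(f"Like-for-like drop: {like_for_like:.1f} percent")
print(f"Unadjusted drop:    {unadjusted:.1f} percent")
```

Comparing the two results shows why the footnote first removes pet food from the pre-revision value: without that adjustment, part of the apparent decline would reflect the change in definition rather than the revision of the estimate.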
Garner et al. (2006) ultimately argued that this comparison should focus on expenditure categories whose definitions are the most comparable between the CE and PCE, noting “a more detailed description of the categories of items from the CE and the PCE is utilized than was used when the historical comparison methodology was developed. Consequently, more comparable product categories are constructed and are included in the final aggregates and ratios used in the new comparison of the two sets of estimates” (Garner et al., 2006, p. 22). The new series provides comparisons every five years from 1992 to 2002 (Garner et al., 2006), and was updated and extended annually through 2007 in Garner, McClelland, and Passero (2009).

When using comparable categories and when the PCE aggregates are adjusted to reflect differences in population coverage between the two sources, the ratio of total expenditures on the CE to PCE is fairly high but still decreases over time. The ratio for 1992 and 1997 was 0.88, while in 2002 it was 0.84 and by 2007 had fallen to 0.81 (Garner, McClelland, and Passero, 2009). Figure 5-1 shows the time pattern for the ratio of CE to PCE spending for comparable categories over 2003–2009. The above discussion highlights that it is easy to overstate the discrepancy between the CE and the PCE by comparing all categories, rather than restricting the comparison to categories with comparable definitions (Passero, 2011).

[FIGURE 5-1 Coverage of comparable spending between the CE and PCE. The chart plots the CE to PCE ratio (vertical axis, 0.5 to 1) by year, 2003 to 2009, for Total, Durables, Nondurables, and Services. NOTE: CE = Consumer Expenditure Surveys, PCE = Personal Consumption Expenditures. SOURCE: Passero (2011).]

Separate Comparison of the Interview Survey Estimates and the Diary Survey Estimates with the PCE

It is also important to look at comparability with the PCE of estimates from the Interview survey and Diary survey separately. Gieseman (1987) reported separate comparisons of the Interview survey and Diary survey estimates to PCE estimates for food because these were the only estimates available from both surveys.[3] He found that Interview food at home exceeded Diary food at home by 10 to 20 percentage points, but was still below the PCE. For what was then a much smaller category, food away from home, the Diary aggregate exceeded the Interview aggregate by about 20 percentage points. Again, the CE numbers were considerably lower than the PCE ones.

[3] In these early years, BLS published separate tables for Interview and Diary data. In recent years, tables have been published with only integrated data. Consequently, subsequent comparisons of CE to PCE almost exclusively rely on the integrated data that combine Interview survey and Diary survey data. In cases where the expenditure category is available in both surveys, the BLS selects the source for the integrated data that is viewed as most reliable. See Creech and Steinberg (2011) and Steinberg et al. (2010).

It is not surprising that the Interview and Diary surveys yield different estimates, given the different approaches to data collection, including a different form of interaction with the respondent household. These differences make it likely that estimates from the two surveys, as currently configured, will differ, as discussed in more detail later in this chapter.

Bee, Meyer, and Sullivan (2012) looked further at comparing the estimates from both surveys separately to the PCE. The authors examined estimates for 46 expenditure categories for the period 1986–2010 that are comparable to the PCE for one or both of the CE surveys. Table 5-1 shows the 10 largest expenditure categories for which these separate comparisons can be made, showing ratios of the CE to PCE for these categories. Among these categories, six (imputed rent on owner-occupied nonfarm housing, rent and utilities, food at home, gasoline and other energy goods, communication, and new motor vehicles) are reported on the CE Interview survey at a high rate (relative to the PCE) and have been roughly constant over time. These six are all among the eight overall largest expenditure categories. In 2010, the ratio of CE to PCE exceeded 0.94 for imputed rent, rent and utilities, and new motor vehicles. It exceeded 0.80 for food at home and communication and is just below this number for gasoline and other energy goods. In contrast, no large category of expenditures was reported at a high rate (relative to the PCE) in the Diary survey that was also higher than the equivalent rate calculated from the Interview survey. Reporting of rent and utilities is about 15 percentage points higher in the Interview survey than the Diary survey. Food at home is about 20 percentage points higher in the Interview survey.[4] Gasoline and other energy goods are about 5 percentage points higher in the Interview survey and communication is about 10 percentage points higher. The 2010 ratios for food away from home and furniture and furnishings are close to a half for both the Interview and Diary surveys. For clothing and alcohol, the Diary survey ratios are below 0.50, but the Interview survey ratios are even below those for the Diary survey.

[4] There is some disagreement about how to interpret the fact that food at home from the CE Interview survey compares more favorably to PCE numbers than does food at home from the CE Diary survey. Some have argued that the CE Interview survey numbers may include nonfood items purchased at a grocery store. Battistin (2003) argued that the higher reporting of food at home for the recall questions in the Interview component is due to overreporting, but Browning, Crossley, and Weber (2003) stated that this is an open question.

TABLE 5-1 CE/PCE Comparisons for the 10 Largest Comparable Categories, 2010 Ratios

PCE Category                                                PCE ($ millions)   Diary to PCE   Interview to PCE
Imputed rental of owner-occupied nonfarm housing                   1,203,053            n/a              1.065
Rent and utilities                                                   668,759          0.797              0.946
Food at home                                                         659,382          0.656              0.862
Food away from home                                                  545,579          0.519              0.506
Gasoline and other energy goods                                      354,117          0.725              0.779
Clothing                                                             256,672          0.487              0.317
Communication                                                        223,385          0.686              0.800
New motor vehicles                                                   178,464            n/a              0.961
Furniture and furnishings                                            140,960          0.433              0.439
Alcoholic beverages purchased for off-premises consumption           106,649          0.253              0.220

NOTE: CE = Consumer Expenditure Surveys, PCE = Personal Consumption Expenditures. n/a = no ratio shown in the source table.
SOURCE: Bee, Meyer, and Sullivan (2012).

The panel next looked at smaller expenditure categories that are comparable between the PCE and the CE. Of the 36 such categories, only six in the Interview and five in the Diary have a ratio of at least 0.80 in 2010. In the Diary survey, household cleaning products and cable and satellite television and radio services were reported with a high rate (comparable to the PCE). Household cleaning products had a ratio (relative to the PCE) of 1.15 in 2010 in the Diary survey; the ratio has not declined appreciably in the past 20 years. The largest of these categories reported with a high rate (comparable to the PCE) in the Interview survey were motor vehicle accessories and parts, household maintenance, and cable and satellite television and radio services. The remaining categories were reported at low rates (compared to the PCE) in both surveys with ratios below one-half. These include glassware, tableware, and household utensils and sporting equipment. Gambling and alcohol had especially low ratios, below 0.20 and 0.33, respectively, in both surveys in most years.

Summary of Comparisons with the PCE

The overall pattern indicates that the estimates for larger items from the CE are closer to their comparable estimates from the PCE. The current Interview survey estimates these larger items more closely to the PCE than does the current Diary survey. For the 36 smaller categories, neither the Interview survey nor the Diary survey consistently produces estimates that have a high ratio compared to the PCE.

The categories of expenditures that had a low rate (compared to the PCE) tended to be those that involve many small and irregular purchases, categories of goods for specific family members (clothing), and categories for which individuals might want to underestimate purchases (alcohol, tobacco). Large salient purchases (like automobiles), and regular purchases (like rent and utilities) for which the Interview survey was originally designed, seem to be well reported. These patterns have been largely evident since the 1980s or even earlier. However, over the past three decades, there has been a slow decline in the level of reporting of many of the mostly smaller categories of expenditures in both the Interview survey and the Diary survey.

Similar results are reported from Canada. Statistics Canada’s consumption survey was redesigned with both a recall survey and diary, with partial implementation in 2009. The level of expenditures from the diary was found to be 14 percent less than the recall interview for less frequent expenses and 9 percent less for frequent expenditures. Incomplete diaries contributed to the underestimation, given that 20 percent of diary days were “nonresponded” days (Dubreuil et al., 2011).
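The percentage-point gaps quoted in the discussion of Table 5-1 (about 15 for rent and utilities, 20 for food at home, 5 for gasoline and other energy goods, and 10 for communication) can be recomputed from the table's 2010 ratios. The following is a minimal sketch, assuming the ratios as printed in Table 5-1 for the eight categories that have both a Diary and an Interview value; the dictionary layout and the function name are illustrative and are not taken from Bee, Meyer, and Sullivan (2012).

```python
# 2010 CE-to-PCE ratios copied from Table 5-1 as (Diary, Interview).
# Only categories with a value for both surveys are included.
TABLE_5_1 = {
    "Rent and utilities": (0.797, 0.946),
    "Food at home": (0.656, 0.862),
    "Food away from home": (0.519, 0.506),
    "Gasoline and other energy goods": (0.725, 0.779),
    "Clothing": (0.487, 0.317),
    "Communication": (0.686, 0.800),
    "Furniture and furnishings": (0.433, 0.439),
    "Alcoholic beverages (off-premises)": (0.253, 0.220),
}

def interview_minus_diary(ratios):
    """Return the Interview-minus-Diary gap, in percentage points, per category."""
    return {
        category: round((interview - diary) * 100, 1)
        for category, (diary, interview) in ratios.items()
    }

gaps = interview_minus_diary(TABLE_5_1)
for category, gap in gaps.items():
    # A positive gap means the Interview survey ratio is the higher of the two.
    print(f"{category}: {gap:+.1f} percentage points")
```

A positive gap means the Interview survey reports a larger share of the PCE total than the Diary survey does; a negative gap, as for clothing and alcoholic beverages, means the Diary ratio is the higher of the two.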
The panel reiterates that there are many differences between the CE and the PCE, and it does not consider the PCE to be truth. Nevertheless, the most extensive benchmarking of the CE is to the PCE, so these results are informative. Furthermore, when separate comparisons of the Interview survey and the Diary survey to the PCE are available, the comparisons provide an indication of the possible degree of relative underreporting in the two surveys.

Comparisons Between the CE and Other Sources

There have been comparisons of the CE to a number of other sources. Most are summarized on the BLS Comparisons Web page.[5] These comparisons include, but are not limited to: utilities compared to the Residential Energy Consumption Survey (RECS); food at home compared to trade publications Supermarket Business and Progressive Grocer; and health expenditures compared to the National Health Expenditure Accounts (NHEA) and the Medical Expenditure Panel Survey (MEPS). Some of the findings are presented below.

[5] See http://www.bls.gov/cex/cecomparison.htm.

The CE’s estimates for utilities are compared to those generated by the RECS. The populations of households from these two surveys are not identical but are fairly consistent. The RECS collects most information on utilities directly from utility companies after obtaining permission from the sampled households. Between 2001 and 2005, the CE estimates of total expenditures for residential energy were between 7 and 9 percent higher than from the RECS. When the energy source was broken down, the CE was higher for electricity and natural gas, while lower for the smaller category of fuel oil and LP gas.

In 2007, the CE’s estimate for total health expenditures was 67 percent of the total out-of-pocket health expenditures estimated from the NHEA. The NHEA is based on a broader population definition than is the CE, and the differences between its estimates and the CE may be affected by the population differences plus the concepts, context, and scope of data collection. When compared to the MEPS, the CE estimates were lower for total health expenditures, with comparison ratios similar to those of the NHEA.

Comparisons were made between total food at home from the CE with grocery trade association data from Supermarket Business and Progressive Grocer. During the 1990s, the CE estimate was consistently between 10 percent and 20 percent higher than the trade association data.

Summary of Comparisons with Other Sources

The panel was not charged with evaluating the error structure of the PCE or other relevant sources of administrative data. However, the above analysis provides important background for making decisions about the CE redesign. It indicates that the concerns about underreporting of expenditures in both the CE Diary and CE Interview surveys are warranted. For many uses of the CE, any underreporting is problematic. However, for the use in calculating CPI budget shares, the differential underreporting that is strongly indicated by these results, and discussed in more detail on p. 105 of Chapter 5, “Disproportionate Nonresponse,” is especially problematic.

In principle, an attentive, motivated respondent could report a particular expenditure (a pound of tomatoes for a certain price) concurrently with better accuracy than in a recall survey. This potential is not evident from the estimates of aggregate spending obtained from the current designs of the CE Interview and Diary surveys. The above analysis indicates that there are issues with both the CE Diary and CE Interview surveys, leading to the need for them to be assessed and redesigned. As a result, the panel reached this conclusion:

Conclusion 5-1: Underreporting of expenditures is a major quality problem with the current CE, both for the Diary survey and the Interview survey. Small and irregular purchases, categories of goods for specific family members, and items that may be considered socially undesirable (alcohol and tobacco) appear to suffer from a greater percentage of underreporting than do larger and more regular purchases. The Interview survey, originally designed for these larger categories, appears to suffer less from underreporting than does the Diary survey in the current design of these surveys.

MEASUREMENT DIFFERENCES BETWEEN THE INTERVIEW AND DIARY

Before examining potential sources of response errors in the Interview survey and Diary survey separately, this section considers whether these two independent surveys, as currently designed, are inherently comparable in the information that each collects. In the section above, the panel raised its concern about basic comparability of expenditure categories when comparing to the PCE. Here, the report explores another aspect of comparability.

It is important to remember the purposes for which the two surveys were originally designed. The Diary was designed to gather information on the myriad of frequent, small purchases made on a daily basis. These items include food for home consumption and other grocery items such as household cleaning and paper products. The Diary also is the source of expenditures for some clothing purchases, small appliances, and relatively inexpensive household furnishings, as well as the source of estimates on food away from home. The Interview, on the other hand, was designed to produce estimates for regular monthly expenditures like rent and utilities. It was designed to capture major expenditures, including those for large appliances, vehicles, major auto repair, furniture, and more expensive clothing items. Given the very different purposes of the two surveys, it is not surprising that they have entirely different designs and, hence, different problems.

Differences in Questions, Context, and Mode

A broad base of literature in survey research has identified many factors that can independently affect the accuracy of answers to survey questions. Some of the most important include the following:
• Different question wording is likely to produce different responses (Groves et al., 2004).
• The context in which questions are asked (for example, the purpose of the survey as it is explained to the respondent and the order in which questions are asked) influences what respondents will report (Tourangeau and Smith, 1996; Tourangeau, Rips, and Rasinski, 2000).
• Survey mode influences answers. For example, the literature demonstrates that in-person interviews are more likely to produce socially desirable answers that put the respondent in a more favorable light (Dillman, Smyth, and Christian, 2009).
• For self-administered diaries, the visual layout and design can have a dramatic effect on respondent answers (Christian and Dillman, 2004; Tourangeau, Couper, and Conrad, 2007).

These influences are realized as respondents go through the well-established cognitive process of comprehending the question and concluding what they are being requested to do, retrieving relevant information for formulating an answer, deciding which information is appropriate and adequate, and reporting. It is well documented that errors may occur at each of these stages (Tourangeau, Rips, and Rasinski, 2000).

As noted earlier, Bee, Meyer, and Sullivan (2012) found that the level of reported expenditures for certain purchases is consistently different in the Interview survey and the Diary survey. Although the Interview survey generally yields larger expenditures, these differences are not consistently in the same direction. For example, food purchased away from home, payments for clothing and shoes, and purchases for alcoholic beverages are greater from the Diary. Expenditures for rent and utilities, food at home, and gasoline and other energy goods are larger from the Interview. Some argue that larger is simply more accurate, but that may not be the case. The panel has not said that either approach or type of question is inherently better or worse. However, it is appropriate to illuminate these differences more closely.

Different questions are asked in the Interview and the Diary surveys, and these different questions are also asked in different survey contexts. To illustrate this, consider the category of food and drink at home to see how each survey collects this information. This is one of the categories for which the Diary was designed.

The Interview survey asks the following questions:

• What has been your or your household usual WEEKLY expense for grocery shopping? (Include grocery home-delivery service fees and drinking water delivery fees.)
OCR for page 98
98 MEASURING WHAT WE SPEND General Instructions Fill out this diary for an entire week, writing down EVERYTHING you and the people on your list spend money on each day – the products you buy, the services you use, the household expenses you have during the week – no matter how large or small they are. We recommend that you record your expenses each day. Think about where you went and what you’ve done. Talk to the people on your list every day to find out how they spent their money. Include payments by: Cash Credit/Debit Card Automatic Withdrawal/Payroll Deduction Check Money Order Store Charge Card Food Stamps WIC Voucher Grocery Certificate Keep receipts and other records so that you will remember to record what you bought or paid for. Use the pocket at the back of the diary to store them. Some record types include: Receipts Bank Statements Catalog/Internet Order Invoices Utility Bills Telephone Bills Credit Card Statements Pay Stubs Include items that you bought for people who are not on your list, such as gifts. Refer to the flap Refer to the flap attached to the attached to the front cover for back cover for answers to Examples of Expenses. Frequently Asked Questions. Do NOT record: ♦ Expenses of people on your list while they were away from home overnight. ♦ Business or farm operating expenses ♦ Sales tax for: Part 2. Food and Drinks for Home Consumption Part 3. Clothing, Shoes, Jewelry, and Accessories Part 4. All Other Products, Services, and Expenses 2 FORM CE-801 (7-1-2005) FIGURE 5-5 Diary booklet, pageFig5-5.eps 2. SOURCE: Bureau of Labor Statistics.
OCR for page 99
WHY REDESIGN THE CE? 99 Recording Expenditures May Be Problematic The diary instructions focus on recording expenditures each day during the diary week, separately for different categories. A total of 28 pages of the booklet are laid out by “day” and consist of labeled tables for recording household expenditures made in each of four categories: 1. Food and Drinks Away from Home 2. Food and Drinks for Home Consumption 3. Clothing, Shoes, Jewelry, and Accessories 4. All Other Products, Services, and Expenses Nine additional pages are included for reporting any information that will not fit on the individual day pages. The diary-keeper is to record each expenditure on the correct page by day and expenditure category. For each item recorded, the form asks for a description of the item, the cost, plus additional information that differs for each of the four expenditure categories. For example, the tables for Cloth- ing, Shoes, Jewelry, and Accessories ask for additional information on the individual for whom the item was purchased: gender, age, and whether the person was a member of the household. The tables for Food and Drinks Away from Home also ask where the item was purchased, whether the item included alcoholic beverages, and a breakout of the cost of any such alcohol. Research has shown that respondents are influenced by much more than words on how to complete questionnaires; a mostly linear path with guidance from numbers, graphics, and symbols also helps to instruct re- spondents on how a questionnaire (or diary) is designed to be completed (Dillman, Smyth, and Christian, 2009). The current in-person delivery and retrieval process is designed to compensate for some of these problems. The field representative may look at the receipts collected by the household respondent and ask other questions to find out whether all expenditures have been recorded. 
Realization at the initial visit that the interviewer will call mid-week and return to pick up the diary at the end of the week would seem to encourage respondents to think about their daily expenditures and be able to recall and report them at the end of the diary week contact. However, this mitigation process is not always followed.

Other general problems can occur with the diary. Some shopping trips require complex reporting of many and varied items on the diary forms. For example, a major grocery-shopping trip may take considerable time and effort to record. Each item purchased must be itemized separately with a description and cost. (The respondent has to figure out whether to record before or after the food is put away.) Receipts are often limited to abbreviations and codes that may not be understandable to the person who made the purchase and/or is completing the diary.

Part of the diary placement visit is for the field representative to “size up” the respondent as to whether he or she understands how to complete the diary and seems committed to do so. If the respondent does not appear to understand the instructions and the use of the daily recording forms, some field representatives will revert to an alternative approach and ask such a respondent to merely keep all of the household receipts for the week’s expenditures in a pocket of the inside back cover. On the next visit a week (or two weeks) later, the field representative and the respondent will go through the receipts and fill in the diary forms together. The panel learned that this approach is likely used for a significant number of households in which the respondent finds keeping the diary too difficult to do.

Conclusion 5-13: It is likely that the current organization of recording expense items by “day of the week” makes it more difficult for some respondents to review their diary entries and assess whether an expenditure has been missed.

Reporting Period for the Diary Survey

The Diary survey has a one-week reporting period, followed immediately by a second wave also consisting of a one-week reporting period. The Diary survey was conceived as a vehicle for collecting smaller and frequently purchased items that were unlikely to be reported accurately over a three-month recall period. However, in practice, the Diary collects a wide variety of expenditure items. Since many types of expenditures are made infrequently, and others are not purchased in the same amount each week, Diary expenditure estimates for these variables are likely to be more variable than those from the Interview survey with its three-month reporting period.
For example, Bee, Meyer, and Sullivan (2012) found that for 2010 the weighted average coefficient of variation of spending reports on 35 categories of expenditures common to both the Interview and Diary was nearly 60 percent higher for a typical Diary response than for a typical Interview response.⁹ One reason is that, in 2010, close to 10 percent of weekly diaries that were considered as valid observations reported no in-scope spending at all. (About 75 percent of these reports were out-of-scope because the family was on a trip for the week.) Consequently, a larger number of weekly diaries is required to equal the precision of the quarterly interviews.

⁹ Their comparisons adjusted for the different sample sizes of the Diary and Interview surveys. The coefficient of variation is the standard error of a mean or other statistic expressed as a percentage of the statistic.
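
The precision comparison above can be made concrete with a small calculation. The following Python sketch computes a coefficient of variation as defined in footnote 9 and then estimates how many Diary reports would be needed to match the precision of one Interview report. It rests on a simplifying assumption that is ours, not the cited authors': reports are independent, so the coefficient of variation of a mean shrinks with the square root of the number of reports. The function names and the input figures are hypothetical, and the ratio of 1.6 stands in for "nearly 60 percent higher."

```python
def coefficient_of_variation(standard_error: float, estimate: float) -> float:
    """Standard error expressed as a percentage of the estimate (footnote 9)."""
    if estimate == 0:
        raise ValueError("CV is undefined when the estimate is zero")
    return 100.0 * standard_error / abs(estimate)


def reports_needed_for_equal_precision(cv_ratio: float) -> float:
    """Relative number of reports needed to offset a higher per-report CV.

    Assumption: reports are independent, so the CV of a mean falls as
    1/sqrt(n) and the required multiple is the square of the CV ratio.
    """
    if cv_ratio <= 0:
        raise ValueError("cv_ratio must be positive")
    return cv_ratio ** 2


# Hypothetical figures for illustration only (not CE estimates).
cv = coefficient_of_variation(standard_error=12.0, estimate=400.0)
factor = reports_needed_for_equal_precision(1.6)
print(f"CV = {cv:.1f}%, reports needed = {factor:.2f}x")
```

Under these assumptions, a coefficient of variation that is 60 percent higher implies roughly two and a half weekly diaries per quarterly interview for equal precision. The actual multiple would depend on design features this sketch ignores, such as correlation between a household's two diary weeks.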
A related conceptual issue is that the reference period in the current Diary survey may be too short to accurately measure an individual’s normal spending pattern. While these errors may average out in the calculation of means, several important uses of the CE require the measurement of the distribution of spending.

Proxy Reporting in the Diary Survey

Respondents are asked to consult with other members of the household during the week and to report expenditures for all members. The field representative lists the names of the members on the inside cover of the diary. These instructions are aimed at encouraging communication between the person who agrees to complete the diary and others in order to facilitate accurate reporting. As stated earlier, the diary mode does provide more opportunity to confer with other household members over the week than there is within a rushed recall interview. However, there are still issues with proxy reporting in today’s households.

The changing structure of U.S. households, in which the adults in that household are more likely to have separate incomes and expenditure patterns, means that unless deliberate communication occurs, expenditures may be underreported. In addition, household members are not always open with each other about what they have purchased or how much it cost. If a member of the household does not want the person most responsible for completing the diary to know of the expenditure or its cost (e.g., a teenager downloading a new video game, the cost of an anniversary gift, or payment of a parking ticket), the diary will probably miss the expense.

Conclusion 5-14: Although the diary protocol encourages respondents to obtain information and record expenditures by other household members during the two weeks, it is unclear how much of this happens.
NONRESPONSE

Comparison of Response Rates

In calculating response rates on the CE Interview survey, BLS uses outcome information from each household for each wave (waves two through five) as independent observations in the survey. For the Diary survey, BLS counts each week of the two weeks of diary reporting by a household as an independent observation. The “CE program defines the response rate as the percent of eligible households that actually are interviewed for each survey” (Johnson-Herring and Krieger, 2008, p. 21). These calculations exclude cases where the household is ineligible.
BLS sometimes refers to the rates described by Johnson-Herring and Krieger as the “collection rates” since they are computed by the Census Bureau immediately following data collection. In post-collection processing, BLS removes some data records because they are found to have limited data entries and expenditure dollar amounts. BLS then recalculates the response rates, and sometimes refers to these adjusted rates as “estimation rates.” Johnson-Herring and Krieger do not discuss these adjustments in their description of methodology. In comparing the “collection rates” with the “estimation rates” one sees that the adjustments affect the Diary rates more than the Interview rates. BLS generally provides the “estimation rates” as the response rates to its microdata users. After some consultation with BLS, the panel has concluded that the adjusted “estimation rates” more closely describe the usable response, and has decided to use those as the response rates for the purpose of this report. Thus, as reported in Chapter 3, the CE Interview survey had a response rate (estimation rate) in 2010 of 73 percent, slightly ahead of the Diary survey, which had a response rate of 72 percent.

Both surveys have experienced declines in response rates over time, a problem that has plagued most government household surveys. The response rates (estimation rates) for the Diary survey have been slightly lower than those for the Interview survey (see Figure 5-6). The CE is a burdensome survey, and the overall response rate is lower than several well-known but less burdensome surveys such as the Current Population Survey (92%) and the CPS Annual Demographic Survey (80% to 82%).

[Figure 5-6: chart of response rates in percent (vertical axis from 66 to 78) by year, 2005 through 2010, with one series for the Interview survey and one for the Diary survey.]

FIGURE 5-6 Response rates (estimation rates) for Consumer Expenditure Interview and Diary surveys. SOURCE: Bureau of Labor Statistics, data table provided to panel.
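
The distinction between “collection rates” and “estimation rates” can be illustrated with a short Python sketch. All counts below are hypothetical, and the sketch assumes (our reading, not a documented BLS procedure) that records removed in post-collection processing are subtracted from the numerator while the count of eligible households in the denominator stays the same.

```python
def response_rate(responding: int, eligible: int) -> float:
    """Percent of eligible households that count as respondents."""
    if eligible <= 0:
        raise ValueError("eligible must be positive")
    if not 0 <= responding <= eligible:
        raise ValueError("responding must be between 0 and eligible")
    return 100.0 * responding / eligible


# Hypothetical counts for illustration only (not CE data).
sampled = 10_000
ineligible = 1_500            # ineligible cases, excluded from the denominator
interviewed = 6_400           # households interviewed at collection
removed_in_processing = 200   # records dropped for limited data entries

eligible = sampled - ineligible
collection_rate = response_rate(interviewed, eligible)
# Assumption: removed records are treated as nonrespondents.
estimation_rate = response_rate(interviewed - removed_in_processing, eligible)

print(f"collection rate: {collection_rate:.1f}%")
print(f"estimation rate: {estimation_rate:.1f}%")
```

Because removed records can only lower the numerator, the estimation rate in this sketch is never higher than the collection rate, which matches the direction of the adjustment described above.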
However, the CE’s response rate is comparable to consumption surveys in other countries, which have experienced similar declining response rates during the period between 1990 and 2010. Figure 5-7 depicts the response rate to the CE (United States) compared to response rates for comparable consumption surveys in Australia, Canada, and the United Kingdom (Barrett, Levell, and Milligan, 2011). The CE response rate is somewhat higher than the others.

FIGURE 5-7 Response rates for consumption surveys across four western countries. SOURCE: Barrett, Levell, and Milligan (2012).

Panel Attrition

Given that the CE Interview survey uses a rotating panel design with five waves of data collection per panel, an additional response concern relates to attrition over the life of a panel (Lepkowski and Couper, 2002). King et al. (2009) studied the pattern of participation in the CE. In this study, they looked at a single cohort first interviewed in April–June 2005. Among the households that completed the first wave interview, 78.6 percent were classified as complete responders (data for all five waves were captured), 14.1 percent were classified as attritors (completed one or more interviews before dropping out), and 7.3 percent were classified as intermittent responders (completed at least one but not all waves of data collection). It is not clear whether the response rates reported above are for all cohorts in a particular quarter (i.e., averaging across different panels), but the drop-off after the first wave of data collection raises both concerns and opportunities.

A key concern is that the decision to attrite may be influenced by the experience of the first wave, and that experience may be different depending on the number and types of expenditures reported in that wave. If this is the case, the attrition can potentially affect the estimation of expenditures during later waves. In other words, to what extent is later wave participation affected by reported expenditures (or expenditure patterns) in the first interview, holding constant the usual demographic variables that are used in weighting adjustments? The panel is not aware of research exploring this potential source of bias.

A key opportunity for the future: BLS can use households that provide partial data (i.e., attritors or intermittent responders), along with their level and pattern of expenditures, in the adjustment for missing waves. It is the panel’s understanding that the nonresponse adjustments employed by BLS use post-stratification adjustment to population controls in a cross-sectional adjustment and do not make use of the household’s expenditure data from other interviews. While this may be hard to do given the need to produce quarterly estimates, the panel nature of CE interview data provides a rich source of information to better understand patterns of nonresponse and potential nonresponse bias. Such information can also be used to target efforts to minimize nonresponse error in responsive design strategies (Olson, 2011).

The Diary survey also incorporates a panel design that interacts with the issue of attrition.
Each selected household is asked to complete two waves of data collection, each wave being a one-week diary. Even though the waves are adjacent weeks from the same households, the estimation process considers each wave as an independent collection. A general issue with diary surveys is that compliance tends to deteriorate as the diary period progresses, that is, there is high compliance at the beginning of the period and less toward the end. The panel has not seen specific research on this issue for the CE Diary, but it is likely that there is less compliance during the second wave than in the first wave. This may be particularly true in households in which both diaries are placed at the same time without an intervening visit from the field representative. Without adjustment during the estimation process, it is possible that the lower reported expenditures in wave 2 will bring down the overall level of expenditures reported from the Diary survey.
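
The three participation patterns reported for the Interview panel (complete responders, attritors, and intermittent responders) can be expressed as a simple classification rule. The Python sketch below is one possible reading of those categories, not the procedure used by King et al. (2009): it assumes that an attritor never returns after the first missed wave and that any other incomplete pattern is intermittent. The function name and the wave histories are hypothetical.

```python
from collections import Counter
from typing import Sequence


def classify_participation(waves: Sequence[bool]) -> str:
    """Classify a household's wave-by-wave participation.

    waves[i] is True if the household completed wave i + 1. The household
    is assumed to have completed the first wave, as in the cohort studied.
    """
    if not waves or not waves[0]:
        raise ValueError("expected a household that completed the first wave")
    if all(waves):
        return "complete"
    first_miss = list(waves).index(False)
    # Assumption: an attritor never returns after the first missed wave.
    if not any(waves[first_miss:]):
        return "attritor"
    return "intermittent"


# Hypothetical five-wave histories for illustration only.
histories = [
    (True, True, True, True, True),
    (True, True, False, False, False),
    (True, False, True, True, False),
    (True, True, True, True, True),
]
counts = Counter(classify_participation(h) for h in histories)
shares = {label: 100.0 * n / len(histories) for label, n in counts.items()}
print(shares)
```

Tabulating shares this way for each cohort, alongside first-wave expenditure levels, is one way the panel structure could be used to study whether attrition is related to reported spending.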
Disproportionate Nonresponse

Having a high proportion of the initially sampled individuals respond to the survey may lead to higher quality data; however, it is neither a sufficient nor necessary condition. King et al. (2009) reported on four studies to examine potential nonresponse bias in the CE. One of these studies was discussed above. Even though the studies showed that the nonresponse was not missing completely at random (African Americans are underrepresented among the respondents, and those over 65 years old are overrepresented), they did not find measurable bias in estimating total expenditures due to the nonresponse.

A more recent study suggests a potential bias related to underrepresentation of the highest income households (Sabelhaus et al., 2011). The authors began by comparing CE estimates of income and the distribution of income with other relevant data sources such as the Current Population Survey, the Survey of Consumer Finances, and tax return–based datasets from the Statistics of Income. These comparisons show that the CE has relatively fewer completed surveys from households with incomes of $100,000 or greater. The authors also showed that the average income estimated per household for this highest income group is substantially below the estimated average from these other sources.¹⁰

The authors demonstrated by comparing the CE sample to geocoded tax data that higher income units are less likely to respond on the CE and are underrepresented even after weighting, while units at lower levels of income are adequately represented. While there is not a large difference in the total population counts, the underrepresentation of the upper income groups could lead to an undercount of income in the higher income levels and consequently also to understating the aggregate level of spending. The authors are concerned because these high-income households may have a different mix of expenses when compared to other households.
This differential could affect the relative budget shares calculated for the CPI. The authors speculate that a significant portion of the difference between CE aggregate spending and PCE spending might be accounted for by the nonresponse of higher income consumer units. They concluded that:

    Only the very highest income households seem to be under-represented in the Consumer Expenditure Survey (CE), and the mystery of overall under-reported spending in the CE is not fully explained by that shortcoming. At least some of the shortfall in aggregate CE spending seems attributable by under-reported spending by at least some CE respondents. (Sabelhaus et al., 2012, p. 21)

¹⁰ Greenlees, Reece, and Zieschang (1982) carefully analyzed nonignorable income nonresponse (nonresponse related to the variable being imputed) using matched microdata from the CPS and IRS. They demonstrated clearly the problem that nonignorable nonresponse imposes.

Conclusion 5-15: Nonresponse is a continuing issue for the CE as it is for most household surveys. The panel nature of the CE is not sufficiently exploited for evaluating and correcting either for nonresponse bias in patterns of expenditure or for lower compliance in the second wave of the Diary survey. Nonresponse in the highest income group may be a major contributing factor to underestimates of expenditures.

ISSUES REGARDING NONEXPENDITURE DATA

The use of the CE data for economic research provides the impetus for collecting data on demographics, income, investment, and savings at the household level in the CE. The panel identified several issues in the current CE with these types of data that make the research process more difficult.

Reporting Periods for Income, Employment, and Expenditures

Ideally, researchers would like to have expenditures, income, and employment for responding households collected over the same reporting periods. The current Interview survey collects expenditure information for each of four quarters during a year. Income and employment information is collected for the previous 12 months, but only during the second and fifth interview. The current Diary survey collects expenditure data for two weeks, but income and employment data for the previous 12 months. The inconsistency of the collection periods for these different types of data can make it difficult to reconcile large differences in expenditure and income at the household level when these differences occur. While researchers have expressed the importance of having all data (including expenditures and income) collected over the same reference period, some panel members have expressed the opinion that it is also important to allow respondents to report for a period for which they can do so most accurately.
Demographics and Life Events

Examining the impact on household spending due to a variety of stimuli is important in the economic research done using the CE data. When changes occur, it is difficult to reconcile changes in expenditure and income without information about whether an individual household has undergone a major life event (e.g., marriage, divorce, or change in employment status) sometime during the year. The current CE collects relatively limited information on these major life events. For example, the CE Interview survey does not collect changes in job status (and the reason for those changes) between survey waves.

Linking with Administrative Data Sources

The ability to link CE data to relevant administrative data sources (such as IRS data or data on program participation) could provide additional richness for economic research as well as providing potential avenues to investigate the impact of nonresponse on the survey results. Data confidentiality procedures have presented barriers to such linkage. Some success in this area has been achieved by some other federal surveys that ask respondents’ permission to match their survey responses with administrative data. Some surveys have experimented with an “opt out” version, where respondents can say that they do not want the matching to occur (Pascale, 2011). These strategies would be useful to try for the CE.

Conclusion 5-16: For economic analyses, data on income, saving, and employment status are important to be collected on the CE along with expenditure data. Aligning these data over time periods, and collecting information on major life events of the household, will help researchers understand changes in income and expenditures of a household over time. Linkage of the CE data to relevant administrative data (such as the IRS and program participation) would provide additional richness, and possibly provide avenues to investigate the effect of nonresponse.

SUMMARY OF REASONS TO REDESIGN THE CE

This chapter specifically addresses the issues upon which the panel bases its recommendations in Chapter 6 to redesign the CE. The CE surveys appear to suffer from underreporting of expenditures. This conclusion is based on comparison of the CE estimates to several sources, but primarily to the PCE. The panel does not consider the PCE as “truth,” but does consider the results informative.
The comparisons were made considering categories of expenditures with comparable definitions between the CE and PCE. The overall pattern indicates that the estimates for larger items from the CE are closer to their comparable estimates from the PCE. The current Interview survey’s estimates for these larger items are closer to the PCE than those of the current Diary survey. For 36 smaller categories, neither the Interview survey nor the Diary survey consistently produces estimates that have a high ratio compared to the PCE. Thus, the panel concluded that there are underreporting issues with both the CE Diary and CE Interview surveys and proceeded to review response and nonresponse issues that could contribute to this underreporting.

Before examining sources of potential response errors in the Interview survey and Diary survey separately, the panel observed that the mode, questions, and context used in the Interview survey and Diary survey are quite different. Therefore, one ought to expect differences, both in issues that need to be addressed and in the estimates obtained. It is therefore not surprising that different amounts are reported in the Interview and Diary in this situation, and that these differences are not always in the same direction.

The panel examined potential sources of response error in both surveys. It concluded that both the Interview and Diary surveys have issues with:

• motivation of respondents to report accurately,
• structure of data collection instruments that leads to reporting problems,
• recall or reporting period, and
• proxy reporting.

Additionally, the panel expressed concern about the infrequent use of records in the Interview survey, an issue that is less relevant to a concurrent mode of collection.

The Interview and Diary surveys have similar response rates of 73 and 72 percent, respectively. These rates are lower than for some important federal surveys, but appear to be better than consumer expenditure surveys in some other western countries. There is concern about attrition within the panel designs for both surveys, as well as concern about the effect of disproportionate nonresponse from the segment of the population in the highest income groups.

In sum, there are response and nonresponse issues with both the concurrent (Diary) and recall (Interview) collection of data in the CE as currently implemented. The panel does not conclude that one method is intrinsically better or worse than the other. However, it does believe that different approaches to these methods have the potential to mitigate these problems.