Data Collection for CPI Construction
The data used to calculate most Consumer Price Index (CPI) subindexes originate from three different, though interrelated, sample-based sources: the Consumer Expenditure Survey (CEX), the Point of Purchase Survey (POPS), and the Commodities and Services (C&S) Survey.1 These surveys have evolved through time, and one of them, the CEX, has multiple uses in the government statistical system. Moreover, the data collection structure itself influences what indexes can and cannot be produced. For example, the current data system does not allow for production of non-urban-area indexes or regional price-level comparisons; nor does it support accurate price indexes for subpopulations such as the elderly, minorities, or the poor, particularly at subnational levels. Also, in order to reduce respondent burden, households are only asked about a portion of CPI item categories, which also inhibits the construction of some potentially useful, alternatively weighted (e.g., democratic) indexes.
There are two distinct approaches one can take when considering how the data underlying CPI computation might be upgraded. One is to assume that the basic data collection structure will remain as it is and then to seek ways of improving each of the survey components. Another is to redesign the entire data collection structure so that it reflects advances in data collection technology and so that the data collected are more consonant with the ultimate computation of the CPI. This second option would require a transition plan that takes the data system
from where it is today to where we aspire it to be. In this chapter, we discuss these two options.
THE CURRENT DATA COLLECTION PROCESS
The Consumer Expenditure Survey
The CEX is the primary tool for establishing CPI weights at the basic (218) item level. It is the most comprehensive source of combined household income and expenditure data produced by the statistical system; it is also very expensive to conduct. Nonetheless, a consensus is emerging among policy researchers that improvements should be made to the CEX. Probably the most frequently voiced criticism has been that the sample size is too small for the survey to be used for the range of applications to which it is currently put. However, another shortcoming of equal or greater importance—at least in the context of CPI construction—stems from nonsampling-related inaccuracies, such as survey response bias. Suggestions for improving the survey’s questionnaire design and substantive scope can be found in the research literature; in this section we review recent recommendations for upgrading the CEX, after first posing several questions that must be answered before a fully informed decision to change the survey can be made.
The panel’s foremost concern is with the extent of bias in the CEX which, in turn, affects the accuracy of CPI expenditure category weights. A starting point for evaluating household expenditure allocations estimated by the CEX is to compare them against weights generated by other sources. The Bureau of Economic Analysis (BEA) produces the most obvious alternative, the per-capita personal consumption expenditures (PCE) data, as part of the national income and product accounts (NIPA). During its postsurvey evaluation program, designed to identify areas in which the CEX could be improved, the Bureau of Labor Statistics (BLS) does compare the expenditure pattern of the CEX with that shown in the PCE component of the NIPA (Branch, 1994). Such comparisons might, depending on the outcome, raise a second question: Why not use, for the national CPI, upper-level weights derived from aggregate-level data, such as the PCE?2
In considering such an option, one must (1) judge whether or not the PCE weights are really superior for this application and (2) if they are, determine whether the CEX would still be needed for the CPI program. The answer to the second question depends in part on the value placed on area and group indexes, which could not be constructed using NIPA data. Budgetary considerations aside, there is no inherent reason why BLS could not produce a flagship CPI using NIPA-based upper-level weights, while producing other indexes based on the CEX.
Let us first address the accuracy issue. Branch (1994) provides a comparison of CEX and PCE expenditure categories for the period 1992-1995. The comparison is limited to the universe of categories that are comparably defined; this leaves out two major ones—owner-occupied housing and health care. For a few categories, such as rental rates, utilities, and vehicle purchases, the correspondence ratio (CEX weight divided by PCE weight) is near 1.00, which is what one would generally hope for. For equivalent rent of owner-occupied dwellings, the CEX expenditure weight is much larger than shown in the PCE data, by a ratio of almost 2 to 1. For all other categories, though, the CEX expenditure weight is much smaller, and many are in the 0.4-0.6 range. This discrepancy calls into question the accuracy of the CEX weights. There is also a problem (documented in Triplett, 1997) that the total expenditure of households implied by CEX and PCE weights is drifting further apart, perhaps by as much as 1 percent a year. One should not jump to the conclusion that these differentials imply an accurate PCE and an inaccurate CEX, but the wide discrepancies clearly warrant further investigation since the two sets of expenditure weights cannot both be correct.
What is known about the relative strengths of the PCE and the CEX data? For certain types of expenditure categories, well-documented sources of household response error damage the credibility of CEX weights. Triplett (1997:15) states that “reporting biases are known to be serious in some consumer expenditure components.” For instance, households may underreport “vice” products such as alcohol or tobacco—for 1995, the ratio of CEX to PCE expenditure shares on alcoholic beverages was a dismal 0.34. In addition, survey respondents often fail to accurately recall the volume or timing of some frequently purchased items: for example, “other entertainment” has a correspondence ratio of 0.37, and “miscellaneous” has a ratio of 0.24. For a number of other categories, such as furniture or appliances, it is less clear why expenditure weights differ as sharply as they do between the PCE and CEX. (The ratio of CEX to PCE weights for the “household furnishings and equipment” expenditure category was 0.65-0.66 for the 1992-1995 period.) Here the problem may involve the PCE as much as the CEX. Businesses and governments buy furniture and appliances, but do not necessarily report (or categorize) these purchases in their accounting systems in a consistent way that allows them to be accurately identified and reported. The fact that other items (e.g., books, televisions, sound equipment) that are purchased broadly by both households and businesses—and for which it makes no obvious
sense for households to underreport—show large weight differentials might support this notion (Triplett, 1997). For other components, such as rent or auto purchases, for which reporting rates are known to be high, it is encouraging that the ratios of CEX to PCE weights are close to 1.
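The correspondence ratios discussed above are simple to compute once matched category shares are in hand. The sketch below is illustrative only: the alcoholic beverages pair is constructed to reproduce the 0.34 ratio cited in the text, while the other category shares are hypothetical placeholders, not published figures.

```python
# Compute CEX/PCE correspondence ratios for matched expenditure
# categories and flag those far from 1.00. All shares below are
# illustrative; only the 0.34 alcohol ratio mirrors the text.

def correspondence_ratios(cex, pce):
    """Return {category: CEX share / PCE share} for matched categories."""
    return {cat: cex[cat] / pce[cat] for cat in cex if cat in pce}

cex_shares = {"rent": 0.060, "alcoholic beverages": 0.0034, "misc": 0.012}
pce_shares = {"rent": 0.058, "alcoholic beverages": 0.0100, "misc": 0.050}

ratios = correspondence_ratios(cex_shares, pce_shares)
# Categories outside a loose band around 1.00 warrant investigation.
suspect = {c: round(r, 2) for c, r in ratios.items() if not 0.8 <= r <= 1.25}
print(ratios["alcoholic beverages"])  # 0.34
```

A ratio near 1.00 (as for rent here) is consistent with both sources measuring the same spending; ratios like 0.34 signal that at least one source is wrong.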
There are ways in which the PCE data system appears more developed. The PCE has the advantage that it is based on large surveys of businesses (the most prominent being the Census Bureau’s Retail Trade Surveys) that generally keep careful records and that rely less on respondent memory than does the CEX. Triplett (1997:16) does note a “birth bias” in the establishment surveys that arises because there is no mechanism for bringing new businesses into the sampling frame quickly. However, data from the censuses of manufacturers, retail trade, and service industries allow PCE component weights to be revised periodically and benchmarked every 5 years, which surely corrects some reporting and other biases. The benchmarking resets the allocation of purchases by commodity among business, government, and households and updates commodity lists. Furthermore, the BEA methodology for keeping track of inputs and outputs includes cross-checks that impose consistency on the data.
A major advantage of the CEX weights is that they are derived directly from reported household expenditures. One benefit of this direct reporting is that it allows household characteristics to be linked to expenditure information and, in turn, subpopulation indexes such as the CPI-E and CPI-W to be calculated. To produce the PCE weights, business and government spending must be subtracted out of sales data. Thus, the PCE is an indirect measure, calculated residually as final goods and services minus purchases made by nonconsumer sectors. Triplett (1997:16) notes that it is especially difficult to calculate consumption shares at more refined item levels because sales to consumers are not always distinguishable from sales to businesses and government: “The finer the level of detail, the more likely that the long chain of computations necessary to reach the PCE’s indirect estimate of consumer spending will have cumulative errors that affect the totals.” Even so, it seems implausible that estimates of business purchases of consumer goods could be off by enough to generate the kind of ratios between NIPA and CEX weights that are now produced.
Difficulties associated with separating business from consumer purchases are compounded by the fact that the PCE covers a wider scope of goods and services than does the CEX. For instance, PCE coverage includes elements of government consumption, such as Medicare and Medicaid, the employer-paid portion of medical insurance, financial services, expenditures by nonprofit institutions, and the value of certain goods and services received in kind by households (Clark, 1999). As discussed throughout this report, the CPI currently covers only out-of-pocket expenditures by urban households. All told, about 25 percent of PCE spending is not reflected in the CPI. This, in itself, redistributes expenditure shares substantially. For instance, the medical care category (since it is not limited to out-of-pocket expenditures) gets a much higher weight in the PCE—
17.6 percent for 1998—than it does in the CPI—5.6 percent for 1998. Also, not all items are defined comparably: in the CEX, for example, expenditures on new cars net out any amount paid for a trade-in vehicle; the PCE tracks the gross amount paid for vehicles, and trade-ins are not taken into account (Clark, 1999).
This imperfectly matched expenditure classification creates a major hurdle to producing a PCE-weighted CPI, though Branch (1994) was able to make adjustments to reduce the noncomparability. For instance, utilities in the PCE can be combined with rent in order to match rent as it is defined in the CEX. But the following categories could not be reconciled for purposes of comparison: home-owner shelter (owners’ equivalent rent, as we noted earlier, has recently accounted for around 20 percent of expenditures in the CEX and only around 11 percent in the PCE), capital improvements, health care, insurance, and finance charges (Branch, 1994). Assuming that the basic CPI item structure will remain as is, it is not clear how this problem should be resolved. To maintain the current CPI scope, the additional PCE entries would need to be backed out. Furthermore, BEA actually uses CEX data to estimate expenditure for a small number of commodities—personal computers, vehicle rentals, day care (Triplett, 1997)— which is another reason why moving to PCE weights might not allow the CEX to be eliminated.
On the basis of available evidence, it is unclear whether PCE or CEX weights are superior. What is clear, though, is that for some components the two systems produce very different results. The major hurdle inhibiting comparisons among indexes weighted using alternative source data is the lack of uniformity in the scope and definition of goods and services covered. It is an open question as to how accurately expenditure categories can be mapped from the PCE to the CEX. We are not in a position to advocate one set of weights over the other, but the question certainly warrants further investigation—and this is what we recommend in the final section of this chapter.
The CEX is used by BLS to determine the base period household expenditure shares for each of the 11,772 basic CPI strata. The CPI has traditionally determined these quantities from a 3-year span of CEX data; current weights reflect expenditure shares calculated from the 1993-1995 surveys, with immediately prior weights based on the 1982-1984 surveys. In 1998 BLS announced that it would update and apply 1999-2000 expenditure weights effective January 2002 and revise these weights every 2 years, instead of roughly every 10, as has been its prior practice (see the “Technical Notes” at the end of the chapter for additional details about the CEX). To accomplish this objective—which necessitates combining only 2 years of survey data instead of 3 and increasing the per-year number of basic CPI strata for which quantity information is obtained—and to maintain roughly the current level of statistical accuracy, the sample must be
increased by 50 percent. The recently requested increase in the effective annual CEX sample size, from 5,870 to about 7,500, approximately accomplishes this.
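The 50 percent figure follows from simple arithmetic, assuming sampling variance inversely proportional to the total number of pooled observations: pooling 2 years of data instead of 3 while holding the total constant requires a per-year sample 3/2 as large. The 5,000-per-year figure below is a hypothetical round number for illustration, not an official sample size.

```python
# Back-of-envelope check: moving from 3 pooled survey years to 2 while
# keeping years * per_year_sample (and hence, roughly, the sampling
# variance of the weights) constant. The 5,000 figure is illustrative.

def required_per_year_sample(old_years, new_years, old_per_year):
    """Per-year sample that keeps the pooled observation count constant."""
    return old_per_year * old_years / new_years

n = required_per_year_sample(old_years=3, new_years=2, old_per_year=5000)
print(n)  # 7500.0, i.e., a 50 percent increase
```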
The decision to update CPI-U and CPI-W expenditure weights every 2 years beginning in 2002 was based on a tradeoff between timeliness and concern about “chain drift,” which can occur when the price indexes of non-identical items must be linked.3 BLS agreed with critics (such as Boskin et al., 1996) that the weights should be updated more often than every decade or so as in the past, but little theory or empirical evidence existed to provide guidance on the optimal frequency of updates. BLS chose to move to the more frequent end of the spectrum, every 2 years. There were some operational issues that argued for not updating every year. For instance, BLS reports that there is an advantage to having “off years” in which changes in CEX forms can be implemented without the time pressure of employing the data in the CPI. The main reason, however, was simply that the approach of updating weights every year, which would require overlapping 2-year CE weights, was untested and its statistical properties were uninvestigated. BLS noted that, in its experience, changing index formulas can produce unexpected and undesirable results, so it decided to err on the side of caution by not going to annual updating.
The CEX targeted sample sizes are 6,160 per quarter for the Quarterly Interview Panel Survey and 5,870 per year for the Diary Survey. Because a larger sample increases the precision of an unbiased estimate, recommendations to increase the CEX sample size (primarily directed at the Diary Survey) have tended to be of the “more is better” variety. However, if, as we have pointed out, the weights from the CEX are not unbiased, a decrease in sampling variability might do little to reduce mean squared error, which is what we ultimately care about.
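The point rests on the standard decomposition of mean squared error into squared bias plus variance: when bias dominates, shrinking variance buys little. The bias and variance figures below are hypothetical, chosen only to make the decomposition concrete.

```python
# Mean squared error = bias^2 + variance. If CEX weights carry
# nonnegligible bias, enlarging the sample (which reduces variance but
# not bias) yields only a limited MSE improvement. Numbers are
# illustrative, not estimates of actual CEX error.

def mse(bias, variance):
    """Mean squared error of an estimator with the given bias and variance."""
    return bias**2 + variance

bias, var = 0.03, 0.0004       # hypothetical weight bias and sampling variance
print(mse(bias, var))          # 0.0013
print(mse(bias, var / 2))      # halving variance: 0.0011
print(mse(0.0, var))           # eliminating bias instead: 0.0004
```

Here, doubling the sample (halving the variance) removes only about 15 percent of the MSE, while removing the bias removes nearly 70 percent.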
A recent report from the Conference Board recommended increasing the annual sample size of the CEX “perhaps initially to 30,000 households.” The
recommendation was supported with the following comments (Conference Board, 1999:18):
This [the currently proposed increase in the sample from 11,000 to 15,500] seems inadequate. The comparable Canadian survey, to cover an economy only one-tenth our size, is reported to cover 16,000 households. Sampling theory shows that optimum sample size is not proportional to the overall size of the economy. Nevertheless, we feel that a larger U.S. Consumer Expenditure Survey will achieve a worthwhile gain in the accuracy of the weights used in the CPI. A larger [CEX] sample will also improve other important statistics, such as poverty data.
While the intuition of the members of the Conference Board group might be right, their recommendation would have carried more weight had they been able to offer a statistically based explanation of how they concluded that the benefit of increasing the sample to 30,000 consumer units would be “worthwhile” and that the value of gains in accuracy from increasing the sample would likely offset the cost of expanding the CEX.4 Likewise, no indication is given as to why and by what measures, in the group’s estimation, the planned 15,000-15,500 household sample is unacceptably inaccurate. The Conference Board did note two factors underlying its recommendation that the CEX should be larger: that it would be valuable for improving other national statistics (besides the CPI) that are based in whole or in part on the CEX (e.g., poverty and savings rates) and that a larger national sample is needed to calculate subgroup indexes.
In a somewhat more careful statement (but one lacking a specific recommendation), Triplett (1997:15) writes:
. . . The [CEX] sample size (5,000 consumer units) is certainly too small for almost any use for which one wants consumption data. . . . The recently announced increase from 5,000 to 7,500 [CEX] consumer units is a positive, but grossly insufficient, step. . . . The [CEX] is the federal government’s only general purpose survey of consumer expenditure. . . . For comparison, the Canadian consumer expenditure survey will soon have a sample size of 36,000.
. . . The [CEX]’s small sample size and lack of a benchmarking statistic means that its estimates for smaller components (e.g., household textiles) particularly are not as reliable as one would want for serious research on consumption. Also, the weights for the individual 207 basic components of the CPI are not
determined accurately from a [CEX] of only 5,000 consuming units, although it may also be true that the variance imparted into the overall CPI may be small.
Increasing the CEX sample size would also enhance its ability to support other potential uses. For instance, the current sample is not large enough to track trends among population subgroups as needed for a consumption-based poverty measure—especially for specific regions. Also, because it is the only U.S. survey that generates income and consumption microdata together, it is important for research on household savings behavior and on how that behavior varies with age, income, and other factors. Other agencies (the Congressional Budget Office and the Department of the Treasury) use the data for modeling tax revenues and other research purposes. Better data would certainly improve research prospects in these areas as well.
Finally, in defense of BLS’s request for the approximately 50 percent increase in the CEX sample size, Commissioner Abraham testified (Bureau of Labor Statistics, 1998) that the increase “would let us produce superlative measures to a degree of precision comparable to the precision of the current CPI. . . . We currently use three years’ worth of data in producing the market basket weights for the CPI. For the superlative measures, you use two years’ worth of data, so with a 50 percent increase in sample size, we would have about the same precision in the weights.” But there is no reference to the targeted level of accuracy, nor to the impact of the increase in sample size on the precision of the current CPI computation. The commissioner’s statement merely says that the increase in sample size will enable BLS to estimate a CPI with variance characteristics similar to those of the current CPI computation.
Given the current state of assessment, it is difficult to offer recommendations about the sample size of the CEX. The most pressing practical issues require weighing the cost of expansion against the advantages that changing the survey sample size would have on the accuracy of expenditure weights and, in turn, the relationship between weight accuracy and index variance. The panel carried out some analysis on this issue. The variance of the CPI, reflecting both the variance of the aggregation weights from the CEX and the price relatives from the C&S survey, is referred to by the BLS as the unconditional variance. The variance of the CPI, reflecting only the variance of the price relatives from the C&S survey (and treating the aggregation weights from the CEX as constant), is referred to by the BLS as the conditional variance. The ratio of the unconditional to the conditional variance is of the form

$$\frac{V_u}{V_c} = 1 + \frac{V_w}{V_c},$$

where $V_c$ is the conditional variance and $V_w$ is the additional variance contributed by estimation of the CEX aggregation weights.
Thus, increasing the sample size of the CEX by a factor of $f$ changes the ratio of unconditional to conditional variance to

$$1 + \frac{V_w}{f\,V_c},$$

since the weight-related variance component $V_w$ scales inversely with the CEX sample size ($V_c$ denoting the conditional variance as before).
Let $t$ be an observed ratio of unconditional to conditional standard error. Then the impact of increasing the sample size of the CEX by a factor of $f$ will be to reduce $t$ to

$$t_f = \sqrt{1 + \frac{t^2 - 1}{f}}.$$
Leaver and Valliant (1995:Table 28.2) provide ratios of the median unconditional to median conditional standard errors across time for major item groups for the period January 1987 through December 1991. Our panel received an update of these ratios for 1998 and 1999 for the “all items” and eight major item group series (Leaver, private communication). These ratios range from essentially 1.00 to a high of 1.23 (for the 1998 apparel index based on a 12-month price change). The all-items ratio ranged from 1.03 for an index based on a 1-month price change to 1.09 for an index based on a 12-month price change.
To see the impact of an increase in the sample size of the CEX, consider an extreme case—in 1998 the 12-month price change in the apparel CPI had a conditional standard error of 0.00811844 and an unconditional standard error of 0.00997372 for a ratio of 1.22853. Doubling the CEX sample size would have reduced this ratio to 1.120107. The apparel CPI went from 131.6 in December 1997 to 130.7 in December 1998. The 95 percent confidence interval for the December 1998 apparel CPI, based on the 12-month change from December 1997, is (128.1, 133.3). The comparable confidence interval, based on a doubled sample size, would be (128.3, 133.1). The panel therefore concludes that there is little evidence to support the recommendation to double the sample size of the CEX.
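The apparel example can be reproduced with a few lines of arithmetic, using the relation $t_f = \sqrt{1 + (t^2 - 1)/f}$ implied by the quoted standard errors. Treating the standard errors as relative to the index level is our simplifying assumption; it approximates, but is not identical to, the exact BLS interval computation.

```python
import math

# Effect on t, the ratio of unconditional to conditional standard
# error, of scaling the CEX sample by a factor f. Inputs reproduce the
# 1998 12-month apparel CPI figures quoted in the text.

def scaled_ratio(t, f):
    """Unconditional-to-conditional SE ratio after an f-fold CEX sample increase."""
    return math.sqrt(1.0 + (t**2 - 1.0) / f)

se_cond, se_uncond = 0.00811844, 0.00997372
t = se_uncond / se_cond              # about 1.2285
t_doubled = scaled_ratio(t, f=2.0)   # about 1.1201 after doubling the sample

# Rough 95 percent half-width for the December 1998 apparel CPI (130.7),
# treating the standard error as relative to the index level.
half_width = 1.96 * se_uncond * 130.7  # about 2.6 index points
```

The modest narrowing of the interval (roughly 2.6 to 2.3 index points) is what leads the panel to its conclusion about doubling the sample.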
All this speaks to the index as a whole. One might also want to study the effect of increasing the CEX sample size on the variances at the basic CPI strata level. Acceptable (and optimal) error and variance levels must be defined specifically for the types of indexes that are desired; only then can they be evaluated against the cost of expanding the survey.5 In other words, one needs to determine the appropriate level of disaggregation at which to assess the effect of a change in sample size of the CEX.
The following list summarizes research that should be taken into account before BLS statisticians can definitively target an efficient sample size for the CEX:
Accuracy. As discussed in the previous subsection, it makes little sense to pursue more precise estimates of a biased measure. Differences in expenditure shares estimated by the CEX and PCE must be better understood and (at least partially) reconciled.
Precision level. If it is established that the CEX is the best option for setting expenditure weights, BLS should establish precision requirements for those weights. The requirements must be informed by an understanding of how precise the CPI needs to be in terms of estimating the level and trend of the index. A primary driver for the sample size would be the extent to which population subindexes are desired. Precision requirements must also be established for other important uses of the CEX, which may have demographic or geographic dimensions of their own.
Cost of expanding the survey sample. The cost of CEX operations should be examined in relation to survey size and design characteristics. BLS and the Census Bureau have a fairly accurate idea of how much it costs to expand the sample size (they now have the experience of the 50 percent increase). In addition to sample size, there may be a considerable clustering effect (both in terms of statistical performance and cost) in the CEX. What is the optimal scheme for clustering surveyed households and designating sampling units? Also, since the survey has many uses other than for BLS weighting, evaluations should consider whether BLS should bear the full budget burden of future changes to the CEX; a cooperative effort shared by the Office of Management and Budget, BEA, the Federal Reserve Board, etc., may be more appropriate.
Value of CEX. To redesign the CEX, or to expand its sample size, one needs an inventory of all of its key uses and a sense of their value. We know the CEX data are fundamental to the CPI. All uses, including the CPI, must be considered in making recommendations about the design or size of the CEX.
In addition to questions of frequency, sample size, and accuracy, there are a number of additional issues that involve assessing the information content of questionnaires and the general structure of the CEX. Many of the issues have already been addressed to varying degrees by the BLS and others. Improving the CEX will involve continued assessment of the effectiveness of the interview and diary survey approaches, what methodologies minimize underreporting of purchases or attrition from a diary panel, the appropriate universe of households and goods and services to be covered, and the role of incentive programs in increasing survey accuracy and reducing nonresponse. It will also require answers to
questions about how the mode of data collection might be modified to take advantage of new computer-based data collection methods, whether all expenditures for all item categories should be collected from all households surveyed or just some from each, and what processing system is required for the CEX in order to expedite development of a superlative index.
Answers to all three types of questions hinge on the types of indexes that BLS will be called on to produce. For instance, there are increasing demands for subpopulation and geographic (both price level and price change) indexes. Recommendations for modifying the CEX can only be reasonably determined after the BLS and policy makers decide the importance and value of calculating these special-purpose indexes. Assuming different expenditure weights apply to each, a much larger CEX sample will be required.
The Point of Purchase Survey
A second major survey input to the CPI is the POPS, which is used to determine which outlets BLS data collectors will visit in the C&S survey to record actual prices.6 The POPS produces outlet-specific expenditure information for item categories so that a sample of those outlets can be selected with a probability proportional to consumer use. The POPS is needed because the CEX does not ask consumers where they purchased goods. In addition to its role in selecting outlets to which BLS agents go to price specific items, POPS expenditure data are also used to implicitly assign quantity weights to all items priced within a single item stratum (see Cage, 1996:fn. 14 for details). Within the current data support system, the POPS data have been improved in terms of their effectiveness at identifying outlets where households shop and as an input for averaging price quotes within CPI item cells.
The entry-level items (ELIs) in the CEX are not isomorphic with the POPS categories. Thus, some concordance and other adjustments are necessary to match the quantities from the CEX with the prices and price relatives determined from the C&S survey driven by POPS. In a newly designed data system, it seems likely that this mismatch could be eliminated.
There is a substantial overlap between POPS and CEX. If the CEX had no use other than to provide upper-level weights for the CPI, it would make sense to redesign POPS so that it would be the survey vehicle to perform this function as well. This change would then allow for greater index design flexibility, but it would probably increase the sample size required in POPS and also increase the response burden for each participating household.
The POPS provides sample outlets covering items that account for about 72.5 percent of the CPI (as measured in expenditure shares). A housing survey is used for shelter components of rent and owners’ equivalent rent, and other sources are used for a few other commodities and services (see the “Technical Notes” at the end of the chapter for additional information about the POPS).
The Commodities and Services Survey: Outlet Pricing
The number of price quotes that are collected is determined at the ELI and index-area level in a process called sample allocation. The stated objective of sample allocation is to produce the most accurate national-level all-item index possible, given the budget constraints. Through this process, item strata in each area are assigned a minimum number of price observations. In practice, this means that sampling rates are dictated, and will be higher, for ELIs that represent a large expenditure weight or display high price variability, as is the case with such items as apples and bananas (Lane, 1996).
The CPI’s C&S is a longitudinal survey that tracks changes in price quotes for most CPI-sampled consumer items over time.7 A few price quotes come from other sources: for instance, the CPI housing survey performs the same function for the shelter category. As described in Chapter 5, the specific items for which (and outlets from which) the C&S samples price quotes are rotated simultaneously. The POPS provides the sampling frames for outlets by producing estimates of expenditures for items in specific POPS categories (corresponding roughly with strata) at specific outlets. Based on POPS results, specific ELIs are assigned to each sample outlet. Each ELI has a checklist of product specifications so that a BLS field agent can identify specific items from the ELI category that are sold at the selected sample outlet. Field agents select a unique item (from within the preselected ELI category) for pricing based on a probability distribution of sales, with high-expenditure items (within that outlet) being more likely to be selected than low-expenditure items. The process whereby an agent narrows down the list of potential items from the ELI group to a specific item is called disaggregation (see Lane, 2000:9, for details). After a unique item is selected, the agent returns to the same outlet every month (or, in some cases, every 2 months) to record the price change. This process is repeated as long as the outlet continues to sell the item or until the outlet is rotated out of the sample. If the item is permanently discontinued, the agent consults a “characteristics” checklist and determines the most comparable replacement to price.
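The disaggregation step described above amounts to a probability-proportional-to-sales draw of one item from the ELI category. The sketch below illustrates the idea only; item names and sales figures are hypothetical, and the actual BLS procedure works through a structured checklist rather than a literal computation like this.

```python
import random

# Illustrative "disaggregation": select one unique item from an ELI
# category with probability proportional to its sales at the outlet,
# so high-expenditure items are more likely to be chosen for pricing.
# All names and sales figures are hypothetical.

def select_item(sales, rng):
    """Probability-proportional-to-sales draw of a single item."""
    items = list(sales)
    weights = [sales[item] for item in items]
    return rng.choices(items, weights=weights, k=1)[0]

outlet_sales = {
    "brand A cola 2L": 5200,   # dominant seller at this outlet
    "brand B cola 2L": 2600,
    "store-label cola 2L": 1300,
}
rng = random.Random(0)  # seeded for reproducibility
picked = select_item(outlet_sales, rng)
```

Once an item is picked, the agent reprices that same item on each visit, mirroring the longitudinal design of the C&S survey.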
As discussed elsewhere in the report, problems may arise with this pricing system—for instance, if an item is first priced when it is on a special sale or when a specific item remains on store shelves long after a large reduction in its market share. BLS continues to explore methods for improving the quality of price data. The most visible experimental activities involve expanding the use of electronic data, which may offer such advantages as larger samples, reduced variances, more accurate determination of in-store sales shares, more timely publication of superlative indexes, and the potential to use unit pricing.
ALTERNATIVE DATA COLLECTION APPROACHES
Since most options for improving CPI input data, particularly those involving the household surveys, are expensive, and because there is methodological inflexibility under the current system, it is worth considering entirely new data alternatives. Of course, any net benefit of these alternatives hinges on exactly what types of indexes are desired—COLI or fixed-basket, national or regional, plutocratic or democratic, aggregate or subgroup. Other than the PCE-based expenditure weighting possibility, the two most obvious options for breaking from the current data system involve (1) combining POPS and CEX into an integrated survey that contains expenditure and outlet-use data at detailed product levels, along with household demographic information needed for subgroup indexes; and (2) moving toward scanner-based collection systems, which could be used to improve the existing surveys or as a component of an alternative. Current experimentation by BLS using scanner data illustrates its potential within the existing framework. Integrating scanner data as part of a POPS/CEX combined survey, or into a comprehensive household-based pricing system, would entail more radical shifts in CPI methodology.
One advantage of restructuring the entire data support apparatus would be that it could be designed to fulfill current indexing needs. However, as the environment and uses of the index change, even such an optimal data system moves toward obsolescence unless it is much more flexible than current systems. In this section, we examine some approaches to improving the data support system under the assumption that radical changes are one option.
An Integrated CEX/POPS Survey
The CEX and POPS were introduced at different times and evolved out of different needs in an uncoordinated way. The CEX was developed to provide detailed data on household-level expenditure patterns. BLS has been producing expenditure surveys in one form or another since the late nineteenth century; however, their production was sporadic (usually not more often than every 10-20 years) in the early part of the century and was motivated by a range of different needs. The 1960-1961 survey was constructed with the primary purpose of revising weights for the CPI and was not limited to urban wage earners, as had typically been the case with previous surveys. The 1972-1973 survey was the first to use the modern interview and diary components, and the sample was selected on a probability basis (Jacobs and Shipp, 1990). The POPS was introduced to provide information about where consumers shop—information provided neither by the CEX nor by existing sources of business sales data. Also, existing lists were typically based on the Standard Industrial Classification (SIC) system, which is not concordant with BLS-defined ELIs.
Because both CEX and POPS are household-based surveys, it is natural to
consider the possibility of merging the two into a single survey. Intuitively, it seems there should be economies of scale in combining them, as well as advantages to having more complete records (both expenditure and shopping pattern data) for each household. While we do think this possibility is worth investigating, there are many complicating factors. To begin with, the reference periods are now different for the two surveys. The quantity weights from the CEX require updating over a longer periodic cycle—formerly every 10 years, but now moving to every 2 years (without necessarily implying a change in the item structure every 2 years); outlet rotation weighting, based on POPS, is done every 4-5 years on average and, since POPS is a continuously rotating survey, a subset of items and areas is considered for change every quarter. Whether or not these are optimal frequencies has not been determined. It is possible that adequate rotation and weighting schemes could be produced from a single survey, but at present the issue remains largely unexplored.
The level of item detail needed to obtain CPI item strata weights and to select outlets and ELI samples is also different in the two surveys. Since POPS asks about product expenditure in greater categorical detail, it is generally believed that it requires a larger sample size to produce accurate probability schedules. It is possible that a unified survey could partition respondents into two or more groups, with some being asked more detail than others (something akin to the census short and long forms). Respondent burden could also be reduced if each household continues to be asked only about a subset of CPI items.
Defenders of the current system could also point out that a combined survey that generates expenditure, demographic, and outlet information concentrates respondent burden unnecessarily. Detailed demographic information is missing from the current POPS; outlet usage information and adequate sample size are missing from the CEX. A combined survey would likely entail greater demands on any given respondent, and the CEX is already considered one of the most burdensome government surveys. There is also a range of data quality issues that would require investigation. The CEX sample may be more representative of the population since it is based on samples drawn from census household files, not on random digit telephone sampling as is the POPS. Each CEX household also reports on a larger share of total household expenditures than does a POPS respondent. Further complicating the issue is the fact that the CEX is used for research and policy purposes other than the CPI.
The most obvious advantage of the multisurvey data system now in place is that—relative to the size of expensive consumer surveys—a large number of price quotes can be generated (and at a reasonable cost) for each specific item that is ultimately tracked by the CPI. This is because price data are not linked to specific households. Households provide just enough information for BLS to assign weights to broad item categories and to identify high-use outlets. If prices had to be gathered from households in the manner laid out in Chapter 8, the survey would presumably have to be much larger (than either the current POPS or
CEX) to ensure an adequate sample of prices for each ELI area cell for the CPI. Yet the real advantage of a survey that links prices paid for specific items to the purchasing households is that, in principle, from such data one could calculate average prices paid for specific items by different household types. The big question is what size household sample would be required to support such an index or, more realistically, how big a sample would be needed to make an experimental pilot project work. This question is discussed in Chapter 8.
Scanner Data

In this section we outline how scanner data work and identify some potential operational and measurement benefits that may be gained by increasing their use; we also point out limitations. However, reflecting the panel’s charge, the primary emphasis is on how the use of scanner data (and electronic data in general) might allow greater conceptual flexibility when constructing price or cost-of-living indexes. The discussion comments on the extent to which current BLS research and experimental programs may affect CPI pricing procedures. The panel also assesses the value of incorporating scanner-based pricing methods within the context of its more general recommendations concerning the feasibility and advisability of pursuing a COLI approach. We first look at the potential of point-of-sale scanner data and how it could be used to improve data accuracy and price collection procedures. We then look at the more futuristic idea of household-based scanner data.
Point-of-Sale Scanner Data
The most obvious way in which scanner data could be used to support the CPI would be as a replacement for or supplement to the C&S survey of outlets. Scanners in retail outlet checkout counters record Universal Product Codes that identify specific products and their manufacturers. These data are collected, collated, and sold by two major producers of scanner data: ACNielsen and Information Resources, Inc. (IRI).
A growing literature on the topic is beginning to provide an indication of the feasibility, as well as the benefits and drawbacks, of using scanner data in the production of price indexes. While academic researchers in both the United States and Europe have begun exploring how scanner data could be used to improve the statistical properties of price indexes, BLS has moved to the forefront on work in the area.8 Reinsdorf (1996) successfully constructed a basic item-level index for coffee using scanner data. Currently, the BLS’s ScanData initiative is producing
indexes for breakfast cereal in the New York City area from data provided by Nielsen. To date, Laspeyres, Tornqvist, and geomean indexes have been produced; Paasche and Fisher indexes are under consideration. The BLS team is moving to construct the index for broader geographic areas as well. As additional areas are added, they will use current CPI aggregation weights and Laspeyres formula (Richardson, 2000:11).
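The basic formulas involved can be sketched as follows. The price and quantity figures are hypothetical, and the implementation is a simplification of the ScanData procedures:

```python
import math

def laspeyres(p0, pt, q0):
    """Ratio of current-price to base-price cost of the base-period basket."""
    return sum(a * b for a, b in zip(pt, q0)) / sum(a * b for a, b in zip(p0, q0))

def geomean(p0, pt, s0):
    """Weighted geometric mean of price relatives, base-period expenditure shares."""
    return math.prod((pt[i] / p0[i]) ** s0[i] for i in range(len(p0)))

def tornqvist(p0, pt, s0, st):
    """Superlative index: shares averaged across base and current periods."""
    return math.prod((pt[i] / p0[i]) ** ((s0[i] + st[i]) / 2) for i in range(len(p0)))

def shares(p, q):
    """Expenditure shares implied by prices and quantities."""
    total = sum(a * b for a, b in zip(p, q))
    return [a * b / total for a, b in zip(p, q)]

# Hypothetical scanner price quotes and quantities for two cereal items
p0, q0 = [4.00, 3.00], [100, 200]
pt, qt = [4.40, 2.70], [80, 260]
s0, st = shares(p0, q0), shares(pt, qt)
```

With base-period shares, the geometric mean embodies an elasticity of substitution of one, while the Tornqvist uses both base- and current-period shares; the Tornqvist requires quantity data for the current period, which scanner data provide directly.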
Scanner data offer several potential advantages. First, such data could streamline item pricing procedures. Using computer-captured scanner data could reduce the number of manual steps in the C&S survey required to produce subaggregate indexes. Scanner price data may replace or reduce the need to visit stores to price items.
Second, scanner data could generate a more representative selection of items for pricing. Scanner data include the universe of products sold (at outlets that have scanner technology), whereas the current quote sampling method records prices for only a small fraction of items on store shelves. CPI price quotes are drawn from items at outlets made eligible by selection in the most recent POPS sample, and BLS collects prices for selected items whether or not they have been sold at the POPS-identified outlet. In contrast, scanner quotes exist only for items actually sold during the pricing period, and transaction scanner data also capture the volume of sales. Some stores also maintain files that drive the price identification system and indicate the shelf price of all items for some period, such as a week, whether or not they were sold. CPI outlet and item samples are rotated periodically, every 4 years under current practice. In contrast, since scanner data can include the universe of transacted prices at covered outlets, samples are refreshed continuously and new items appear in the data much more frequently. For the BLS’s ScanData geomean and Laspeyres test indexes, weights and item samples are updated each year on the basis of the previous year’s expenditure patterns (Richardson, 2000).
Third, scanner data could improve sampling accuracy. Scanner data have introduced new capacity to calculate highly accurate average prices for specific commodities. The large number of outlets and item price points associated with scanner data offer the potential to greatly decrease sample variance and improve data precision. As pointed out by both the Boskin commission (Boskin et al., 1996) and the Conference Board (1999), the high volume of scanner data would allow for production of indexes at finer levels of product detail. Additionally, scanners record actual transaction prices, not shelf prices at which transactions may or may not have taken place for the relevant period. These features may help certain data users, particularly those that perform industry studies or types of analyses where average price movement over fairly short periods is more relevant than shelf price at a given point in time. The tentative results of BLS’s ScanData New York experiment—which provides some evidence as to how far these scanner data may improve underlying data quality—have been quite promising. The indexes produced from scanner data have displayed less variability than the CPI sample price counterpart. For cereal in New York, the sample size of price quotes is more than 1,400 times the number in the traditional CPI data. To the extent that this increase translates into a reduction of standard errors, it should create greater index precision. Though it was surmised (Richardson, 2000) that this would reduce the standard error of the cereal CPI by a factor of about 38, a careful study of these data by Leaver and Larson (2001) shows that the reduction in the standard error was by a factor of about 6.
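The arithmetic behind these two factors is straightforward. Under simple random sampling the standard error of an estimate falls with the square root of the sample size, so:

```python
import math

# A 1,400-fold increase in price quotes would, under simple random
# sampling, suggest a standard-error reduction factor of roughly:
expected_factor = math.sqrt(1400)   # about 37.4, close to the cited 38

# The observed factor of about 6 implies a much smaller effective
# sample size gain, consistent with clustering and design effects:
implied_effective_gain = 6 ** 2     # about a 36-fold effective gain, not 1,400
```

The gap between the two factors is itself informative: scanner quotes within an outlet and week are highly correlated, so each additional quote adds far less information than an independent draw would.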
Fourth, scanner data could expand geographic coverage for the CPI. Nielsen compiles scanner data from all states and regions (except for Alaska). Data from nonmetropolitan-area outlets are also available. In contrast, the CPI uses data from only 87 metropolitan areas.
Fifth, scanner data may allow more systematic data-cleaning procedures. Scanner data are more uniform and may be simpler to process for index use. However, data-cleaning rules used by ACNielsen or IRI are different from those at the BLS, particularly in how missing or erroneous prices are imputed. This would become an issue for any index that uses scanner data only for a subset of item categories, while traditional methods continue to be used for the remaining item categories.9
Finally, with scanner data, it will be possible to produce price averages (or unit valuations). Scanner data allow transaction prices to be averaged over the relevant period. Unlike BLS pricing methods, scanner datasets are typically produced using aggregated unit values—a quantity-weighted average price of an item. The simplest version is calculated as sales revenue divided by number of units sold. Unit values are used in most basic item indexes in the world; however, this is not the case with the CPI, since a weight is assigned to each price quote (Richardson, 2000).
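A minimal sketch of the unit value calculation, using hypothetical transactions:

```python
def unit_value(transactions):
    """Quantity-weighted average price: total revenue / total units sold."""
    revenue = sum(price * qty for price, qty in transactions)
    units = sum(qty for _, qty in transactions)
    return revenue / units

# Hypothetical week of scanner transactions for one UPC:
# full shelf price early in the week, a sale later on
week = [(3.00, 40), (2.00, 160)]
uv = unit_value(week)   # (120 + 320) / 200 = 2.20
```

A unit value index for an item then compares unit values across periods; note how the sale, which accounts for most units sold, dominates the average, whereas a single shelf-price observation would record either 3.00 or 2.00 depending on the day it was taken.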
This last issue, unit pricing, requires further discussion, since it is not transparent that it is conceptually superior to the current practice of pricing items on store shelves at a point in time. The main criticism of unit pricing is that it produces a price at which no single item may actually have been sold.10 On the other hand, the ScanData team argues that “the unit value index more accurately
reflects the preferences of the shopper who searches out the lowest prices each week, and also the consumer who stockpiles during a particularly good special, but then purchases nothing until the next special” (Richardson, 2000:12). In some instances, few consumers purchase at the shelf price that the BLS agent happens to observe. How many people buy Chicken-of-the-Sea tuna fish when the Bumblebee next to it is on sale for half price? Feenstra and Shapiro (2001) cite marketing literature indicating that there is substantial consumer substitution across weeks in response to price changes and advertising. Also, their own data on canned tuna show a high degree of price variation and substantial response of consumer demand to that variation (Feenstra and Shapiro, 2001). Using shelf prices assumes rigidity in consumer shopping behavior, since items in each week of pricing are treated independently and the elasticity of substitution among them is assumed to be zero (Richardson, 2000). Proponents of unit value pricing argue that it is better to treat purchases made in different weeks of a month as purchases of the same good in the context of consumers’ utility. It is certainly worth noting as well that, at some level, price averaging must take place to construct any price index.
Whatever the outcome of these specific questions, it is clear that scanner data allow researchers to look at all sorts of interesting things. They facilitate comparisons of series that combine price data in different ways, including alternative index formulas, such as short time-lag superlatives. The ScanData team, for instance, was able to compute several indexes contemporaneously (using a Paasche construction as the lower bound with which to test other indexes). Additionally, the sheer volume and detail of scanner data also facilitate hedonic analyses of quality change (such as Ioannidis and Silver, 1999). Even when scanner data are ultimately not used to construct an index, availability of the data can only advance the pace of research that leads to improvements in the index generally.
Early results for the ScanData cereal test indicate that introduction of scanner data may have a significant effect on index performance. For the February 1998 through June 2000 period, cereal inflation for the New York metropolitan statistical area, as measured by the CPI, rose from (a re-based) 100 to 101.1. The geomean scanner index completed the series at 104.9. This 3.8 percent difference may have been attributable to several factors. First, the universe of outlets for the two indexes was not identical; ScanData was missing data from a wholesale club. There was also a sharp decrease in the regular CPI for cereal in October 1999 that did not appear in the scanner data and is difficult to explain. Also, the Tornqvist index rose more rapidly than did the geomean, indicating that, at least for cereal in New York, elasticity of substitution is less than 1.0, as assumed under the geomean method (Richardson, 2000).
It is also important to assess the extent of practical advantages of scanner data that might add to the viability of its regular use. The ScanData experiment has produced favorable results in a number of areas showing, specifically, that:
Scanner indexes can be produced on the current CPI schedule. Regarding quote timing, CPI and scanner data cover similar periods within the month; scanner data have the advantage of covering weekends and holidays, which CPI data do not.
For many cases, scanner data cover the entire domain of products within any given item strata and area cell, which is important for methodological consistency.
The scanner indexes can be produced in a manner generally consistent with BLS sampling procedures.
The sample is rotated and can be refreshed at least as often as under current CPI practices.
Indexes work with both standard geomean and superlative formulas.
The cost implications of introducing scanner data and reducing field price observations have yet to be fully evaluated by BLS.
Limitations of Store-Based Scanner Data
Despite the numerous potential advantages described above, issues remain to be sorted out before BLS can proceed toward systematic integration of point-of-sale scanner data into the CPI; these issues relate to pricing, coverage (both geographic and item-specific), cost, integration of scanner data with other data sources, and reliance on private-sector data.
In addition to unit valuation (already discussed), pricing issues include treatment of taxes and comparability between private-sector scanner data and Census Bureau/BLS data. The CPI collects prices without sales taxes; then a calculated tax is applied separately using secondary data. Scanner data also do not include taxes. However, since ACNielsen does not disclose the exact location of outlets, it is not always clear what tax rate should be added to item prices. For the cereal experiment, it was not a problem since New York has no tax on most food items. However, in general, a solution to this problem needs to be found by vendors or BLS. One possibility would be to calculate a population-weighted average sales tax each month for each item based on the outlet usage patterns of consumers in each geostrata (Richardson, 2000).
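The suggested calculation might be sketched as follows; the population weights and tax rates are hypothetical:

```python
def weighted_sales_tax(geostrata):
    """Population-weighted average sales tax rate across geostrata.

    geostrata: list of (population_weight, tax_rate) pairs;
    weights need not sum to one."""
    total_weight = sum(w for w, _ in geostrata)
    return sum(w * rate for w, rate in geostrata) / total_weight

# Hypothetical geostrata within one scanner market
strata = [(0.6, 0.04), (0.3, 0.07), (0.1, 0.00)]
avg_rate = weighted_sales_tax(strata)   # 0.6*0.04 + 0.3*0.07 + 0.1*0.00 = 0.045
```

The resulting average rate would be applied to scanner item prices for the market each month, with the weights updated as outlet usage patterns change.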
Coverage issues include geographic definitions and saturation of scanner equipment. Geographic-area definitions for the CPI and for currently produced scanner datasets do not match. Scanner data markets are generally smaller than the census-defined metropolitan areas on which the CPI is based. ACNielsen is currently working to map most of the United States into CPI geographic areas, though when the project is complete, there will still be some gaps (e.g., ACNielsen does not cover Anchorage). Even for the covered areas, scanner price data are not available for all outlets at which items from any given CPI strata are sold. In the
cereal experiment, there were CPI quotes that were not included in the scanner universe (in this case, they were from mass merchandisers). Small mom-and-pop stores also frequently do not use scanner technology. Efforts are currently under way at ACNielsen to expand the depth of outlets covered in its datasets. Also, “migrating” quotes come into play when purchases are made across CPI areas. The POPS sample covers purchases in adjacent areas, but these patterns cannot be inferred from scanner data. In other words, the POPS covers purchases of consumers from a certain area while the scanner datasets cover purchases made by any household in a particular area, which is not the CPI objective. It may be possible to construct a scanner index as a weighted index from the areas in which consumers of a given area shop (Richardson, 2000), but this certainly adds complication back into the system.
Scanner data coverage is most broad based for items sold in supermarket outlets, while there is virtually no coverage in service sectors. Hawkes and Piotrowski (2000:1) of ACNielsen report that 43 of the 211 CPI item categories can “in large measure, be represented through scanning data obtained from Supermarkets, Mass Merchandisers, and Drug Stores.” These categories account for about 10 percent of all consumer expenditures and about 24.2 percent of expenditures for goods (excluding services such as rent). Item coverage constraints alone severely limit the impact that use of store-based scanners can have on the overall CPI.
In terms of cost, the budget tradeoff between purchasing data from private vendors and traditional price data collection must be evaluated, as BLS is in the process of doing. Another issue concerns integration of scanner-based subindexes (possibly superlative) with traditional sampling-based item indexes: What are the statistical and index performance ramifications when subindexes are compiled using different types of data?
Finally, BLS currently does not have to rely on private outside sources for fundamental pricing data. The ramifications on CPI production of changing this must be explored. For instance, ACNielsen and IRI buy their data from chains, and at times chains decide to no longer sell these data. This means that, while a given store has a positive probability of being in the traditional CPI sample, its probability of being in the scanner dataset is zero. Thus, problems of continuity with the scanner data universe could arise.
Household-Based Scanner Technology
Household scanner technology could be adopted in one of three ways: it could be used to improve the accuracy and coverage of the current household surveys, particularly the CEX; it could also be used in a combined CEX/POPS survey; or, more ambitiously, it could be the technical centerpiece of a household-based panel survey that produces both expenditure share and price information that would be used to produce household or subgroup indexes. Any plan to
augment the CEX would require members of sampled households to use handheld scanners to report UPC items and quantities. These data could be enhanced by having the household members key-enter prices as well. BLS could develop scannable menu codes for non-UPC items, which the sampled household could then use to help enter quantities and prices. In addition to this information, household members could be asked to report the store name and address associated with each purchase. One might bypass some of the household recording of prices if the reported store is one from which prices for UPC items can be obtained directly.
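One way to picture the data such a survey would generate is as a purchase record of the following form; the field names and the sample UPC are purely illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HouseholdPurchaseRecord:
    """One scanned purchase in a hypothetical household diary.

    upc is None for non-UPC items, which would instead carry a
    BLS-assigned menu code scanned from a supplemental sheet."""
    household_id: str
    upc: Optional[str]
    menu_code: Optional[str]
    quantity: float
    price: Optional[float]   # key-entered, or filled in later from the outlet
    store_name: str
    store_address: str

rec = HouseholdPurchaseRecord(
    household_id="HH-0001",
    upc="036000291452",      # illustrative UPC
    menu_code=None,
    quantity=2,
    price=3.49,
    store_name="Example Grocer",
    store_address="123 Main St",
)
```

Because each record carries both the household identifier and the outlet, such data could in principle support both subgroup expenditure weights and the outlet-usage information now collected separately by the POPS.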
Potential to Enhance Accuracy of the CEX and POPS
Even before considering price issues, household scanner devices could increase the quality of the current surveys by improving the accuracy of households’ documentation of purchases. The technology could produce more accurate and detailed weighting from the item strata down to sub-ELI levels, and it could help reduce errors associated with improper identification of products and prices, as well as biases from recall error and incomplete information (about location of purchases, for instance).
The technology creates greater breadth and depth of information by tracking product and buyer characteristics and offering more uniform geographic coverage (rural areas, all age groups, etc.), thereby expanding the potential to develop subgroup indexes. It could capture purchases made at outlets, and even in whole sectors, that do not use point-of-sale scanning, provided supplemental code sheets could be developed for respondents to scan. Lastly, household scanner technology may reduce respondent burden.
The possibility also exists that new types of errors (e.g., keying errors) could be introduced; this would have to be examined in pilot projects. Pilot projects would also be important for determining whether introduction of this technology into the survey affects the demographic composition of the sample (e.g., biasing it away from inclusion of the elderly).
Scanner Technology as a Tool for Moving the CPI Toward a COLI
Independent of whether or not the CPI should be based on a COLI framework, scanner data may be used to help overcome a few of the obstacles that now preclude calculation of anything like a COLI. By providing simultaneous information on prices and quantities, scanner data may reduce the lag in the production of superlative indexes and also enable Paasche indexes to be produced. Under current practices, price and quantity data are produced from different samples and at different frequencies.
Much caution is in order here, though. Feenstra and Shapiro (2001:21) found, in their construction of superlatives using scanner data for tuna fish that “the
calculation of conventional price indexes. . . shows substantial pitfalls of mechanically applying price indexes to such data.” The superlative index is intended to capture reductions in the cost of living as consumers substitute goods that have decreased in price for those that have increased. However, the superlative index calculated by the authors fails to produce this result (the superlative index grew faster). Feenstra and Shapiro (2001:22) concluded:
The consumer behavior that generates these data cannot correspond to the static utility maximization that provides the foundation for superlative index numbers. Our tabulations suggest that the index numbers do not properly account for consumer behavior in response to sales. In particular, the chained Tornqvist gives too much weight to price increases that follow the end of sales.
The authors go on to explain that their findings reflect purchases made for storage rather than immediate consumption. In other words, purchases and consumption do not track in a parallel fashion, particularly for items that can be stored. As such, the consumer does not face as much of an increase in price (after sales) as the raw data imply. In addition, advertising contributes to the breakdown of the law of demand that is assumed under the superlative index approach: “If advertisements cause consumers to purchase a [larger] quantity than would be consistent with static maximization of a time-invariant utility function, superlative index numbers will not accurately measure the cost of living” (Feenstra and Shapiro, 2001:22). On the basis of their findings, they conclude that unit values might provide a good approximation for construction of a COLI but should be adjusted to reflect consumption rather than purchases and to account for storage costs.11
Many of the general advantages of scanner data noted above may also help to address other CPI biases. For instance, scanner data allow for quicker and more accurate identification of both new goods and item attrition (and, as such, could have the capability to reduce new goods bias), as well as of outlet substitution patterns. Furthermore, scanner technology generates more detailed data for hedonic regression and other quality adjustment methods (although quality change bias is probably less of an issue for food items—the potential may be greater in areas such as consumer electronics) and also produces empirical evidence that may allow researchers to estimate the impact of quantity (and other types of) discount pricing on index growth.
SUMMARY AND RECOMMENDATIONS
Without the benefit of extensive research on each of the areas raised in this chapter, the panel cannot make many definitive recommendations with respect to the data inputs to the CPI. We recognize that BLS has undertaken research projects in these areas, and inclusion of these topics in our discussion should not be taken as an indication that the agency has been negligent in its research efforts. It merely means that the panel recognizes the importance of these areas of research and hopes that they will continue systematically and thoroughly.
Research into the accuracy and sample size of the CEX should be a high priority among topic areas relating to the data collection process for the CPI. The panel concluded that it is likely that CEX estimates of consumer expenditure shares are biased, perhaps seriously. There is no obvious benefit to increasing the survey sample size if nonsampling error dominates sampling error—one would simply be achieving more precise estimates of the wrong thing.
Recommendation 9-1: Before additional resources are directed toward increasing its sample size (beyond the current plan), the accuracy of the CEX should be carefully evaluated. Assessing the net advantages of using the BEA’s PCE to produce the upper-level weights for the national CPI should be part of this evaluation.
At the very least, research by BLS (and BEA) into the sources of divergence between PCE- and CEX-derived expenditure weights needs to be extended so that these differences can be more fully understood. Even if the current system is ultimately maintained, the effort will produce additional guidance about how the CEX might be improved.
Recommendation 9-2: If categories can be reasonably well matched between the CPI and PCE, so that comparable item strata indexes can be created, a program should be set up to produce an experimental CPI that uses PCE-generated weights at the upper (218 item) level but that is otherwise no different from the CPI.
If full item-by-item mapping turns out to be too problematic, it might still be possible to use PCE estimates for major item categories where the PCE and CEX have comparable coverage. For such categories, estimated totals from the CEX could be forced to equal the PCE estimates, which might allow the PCE to correct for undercoverage in the CEX in much the way that demographic projections are used to correct for undercoverage in household surveys such as the Current Population Survey. The distribution among lower-level aggregations would be determined by the CEX distribution. Investigating how well such experimental indexes perform seems especially sensible given the high cost of revamping the CEX survey or increasing its sample size. We would very much like to see a
thorough defense of the choice of CEX-generated upper-level weights, relative to the alternatives, for the national-level CPI.
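The ratio-adjustment idea can be sketched simply; the category and the figures below are hypothetical:

```python
def ratio_adjust(cex_lower_level, pce_category_total):
    """Scale CEX lower-level expenditures so they sum to the PCE
    control total for the category, preserving the CEX distribution
    among lower-level aggregations."""
    cex_total = sum(cex_lower_level.values())
    factor = pce_category_total / cex_total
    return {item: amount * factor for item, amount in cex_lower_level.items()}

# Hypothetical category where the CEX captures only 80% of PCE spending
cex = {"item strata A": 40.0, "item strata B": 40.0}
adjusted = ratio_adjust(cex, pce_category_total=100.0)
# each item is scaled by 1.25; shares within the category are unchanged
```

This is the same mechanism as the demographic ratio adjustments applied to household surveys: the control total corrects the level, while the survey continues to supply the distribution.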
The CPI data collection process would also benefit from research in several other key areas discussed in this chapter:
Frequency of the CEX: a combined theoretical-empirical study of the impact on the CPI of the frequency of updating weights.
Sample size for the CEX: a combined theoretical-empirical study of the effect of CEX sample size on the variance of the CPI and on any subindexes that are desired.
CEX and POPS survey design: a comprehensive reexamination of the design of each of these surveys.
Integration of CEX and POPS: a study of the feasibility and requirements of a single survey, used only for the CPI, that encompasses both the CEX and POPS.
Recognizing that scanner technology has the potential to improve the entire process of data collection for the CPI computation, the panel also identified the following key study areas:
Point-of-sale scanner data and item selection: continuation of research on how these data can be used both to select items for pricing and to replace the C&S Survey and a quantification of the improvement in the CPI based on their use.12
Point-of-sale data and outlet selection: initiation of research on how to use store sales information based on scanning to determine the stores to be sampled in the C&S.
Household scanner data: initiation of research on the use of handheld scanners to record UPC items and quantities along with key-entering prices and/ or store names and addresses.
Integration of UPCs into BLS ELI framework: development of a concordance between UPCs and the ELIs.
Integration of non-UPCs into BLS ELI framework: development of BLS assignments of UPCs for items which otherwise do not have UPCs for use in household handheld scanning.
Experimental development of subgroup indexes: performance of the household-based price data experiment, likely involving household scanner technology, to produce subgroup indexes that capture variation in both expenditure weights and prices paid.
TECHNICAL NOTE: ADDITIONAL DESCRIPTION OF CPI DATA INPUTS
The Modified Laspeyres CPI
The “Technical Note” to Chapter 2 sets forth the mathematical derivations underlying the development of the recommendations in this report. Equation (1) of that section sets forth the Laspeyres price index P_L^t, namely

P_L^t = \frac{\sum_{n=1}^{N} p_n^t q_n^0}{\sum_{n=1}^{N} p_n^0 q_n^0},

relating base period quantities q_n^0, base period prices p_n^0, and current period prices p_n^t for each of N goods (where the superscripts 0 and t refer to the base and current periods). The actual BLS-reported CPI differs from this in a few respects. First, the index is reported relative to a period in which it was set equal to 100. This reference period has, since 1987, been 1982-1984; prior to that, it was 1967.
Second, and of more critical importance, the above equation is based on the assumption that both prices and quantities are collected simultaneously in the base period, but this is not the case for the BLS-reported CPI. For the CPI, the base period quantities are based on data from a household expenditure survey, while the base period prices are based on data from the monthly pricing surveys. Since the quantity data take longer to compile than do the price data, what is instead calculated is a “modified Laspeyres index,” namely

P_{ML}^t = \frac{\sum_{n=1}^{N} p_n^t q_n^0}{\sum_{n=1}^{N} p_n^a q_n^0},

where, as before, n indexes the N goods and the superscript t denotes the current period, but where the superscript 0 refers to the quantity-base period (sometimes called the expenditure-base period) and the superscript a refers to the price-reference period. Since January 1998, the quantity-base period has been 1993-1995; prior to that it was 1982-1984.13 It is planned that, as of January 2002, the quantity-base period will be 1999-2000, and that it will be updated at 2-year intervals thereafter using information from the Consumer Expenditure Survey (CEX) ending 2 years prior to the update.
Finally, the q_n^0 are themselves not directly observed in the household expenditure survey. Rather, the survey provides quantity-base period expenditures e_n^0 for item n, and the quantities are calculated by dividing e_n^0 by p_n^0, where the quantity-base period prices p_n^0 are obtained from the monthly pricing survey.
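As a concrete illustration, this arithmetic can be sketched in a few lines of Python; the item names and all numbers below are hypothetical, chosen only to show the mechanics of deriving quantities from expenditures and forming the modified Laspeyres index.

```python
# Hypothetical inputs: expenditures from the expenditure survey,
# prices from the pricing surveys.
expend_0 = {"milk": 120.0, "bread": 80.0}   # e_n^0, quantity-base-period expenditures
price_0  = {"milk": 3.00, "bread": 2.00}    # p_n^0, quantity-base-period prices
price_a  = {"milk": 3.30, "bread": 2.10}    # p_n^a, price-reference-period prices
price_t  = {"milk": 3.60, "bread": 2.40}    # p_n^t, current-period prices

# q_n^0 = e_n^0 / p_n^0: quantities are implied, not observed directly.
qty_0 = {n: expend_0[n] / price_0[n] for n in expend_0}

def modified_laspeyres(p_t, p_a, q_0):
    """Current-period cost of the fixed basket over its price-reference-period cost."""
    return sum(p_t[n] * q_0[n] for n in q_0) / sum(p_a[n] * q_0[n] for n in q_0)

print(round(modified_laspeyres(price_t, price_a, qty_0), 4))  # → 1.1111
```

The same two-survey structure is visible in the code: quantities never appear as raw data, only as ratios of expenditures to prices.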
The CPI can be expressed as a multiple of a Laspeyres index and the reciprocal of a modified Laspeyres index based on the quantity-base period and price-reference period, namely

P_{ML}^t = \frac{\sum_n p_n^t q_n^0}{\sum_n p_n^a q_n^0} = P_L^t \times \frac{1}{P_L^{a,0}}, \qquad P_L^{a,0} = \frac{\sum_n p_n^a q_n^0}{\sum_n p_n^0 q_n^0}.

As seen above, P_L^{a,0} is a constant that relates the quantity-base period to the price-reference period. The critical element of the index is indeed P_L^t, which can be rewritten as

P_L^t = P_L^{t-1} \times \sum_{n=1}^{N} \left( \frac{p_n^{t-1} q_n^0}{\sum_{m=1}^{N} p_m^{t-1} q_m^0} \right) \frac{p_n^t}{p_n^{t-1}}.
This index can be characterized as a “chained” index, where the previous period’s index P_L^{t-1} is multiplied by a dollar-weighted average of price relatives, with the dollar expenditure weights being those of the quantity-base period quantities priced at the previous period’s prices and the price relative taken with respect to the price in the previous period. One should note that what is reported monthly by BLS is the period-to-period index, namely P_L^t / P_L^{t-1}.
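The equivalence between the fixed-basket form and the chained form can be checked numerically. The sketch below uses hypothetical prices for two goods over four periods; the chained product of one-period updates telescopes to the direct Laspeyres index.

```python
import math

q0 = [10.0, 5.0]            # base-period quantities (hypothetical)
prices = [                  # prices for periods 0..3, two goods (hypothetical)
    [2.00, 4.00],
    [2.20, 4.10],
    [2.50, 4.00],
    [2.60, 4.40],
]

def laspeyres_direct(t):
    """Fixed-basket index: cost of basket at period t over cost at period 0."""
    return (sum(p * q for p, q in zip(prices[t], q0)) /
            sum(p * q for p, q in zip(prices[0], q0)))

# Chained form: multiply the previous index by a dollar-weighted average of
# price relatives, weights being q_n^0 priced at the previous period's prices.
chained = 1.0
for t in range(1, len(prices)):
    cost_prev = sum(p * q for p, q in zip(prices[t - 1], q0))
    update = sum((prices[t - 1][n] * q0[n] / cost_prev) *
                 (prices[t][n] / prices[t - 1][n])
                 for n in range(len(q0)))
    chained *= update
    assert math.isclose(chained, laspeyres_direct(t))

print("chained and direct Laspeyres agree")
```

The assertion inside the loop confirms at every period that the chained update reproduces the fixed-basket index exactly (up to floating-point tolerance).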
Elements of the Index and Subindexes
The “goods” used in the CPI are organized into expenditure classes (ECs); as of 1999, there were 68 ECs. These in turn are subdivided into item strata; as of 1998, there were 218 strata. Finally, the item strata are subdivided into entry-level items (ELIs); as of 2000, there were 282 ELIs.14 The following is an example of this hierarchy of goods (Bureau of Labor Statistics, 1997a):
Expenditure Class 24: Maintenance and repair commodities
Item stratum 2401: Materials, supplies, equipment for home repairs
24011: Paint, wallpaper, and supplies
24012: Tools and equipment for painting
24013: Lumber, paneling, wall and ceiling tile; awnings, glass
24014: Blacktop and masonry materials
24015: Plumbing supplies and equipment
24016: Electrical supplies, heating and cooling equipment.
Item stratum 2404: Other property maintenance commodities
24041: Miscellaneous supplies and equipment
24042: Hard surface floor covering
24043: Landscaping items
Subsequently, the ECs and their components were redesignated; in the 26 March 1999 list of ELIs, EC24 was restructured as:
Expenditure Class HM: Tools, hardware, outdoor equipment and supplies
Item stratum HM01: Tools, hardware, and supplies
HM011: Paint, wallpaper tools, and supplies
HM012: Power tools
HM013: Miscellaneous hardware, supplies, and equipment
HM014: Nonpowered hand tools
Item stratum HM02: Outdoor equipment and supplies
HM021: Powered lawn and garden equipment and other outdoor items
HM022: Lawn and garden supplies and insecticides
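For readers who think in data structures, the post-1999 hierarchy above can be held as a nested mapping. The codes and labels are taken directly from the listing; the representation itself is illustrative, not a BLS file format.

```python
# EC -> item stratum -> ELI hierarchy, using the restructured EC "HM" above.
hierarchy = {
    "HM": {
        "label": "Tools, hardware, outdoor equipment and supplies",
        "strata": {
            "HM01": {
                "label": "Tools, hardware, and supplies",
                "elis": ["HM011", "HM012", "HM013", "HM014"],
            },
            "HM02": {
                "label": "Outdoor equipment and supplies",
                "elis": ["HM021", "HM022"],
            },
        },
    },
}

# Count the ELIs under an expenditure class.
n_elis = sum(len(s["elis"]) for s in hierarchy["HM"]["strata"].values())
print(n_elis)  # → 6
```

A concordance between UPCs and ELIs, as recommended earlier in the chapter, would amount to a mapping from product codes into the leaves of exactly this kind of tree.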
The data used in the CPI are collected in 87 primary sampling units (PSUs; see Williams, 1996). The data are aggregated into 54 basic areas—34 self-representing areas (e.g., Kansas City, MO-KS) and 20 region- and population-size
cross classifications (e.g., Midwest Size A).15 The basic areas and item strata combine to form (218 × 54) = 11,772 basic CPI strata. Note that each of these basic CPI strata may comprise more than one ELI and more than one PSU.
Let h index the basic areas (h = 1, . . ., 54) and z index the item strata (z = 1, . . ., 218). Until January 1999, BLS calculated R^t_{hz}—an estimate of the relative price change in basic area h, item stratum z, from period t - 1 to period t—using the formula

R^t_{hz} = \frac{\sum_i w_{hi}\, p^t_{hi}}{\sum_i w_{hi}\, p^{t-1}_{hi}}

when the samples of items within the item strata are selected with each unit having a probability proportional to quantity, or the formula

R^t_{hz} = \frac{\sum_i w_{hi}\, (p^t_{hi}/p^b_{hi})}{\sum_i w_{hi}\, (p^{t-1}_{hi}/p^b_{hi})}

when the samples of items within the item strata are selected with each unit having a probability proportional to expenditure, where p^b_{hi} denotes the price in the expenditure-base period. In both forms the weights w_{hi} reflect the probability that item i in item stratum z is selected to be priced in basic area h—in the first of these the weights w_{hi} are essentially q_{hi}/\pi_{hi}, and in the second they are essentially e_{hi}/\pi_{hi}, where \pi_{hi} is the probability that item i in item stratum z is selected to be priced in basic area h. Since January 1999, BLS has replaced this computation for most indexes (the housing index being the most notable exception) with a weighted geometric mean, namely

R^t_{hz} = \prod_i \left( \frac{p^t_{hi}}{p^{t-1}_{hi}} \right)^{w_{hi}}, \qquad \sum_i w_{hi} = 1.
When one can obtain prices in basic area h for the universe of items in item stratum z, for both time periods t - 1 and t, then R^t_{hz} is given by the weighted average

R^t_{hz} = \sum_i w_{hi}\, \frac{p^t_{hi}}{p^{t-1}_{hi}}

or, if the geometric mean computation is used, is given by

R^t_{hz} = \prod_i \left( \frac{p^t_{hi}}{p^{t-1}_{hi}} \right)^{w_{hi}},

where w_{hi} is the ratio of the expenditure in basic area h on item i of item stratum z to the expenditure in basic area h on all items of item stratum z. Since a census of the prices for the universe of items in item stratum z is impractical, BLS
estimates the R^t_{hz}. An oversimplified version of the BLS procedure is the following: Let a sample of N items be drawn from the universe of items in item stratum z (with replacement), with the probability of selection of item i equal to w_{hi}. Then

\hat{R}^t_{hz} = \frac{1}{N} \sum_{i=1}^{N} \frac{p^t_{hi}}{p^{t-1}_{hi}}

is an unbiased estimate of the weighted average version of R^t_{hz}, and

\hat{R}^t_{hz} = \left( \prod_{i=1}^{N} \frac{p^t_{hi}}{p^{t-1}_{hi}} \right)^{1/N}

is a consistent estimator of the geometric mean version of R^t_{hz}.
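The sampling logic can be illustrated with a small Monte Carlo sketch. The four-item universe, its prices, and its expenditure shares are all hypothetical; the point is only that drawing items with replacement with probability w_{hi} and averaging price relatives (arithmetically or in logs) recovers the universe values.

```python
import math
import random

random.seed(0)
p_prev = [2.0, 5.0, 1.0, 8.0]   # hypothetical prices at t-1 for a 4-item universe
p_curr = [2.2, 5.1, 1.1, 8.0]   # hypothetical prices at t
w      = [0.4, 0.3, 0.2, 0.1]   # expenditure shares, summing to 1

# Universe targets: weighted arithmetic and geometric means of price relatives.
target_arith = sum(wi * pc / pp for wi, pc, pp in zip(w, p_curr, p_prev))
target_geo   = math.prod((pc / pp) ** wi
                         for wi, pc, pp in zip(w, p_curr, p_prev))

# Draw N items with replacement, selection probability w_hi.
N = 200_000
sample = random.choices(range(4), weights=w, k=N)
rels = [p_curr[i] / p_prev[i] for i in sample]

est_arith = sum(rels) / N                               # unbiased for target_arith
est_geo = math.exp(sum(math.log(r) for r in rels) / N)  # consistent for target_geo

assert abs(est_arith - target_arith) < 1e-3
assert abs(est_geo - target_geo) < 1e-3
print("sample estimates close to universe values")
```

The geometric estimator is consistent rather than unbiased: it is the exponential of a sample mean of logs, and the law of large numbers applies to the logs.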
The BLS then updates its index for this basic stratum by the chaining formula described earlier, namely

I^t_{hz} = I^{t-1}_{hz} \times \hat{R}^t_{hz}.
These indexes are aggregated to form indexes for aggregate areas (e.g., U.S. cities), aggregate items (e.g., expenditure classes), or both. Let H denote the aggregate area and Z the aggregate item for which an index is to be formed. The index for this aggregate area and item is calculated as

I^t_{HZ} = \frac{\sum_{h \in H} \sum_{j \in Z} e_{hj}\, I^t_{hj}}{\sum_{h \in H} \sum_{j \in Z} e_{hj}},

where j ∈ Z indexes the item strata contained in the aggregate item Z and e_{hj} denotes the quantity-base-period expenditure in basic area h on item stratum j.
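A minimal sketch of this aggregation step follows, assuming expenditure-share weighting over basic strata. The cell names, weights, and index values are hypothetical, and the weighting shown is one standard choice rather than the exact BLS recipe.

```python
# (basic area h, item stratum z) -> (expenditure weight e_hz, index I_hz^t).
# Area and stratum labels reuse examples from the text; numbers are invented.
basic = {
    ("Philadelphia", "HM01"):   (150.0, 104.2),
    ("Philadelphia", "HM02"):   (90.0, 101.7),
    ("Midwest Size A", "HM01"): (220.0, 103.1),
    ("Midwest Size A", "HM02"): (130.0, 100.9),
}

def aggregate(cells):
    """Expenditure-weighted average of basic-stratum indexes."""
    total_w = sum(w for w, _ in cells.values())
    return sum(w * idx for w, idx in cells.values()) / total_w

print(round(aggregate(basic), 2))  # → 102.68
```

Restricting the dictionary to one area (or one stratum) before calling `aggregate` yields area-only or item-only subindexes from the same basic building blocks.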
Consumer Expenditure Survey
The CEX, sponsored by BLS and conducted by the Bureau of the Census, is a national probability sample of household units. It consists of two parts, a Quarterly Interview Panel Survey and a Diary Survey. Each “consumer unit” in a household selected for the Quarterly Interview Panel Survey is interviewed for 5 consecutive quarters about relatively large expenditure items (e.g., major appliances) and expenditures that occur at regular intervals (e.g., utility bills). A sample of 8,910 addresses is contacted for the Quarterly Interview Panel Survey in each of the calendar quarters, and the number of completed interviews per quarter is targeted at 6,160. Each consumer unit selected for the Diary Survey completes a diary of expenditure information on frequently purchased items and relatively small expenditure items for 2 consecutive weeks. A sample of 8,020 addresses is contacted each year to participate in the Diary Survey, so the effective annual sample size participating in this survey is 5,870 households, spaced across the 52 weeks of the year. The CEX has many uses in the governmental statistical framework. Its primary use in the CPI computation is to construct the quantities q^0_{hz} that underlie the CPI computation. It has also been used “to select new market baskets of goods and services for the index, to determine the relative importance of components, and to derive new cost weights for the baskets” (U.S. Department of Labor, 2000).
Point of Purchase Survey and Commodities and Services Survey
The goal of the Point of Purchase Survey (POPS) is to determine the outlets from which the prices used in the CPI computation are collected. The first stage of this survey is a national probability sample of household units, conducted by the Census Bureau, whose primary aim is to define the outlets to be sampled to obtain price data. The survey began in 1978 as a personal interview (and was referred to as CPOPS, for Continuing Point of Purchase Survey). In 1999 BLS converted this survey to a telephone interview, referred to as TPOPS (for Telephone Point of Purchase Survey). CPOPS was conducted annually over a period of 4 to 6 weeks, usually beginning in April; TPOPS interviews households every quarter. In CPOPS approximately one-fifth of the PSUs were sampled each year; the goal in TPOPS is to increase this sampling rate so that one-fourth of the PSUs are sampled each year. All consumer units in the selected households are asked to recall whether or not they purchased categories of goods and services within a specified recall period (varying from 1 week to 5 years, depending on the purchase cycle of the category) and, if so, the expenditure amounts and the names and locations of all places of purchase. Based on the responses to this survey of household units, a frame of outlets is defined for outlet selection. Since approximately one-fourth of the PSUs are currently sampled each year, the frame of outlets determined by the survey remains unchanged for 4 years after the survey of household units.
The commodities and services are grouped into POPS categories, consisting of combinations of some of the ELIs; there were 174 POPS categories in 1997 (Bureau of Labor Statistics, 1997a). For example, POPS category 127, materials and supplies for major home repairs, consists of two of the ELIs of item stratum 2401, ELIs 24013 and 24014. POPS category 129, hardware items, hand tools, and other materials for minor home repairs, contains the other four ELIs of item stratum 2401—24011, 24012, 24015, and 24016; it also contains ELI 24041, miscellaneous supplies and equipment; ELI 32043, other hardware; and ELI 32044, nonpowered hand tools.
For the purpose of outlet selection, the BLS has aggregated the POPS categories into eight categories and the PSUs into ten groups (see Bureau of Labor Statistics, 1997a). After a PSU group has been surveyed, the ELIs to be priced
for the C&S Survey are selected with a systematic sampling procedure, with probability of selection proportional to the amount of expenditure in that PSU group and its item stratum. This systematic sampling procedure guarantees that over the 4-year period each of the ELIs will be selected for pricing. The outlets actually sampled from each frame are selected independently for each PSU group and POPS category, with probability of selection proportional to the amount of expenditure in that PSU group and POPS category. To give readers a sense of scale, the largest number of outlets selected is nine, for the POPS foods and beverages category in the Philadelphia PSU group.
At a selected outlet a BLS field representative uses a multistage probability selection procedure to select the specific item to be priced from among those the outlet sells that fall within the designated-for-pricing ELI definition. The probability of selection is, if the information is available, proportional to the sales of the items in the ELI groups. Otherwise it is based either on the proportion of shelf space or, as a last resort, on equal probability for each item. Once the item is selected, its price is recorded. These are the prices that are weighted and used in the computation of the R^t_{hz} that enter the CPI computation.
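The fallback logic for within-outlet item selection can be sketched as follows. The item names and sales figures are hypothetical, and the three-way fallback (sales, then shelf space, then equal probability) is the only part taken from the description above.

```python
import random

random.seed(1)

def select_item(items, sales=None, shelf_space=None):
    """Pick one item from an ELI's offerings at an outlet.

    Preference order for selection probabilities:
    1. proportional to item sales, if available;
    2. else proportional to shelf space;
    3. else equal probability (last resort).
    """
    if sales is not None:
        weights = [sales[i] for i in items]
    elif shelf_space is not None:
        weights = [shelf_space[i] for i in items]
    else:
        weights = [1.0] * len(items)
    return random.choices(items, weights=weights, k=1)[0]

# Hypothetical offerings within a designated-for-pricing ELI at one outlet.
items = ["latex paint A", "latex paint B", "primer C"]
picked = select_item(items, sales={"latex paint A": 500.0,
                                   "latex paint B": 300.0,
                                   "primer C": 200.0})
print(picked in items)  # → True
```

In practice each fallback tier degrades the match between selection probability and true expenditure share, which is why sales data are preferred when the outlet can supply them.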