Read "At What Price?: Conceptualizing and Measuring Cost-of-Living and Price Indexes" at NAP.edu


Suggested Citation: "9 Data Collection for CPI Construction." National Research Council. 2002. At What Price?: Conceptualizing and Measuring Cost-of-Living and Price Indexes. Washington, DC: The National Academies Press. doi: 10.17226/10131.



9 Data Collection for CPI Construction

The data used to calculate most Consumer Price Index (CPI) subindexes originate from three different, though interrelated, sample-based sources: the Consumer Expenditure Survey (CEX), the Point of Purchase Survey (POPS), and the Commodities and Services (C&S) Survey.¹ These surveys have evolved through time, and one of them, the CEX, has multiple uses in the government statistical system. Moreover, the data collection structure itself influences what indexes can and cannot be produced. For example, the current data system does not allow for production of non-urban-area indexes or regional price-level comparisons; nor does it support accurate price indexes for subpopulations such as the elderly, minorities, or the poor, particularly at subnational levels. Also, in order to reduce respondent burden, households are only asked about a portion of CPI item categories, which also inhibits the construction of some potentially useful, alternatively weighted (e.g., democratic) indexes.

There are two distinct approaches one can take when considering how the data underlying CPI computation might be upgraded. One is to assume that the basic data collection structure will remain as it is and then to seek ways of improving each of the survey components. Another is to redesign the entire data collection structure so that it reflects advances in data collection technology and so that the data collected are more consonant with the ultimate computation of the CPI. This second option would require a transition plan that takes the data system from where it is today to where we aspire it to be. In this chapter, we discuss these two options.

¹ In addition, the CPI housing survey is used to calculate changes in rent of primary residence and owners’ equivalent rent (the two largest components of shelter).

THE CURRENT DATA COLLECTION PROCESS

The Consumer Expenditure Survey

The CEX is the primary tool for establishing CPI weights at the basic (218) item level. It is the most comprehensive source of combined household income and expenditure data produced by the statistical system; it is also very expensive to conduct. Nonetheless, a growing consensus is emerging among policy researchers that improvements should be made to the CEX. Probably the most frequently voiced criticism has been that the sample size is too small for the survey to be used for the range of applications to which it is currently put. However, another shortcoming of equal or greater importance, at least in the context of CPI construction, stems from nonsampling-related inaccuracies, such as survey response bias. Suggestions for improving the survey’s questionnaire design and substantive scope can be found in the research literature; in this section we review recent recommendations for upgrading the CEX, after first posing several questions that must be answered before a fully informed decision to change the survey can be made.

Accuracy

The panel’s foremost concern is with the extent of bias in the CEX which, in turn, affects the accuracy of CPI expenditure category weights. A starting point for evaluating household expenditure allocations estimated by the CEX is to compare them against weights generated by other sources. The Bureau of Economic Analysis (BEA) produces the most obvious alternative, the per-capita personal consumption expenditures (PCE) data, as part of the national income and product accounts (NIPA).
During its postsurvey evaluation program, designed to identify areas in which the CEX could be improved, the Bureau of Labor Statistics (BLS) does compare the expenditure pattern of the CEX with that shown in the PCE component of the NIPA (Branch, 1994). Such comparisons might, depending on the outcome, raise a second question: Why not use, for the national CPI, upper-level weights derived from aggregate-level data, such as the PCE?²

² We note that a price index is already constructed using the PCE: the chain price index for personal consumption expenditures, or PCEPI. See Clark (1999) for a description of the differences between the PCEPI and the CPI in terms of index formula, scope of goods and services covered, underlying price information, and index performance.

In considering such an option, one must (1) judge whether or not the PCE weights are really superior for this application and (2) if they are, determine whether the CEX would still be needed for the CPI program. The answer to the second question depends in part on the value placed on area and group indexes, which could not be constructed using NIPA data. Budgetary considerations aside, there is no inherent reason why BLS could not produce a flagship CPI using NIPA-based upper-level weights, while producing other indexes based on the CEX.

Let us first address the accuracy issue. Branch (1994) provides a comparison of CEX and PCE expenditure categories for the period 1992-1995. The comparison is limited to the universe of categories that are comparably defined; this leaves out two major ones: owner-occupied housing and health care. For a few categories, such as rental rates, utilities, and vehicle purchases, the correspondence ratio (CEX weight divided by PCE weight) is near 1.00, which is what one would generally hope for. For equivalent rent of owner-occupied dwellings, the CEX expenditure weight is much larger than shown in the PCE data, by a ratio of almost 2 to 1. For all other categories, though, the CEX expenditure weight is much smaller, and many are in the 0.4-0.6 range. This discrepancy calls into question the accuracy of the CEX weights. There is also a problem (documented in Triplett, 1997) that the total expenditure of households implied by CEX and PCE weights is drifting further apart, perhaps by as much as 1 percent a year. One should not jump to the conclusion that these differentials imply an accurate PCE and an inaccurate CEX, but the wide discrepancies clearly warrant further investigation since both sets of expenditure weights cannot be correct.

What is known about the relative strengths of the PCE and the CEX data?
For certain types of expenditure categories, well-documented sources of household response error damage the credibility of CEX weights. Triplett (1997:15) states that “reporting biases are known to be serious in some consumer expenditure components.” For instance, households may underreport “vice” products such as alcohol or tobacco; for 1995, the ratio of CEX to PCE expenditure shares on alcoholic beverages was a dismal 0.34. In addition, survey respondents often fail to accurately recall the volume or timing of some frequently purchased items: for example, “other entertainment” has a correspondence ratio of 0.37, and “miscellaneous” has a ratio of 0.24. For a number of other categories, such as furniture or appliances, it is less clear why expenditure weights differ as sharply as they do between the PCE and CEX. (The ratio of CEX to PCE weights for the “household furnishings and equipment” expenditure category was 0.65-0.66 for the 1992-1995 period.) Here the problem may involve the PCE as much as the CEX. Businesses and governments buy furniture and appliances, but do not necessarily report (or categorize) these purchases in their accounting systems in a consistent way that allows them to be accurately identified and reported. The fact that other items (e.g., books, televisions, sound equipment) that are purchased broadly by both households and businesses, and for which it makes no obvious sense for households to underreport, show large weight differentials might support this notion (Triplett, 1997). For other components, such as rent or auto purchases, for which reporting rates are known to be high, it is encouraging that the ratios of CEX to PCE weights are close to 1.

There are ways in which the PCE data system appears more developed. The PCE has the advantage that it is based on large surveys of businesses (the most prominent being the Census Bureau’s Retail Trade Surveys) that generally keep careful records and that rely less on respondent memory than does the CEX. Triplett (1997:16) does note a “birth bias” in the establishment surveys that arises because there is no mechanism for bringing new businesses into the sampling frame quickly. However, data from the censuses of manufacturers, retail trade, and service industries allow PCE component weights to be revised periodically and benchmarked every 5 years, which surely corrects some reporting and other biases. The benchmarking resets the allocation of purchases by commodity among business, government, and households and updates commodity lists. Furthermore, the BEA methodology for keeping track of inputs and outputs includes cross-checks that impose consistency on the data.

A major advantage of the CEX weights is that they are derived directly from reported household expenditures. One benefit of this direct reporting is that it allows household characteristics to be linked to expenditure information and, in turn, subpopulation indexes such as the CPI-E and CPI-W to be calculated. To produce the PCE weights, business and government spending must be subtracted out of sales data. Thus, the PCE is an indirect measure, calculated residually as final goods and services minus purchases made by nonconsumer sectors.
Triplett (1997:16) notes that it is especially difficult to calculate consumption shares at more refined item levels because sales to consumers are not always distinguishable from sales to businesses and government: “The finer the level of detail, the more likely that the long chain of computations necessary to reach the PCE’s indirect estimate of consumer spending will have cumulative errors that affect the totals.” Even so, it seems implausible that estimates of business purchases of consumer goods could be off by enough to generate the kind of ratios between NIPA and CEX weights that are now produced.

Difficulties associated with separating business from consumer purchases are compounded by the fact that the PCE covers a wider scope of goods and services than does the CEX. For instance, PCE coverage includes elements of government consumption, such as Medicare and Medicaid, the employer-paid portion of medical insurance, financial services, expenditures by nonprofit institutions, and the value of certain goods and services received in kind by households (Clark, 1999). As discussed throughout this report, the CPI currently covers only out-of-pocket expenditures by urban households. All told, about 25 percent of PCE spending is not reflected in the CPI. This, in itself, redistributes expenditure shares substantially. For instance, the medical care category (since it is not limited to out-of-pocket expenditures) gets a much higher weight in the PCE (17.6 percent for 1998) than it does in the CPI (5.6 percent for 1998). Also, not all items are defined comparably: in the CEX, for example, expenditures on new cars net out any amount paid for a trade-in vehicle; the PCE tracks the gross amount paid for vehicles, and trade-ins are not taken into account (Clark, 1999).

This imperfectly matched expenditure classification creates a major hurdle to producing a PCE-weighted CPI, though Branch (1994) was able to make adjustments to reduce the noncomparability. For instance, utilities in the PCE can be combined with rent in order to match rent as it is defined in the CEX. But the following categories could not be reconciled for purposes of comparison: homeowner shelter (owners’ equivalent rent, as we noted earlier, has recently accounted for around 20 percent of expenditures in the CEX and only around 11 percent in the PCE), capital improvements, health care, insurance, and finance charges (Branch, 1994). Assuming that the basic CPI item structure will remain as is, it is not clear how this problem should be resolved. To maintain the current CPI scope, the additional PCE entries would need to be backed out. Furthermore, BEA actually uses CEX data to estimate expenditure for a small number of commodities (personal computers, vehicle rentals, day care; Triplett, 1997), which is another reason why moving to PCE weights might not allow the CEX to be eliminated.

On the basis of available evidence, it is unclear whether PCE or CEX weights are superior. What is clear, though, is that for some components the two systems produce very different results. The major hurdle inhibiting comparisons among indexes weighted using alternative source data is the lack of uniformity in the scope and definition of goods and services covered. It is an open question as to how accurately expenditure categories can be mapped from the PCE to the CEX.
We are not in a position to advocate one set of weights over the other, but the question certainly warrants further investigation, and this is what we recommend in the final section of this chapter.

Frequency

The CEX is used by BLS to determine the base period household expenditure shares for each of the 11,772 basic CPI strata. The CPI has traditionally determined these quantities from a 3-year span of CEX data; current weights reflect expenditure shares calculated from the 1993-1995 surveys, with immediately prior weights based on the 1982-1984 surveys. In 1998 BLS announced that it would update and apply 1999-2000 expenditure weights effective January 2002 and revise these weights every 2 years, instead of roughly every 10, as has been its prior practice (see the “Technical Notes” at the end of the chapter for additional details about the CEX). To accomplish this objective (which necessitates combining only 2 years of survey data instead of 3 and increasing the per-year number of basic CPI strata for which quantity information is obtained) and to maintain roughly the current level of statistical accuracy, the sample must be increased by 50 percent. The recently requested increase in the CEX sample size from an effective annual sample size of 5,870 to about 7,500 per year approximately does this.

The decision to update CPI-U and CPI-W expenditure weights every 2 years beginning in 2002 was based on a tradeoff between timeliness and concern about “chain drift,” which can occur when the price indexes of non-identical items must be linked.³ BLS agreed with critics (such as Boskin et al., 1996) that the weights should be updated more often than every decade or so as in the past, but little theory or empirical evidence existed to provide guidance on the optimal frequency of updates. BLS chose to move to the more frequent end of the spectrum, every 2 years. There were some operational issues that argued for not updating every year. For instance, BLS reports that there is an advantage to having “off years” in which changes in CEX forms can be implemented without the time pressure of employing the data in the CPI. The main reason, however, was simply that the approach of updating weights every year, which would require overlapping 2-year CE weights, was untested and its statistical properties were uninvestigated. BLS noted that, in its experience, changing index formulas can produce unexpected and undesirable results, so it decided to err on the side of caution by not going to annual updating.

Sample Size

The CEX targeted sample sizes are 6,160 per quarter for the Quarterly Interview Panel Survey and 5,870 per year for the Diary Survey. Because an increased sample size will produce an increase in the precision of an unbiased estimate, recommendations to increase the CEX sample size (primarily directed at the Diary Survey) have tended to be of the “more is better” variety.
However, if, as we have pointed out, the weights from the CEX are not unbiased, a decrease in sampling variability might actually increase mean squared error, which is what we ultimately care about.

³ Index (chain) drift refers to the possible bias that can arise when separate price indexes are linked. For example, suppose there are three periods, 0, 1, and 2. A price index could be computed for period 2 relative to period 0 in one step using fixed weights, or a “chain index” could be computed by multiplying the price index from 0 to 1 by the price index from 1 to 2. If each price is stochastic but stationary around a fixed level, or all prices are stationary around the same trend, so that relative prices vary in the short run but not in the long run, chain indexes are likely to be biased in comparison with fixed-base indexes. If relative prices return to their period 0 value in period 2, the chained index will generally differ from unity; this difference is the chain drift.

A recent report from the Conference Board recommended increasing the annual sample size of the CEX “perhaps initially to 30,000 households.” The recommendation was supported with the following comments (Conference Board, 1999:18):

This [the currently proposed increase in the sample from 11,000 to 15,500] seems inadequate. The comparable Canadian survey, to cover an economy only one-tenth our size, is reported to cover 16,000 households. Sampling theory shows that optimum sample size is not proportional to the overall size of the economy. Nevertheless, we feel that a larger U.S. Consumer Expenditure Survey will achieve a worthwhile gain in the accuracy of the weights used in the CPI. A larger [CEX] sample will also improve other important statistics, such as poverty data.

While the intuition of the members of the Conference Board group might be right, their recommendation would have been weightier had they been able to offer a statistically based explanation of how they concluded that the benefit of increasing the sample to 30,000 consumer units would be “worthwhile” and that the value of gains in accuracy from increasing the sample would likely offset the cost of expanding the CEX.⁴ Likewise, no indication is given as to why and by what measures, in the group’s estimation, the planned 15,000-15,500 household sample is unacceptably inaccurate. The Conference Board did note two factors underlying its recommendation that the CEX should be larger: that it would be valuable for improving other national statistics (besides the CPI) that are based in whole or in part on the CEX (e.g., poverty and savings rates) and that a larger national sample is needed to calculate subgroup indexes.

In a somewhat more careful statement (but one lacking a specific recommendation), Triplett (1997:15) writes:

. . . The [CEX] sample size (5,000 consumer units) is certainly too small for almost any use for which one wants consumption data. . . . The recently announced increase from 5,000 to 7,500 [CEX] consumer units is a positive, but grossly insufficient, step. . . .
The [CEX] is the federal government’s only general purpose survey of consumer expenditure. . . . For comparison, the Canadian consumer expenditure survey will soon have a sample size of 36,000.

He adds:

. . . The [CEX]’s small sample size and lack of a benchmarking statistic means that its estimates for smaller components (e.g., household textiles) particularly are not as reliable as one would want for serious research on consumption. Also, the weights for the individual 207 basic components of the CPI are not determined accurately from a [CEX] of only 5,000 consuming units, although it may also be true that the variance imparted into the overall CPI may be small.

⁴ The report is a bit unclear in its references to sample size. One can infer, though, that the 30,000 figure refers to the desired sample size to be used by BLS when it uses 2 years of CEX data to establish expenditure-base quantities; if so, the Conference Board is really recommending an increase from the current annual effective CEX sample size of 5,870 to double that of the proposed 7,500, to 15,000 per year.

Increasing the CEX sample size would also enhance its ability to support other potential uses. For instance, the current sample is not large enough to verify trends among population subgroups needed for a consumption-based poverty measure, especially for specific regions. Also, because it is the only U.S. survey that generates income and consumption microdata together, it is important for research on household savings behavior and on how that behavior varies along with age, income, and other factors. Other agencies (the Congressional Budget Office and the Department of the Treasury) use the data for modeling tax revenues and other research purposes. Better data would certainly improve research prospects in these areas as well.

Finally, in defense of BLS’s request for the approximately 50 percent increase in the CEX sample size, Commissioner Abraham testified (Bureau of Labor Statistics, 1998) that the increase “would let us produce superlative measures to a degree of precision comparable to the precision of the current CPI. . . . We currently use three years’ worth of data in producing the market basket weights for the CPI. For the superlative measures, you use two years’ worth of data, so with a 50 percent increase in sample size, we would have about the same precision in the weights.” But there is no reference to the targeted level of accuracy, nor to the impact of the increase in sample size on the precision of the current CPI computation. The commissioner’s statement merely says that the increase in sample size will enable BLS to estimate a CPI with variance characteristics similar to those of the current CPI computation.

Given the current state of assessment, it is difficult to offer recommendations about the sample size of the CEX.
The most pressing practical issues require weighing the cost of expansion against the advantages that changing the survey sample size would have on the accuracy of expenditure weights and, in turn, the relationship between weight accuracy and index variance.

The panel carried out some analysis on this issue. The variance of the CPI, reflecting both the variance of the aggregation weights from the CEX and the price relatives from the C&S survey, is referred to by the BLS as the unconditional variance. The variance of the CPI, reflecting only the variance of the price relatives from the C&S survey (and treating the aggregation weights from the CEX as constant), is referred to by the BLS as the conditional variance. The ratio of the unconditional to the conditional variance is of the form

$$1 + \frac{\text{quadratic function of aggregation weight covariance matrix}}{\text{conditional variance of index}} = 1 + \frac{q}{c}.$$

Thus, increasing the sample size of the CEX by a factor of f changes the ratio of unconditional variance to conditional variance to

$$1 + \frac{q/f}{c}.$$

Let t be an observed ratio of unconditional to conditional standard error, so that t² = 1 + q/c. Then the impact of increasing the sample size of the CEX by a factor of f on t will be to reduce t to

$$\sqrt{1 + \frac{t^{2} - 1}{f}}.$$

Leaver and Valliant (1995:Table 28.2) provide ratios of the median unconditional to median conditional standard errors across time for major item groups for the period January 1987 through December 1991. Our panel received an update of these ratios for 1998 and 1999 for the “all items” and eight major item group series (Leaver, private communication). These ratios range from essentially 1.00 to a high of 1.23 (for the 1998 apparel index based on a 12-month price change). The all-items ratio ranged from 1.03 for an index based on a 1-month price change to 1.09 for an index based on a 12-month price change.

To see the impact of an increase in the sample size of the CEX, consider an extreme case: in 1998 the 12-month price change in the apparel CPI had a conditional standard error of 0.00811844 and an unconditional standard error of 0.00997372, for a ratio of 1.22853. Doubling the CEX sample size would have reduced this ratio to 1.120107. The apparel CPI went from 131.6 in December 1997 to 130.7 in December 1998. The 95 percent confidence interval for the December 1998 apparel CPI, based on the 12-month change from December 1997, is (128.1, 133.3). The comparable confidence interval, based on a doubled sample size, would be (128.3, 133.1). The panel therefore concludes that there is little evidence to support the recommendation to double the sample size of the CEX.

All this speaks to the index as a whole. One might also want to study the effect of increasing the CEX sample size on the variances at the basic CPI strata level.
Acceptable (and optimal) error and variance levels must be defined specifically for the types of indexes that are desired; only then can they be evaluated against the cost of expanding the survey.⁵ In other words, one needs to determine the appropriate level of disaggregation at which to assess the effect of a change in sample size of the CEX.

⁵ Specifically, one needs to know the range of sampling and nonsampling errors for different index components. Nonsampling errors are caused by failure of respondents to understand survey definitions, their unwillingness to provide correct information, collection and response errors, and a number of other sources. Presumably, sampling errors can be reduced to a much greater extent by increasing sample size than can nonsampling errors.
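The variance-ratio arithmetic in the apparel example can be checked with a few lines of Python. This is only a sketch of the panel's formula as reconstructed here: it assumes, as the q/f term implies, that the CEX-related variance component shrinks in inverse proportion to the sample size. The function name `reduced_se_ratio` is ours; the two standard errors are the figures quoted in the text.

```python
import math


def reduced_se_ratio(t: float, f: float) -> float:
    """Ratio of unconditional to conditional standard error after the CEX
    sample grows by a factor f, given the currently observed ratio t.

    Assumption: the weight-related variance term q scales as 1/f.
    """
    if t < 1.0 or f <= 0.0:
        raise ValueError("expected t >= 1 and f > 0")
    # t**2 = 1 + q/c, so q/c = t**2 - 1; an f-fold larger sample divides q by f.
    return math.sqrt(1.0 + (t * t - 1.0) / f)


# Standard errors quoted in the text for the 1998 12-month apparel index.
conditional_se = 0.00811844
unconditional_se = 0.00997372

t_now = unconditional_se / conditional_se
t_doubled = reduced_se_ratio(t_now, 2.0)

print(f"current ratio:             {t_now:.5f}")
print(f"ratio with doubled sample: {t_doubled:.6f}")
print(f"implied unconditional SE:  {conditional_se * t_doubled:.8f}")
```

Because the conditional variance is untouched by the CEX sample size, even a very large f can only push the ratio toward 1, which is why the confidence interval in the example narrows so little.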

The following list summarizes research that should be taken into account before BLS statisticians can definitively target an efficient sample size for the CEX:

• Accuracy. As discussed in the previous subsection, it makes little sense to pursue more precise estimates of a biased measure. Differences in expenditure shares estimated by the CEX and PCE must be better understood and (at least partially) reconciled.

• Precision level. If it is established that the CEX is the best option for setting expenditure weights, BLS should establish precision requirements for the expenditure weights. The requirements must be informed by an understanding of how precise the CPI needs to be in terms of estimating the level and trend of the index. A primary driver for the sample size would be the extent to which population subindexes are desired. Precision requirements must also be established for other important uses of the CEX, which may have demographic or geographic dimensions.

• Cost of expanding the survey sample. The cost of CEX operations should be examined in relation to survey size and design characteristics. BLS and the Census Bureau have a fairly accurate idea of how much it costs to expand the sample size (they now have the experience of the 50 percent increase). In addition to sample size, there may be a considerable clustering effect (both in terms of statistical performance and cost) in the CEX. What is the optimal scheme for clustering surveyed households and designating sampling units? Also, since the survey has many uses other than for BLS weighting, evaluations should consider whether BLS should bear the full budget burden of future changes to the CEX; a cooperative effort shared by the Office of Management and Budget, BEA, the Federal Reserve Board, etc., may be more appropriate.

• Value of CEX. To redesign the CEX, or to expand its sample size, one needs to place a value on the inventory of all of its key uses. We know the CEX data are fundamental to the CPI. All uses, including the CPI, must be considered in making recommendations about the design or size of the CEX.

Other Issues

In addition to questions of frequency, sample size, and accuracy, there are a number of additional issues that involve assessing the information content of questionnaires and the general structure of the CEX. Many of the issues have already been addressed to varying degrees by the BLS and others. Improving the CEX will involve continued assessment of the effectiveness of the interview and diary survey approaches, what methodologies minimize underreporting of purchases or attrition from a diary panel, the appropriate universe of households and goods and services to be covered, and the role of incentive programs in increasing survey accuracy and reducing nonresponse. It will also require answers to questions about how the mode of data collection might be modified to take advantage of new computer-based data collection methods, whether all expenditures for all item categories should be collected from all households surveyed or just some from each, and what processing system is required for the CEX in order to expedite development of a superlative index.

Answers to all three types of questions hinge on the types of indexes that BLS will be called on to produce. For instance, there are increasing demands for subpopulation and geographic (both price level and price change) indexes. Recommendations for modifying the CEX can only be reasonably determined after the BLS and policy makers decide the importance and value of calculating these special-purpose indexes. Assuming different expenditure weights apply to each, a much larger CEX sample will be required.

The Point of Purchase Survey

A second major survey input to the CPI is the POPS, which is used to determine which outlets BLS data collectors will visit in the C&S survey to record actual prices.⁶ The POPS produces outlet-specific expenditure information for item categories so that a sample of those outlets can be selected with a probability proportional to consumer use. The POPS is needed because the CEX does not ask consumers where they purchased goods. In addition to its role in selecting outlets to which BLS agents go to price specific items, POPS expenditure data are also used to implicitly assign quantity weights to all items priced within a single item stratum (see Cage, 1996:fn. 14 for details). Within the current data support system, the POPS data have been improved in terms of their effectiveness at identifying outlets where households shop and as an input for averaging price quotes within CPI item cells.

The entry-level items (ELIs) in the CEX are not isomorphic with the POPS categories.
Thus, some concordance and other adjustments are necessary to match the quantities from the CEX with the prices and price relatives determined from the C&S survey driven by POPS. In a newly designed data system, it seems likely that this mismatch could be eliminated.

There is a substantial overlap between POPS and CEX. If the CEX had no use other than to provide upper-level weights for the CPI, it would make sense to redesign POPS so that it would be the survey vehicle to perform this function as well. This change would then allow for greater index design flexibility, but it would probably increase the sample size required in POPS and also increase the response burden for each participating household.

⁶ The POPS provides sample outlets covering items that account for about 72.5 percent of the CPI (as measured in expenditure shares). A housing survey is used for shelter components of rent and owners’ equivalent rent, and other sources are used for a few other commodities and services (see the “Technical Notes” at the end of the chapter for additional information about the POPS).
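The idea of selecting outlets "with a probability proportional to consumer use" can be illustrated with a minimal Python sketch. It is not a description of the BLS procedure, which the text does not spell out: the outlet names and dollar figures are invented, and the sequential draw without replacement used here is just one simple way to favor high-expenditure outlets.

```python
import random


def select_outlets(outlet_spending, n_outlets, seed=None):
    """Draw n_outlets distinct outlets; each successive draw picks among the
    remaining outlets with probability proportional to reported spending.

    outlet_spending: dict mapping outlet name -> reported expenditure (> 0).
    """
    if n_outlets > len(outlet_spending):
        raise ValueError("cannot select more outlets than are in the frame")
    if any(amount <= 0 for amount in outlet_spending.values()):
        raise ValueError("reported expenditures must be positive")

    rng = random.Random(seed)
    remaining = dict(outlet_spending)
    chosen = []
    for _ in range(n_outlets):
        names = list(remaining)
        weights = [remaining[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # without replacement: an outlet is visited once
    return chosen


# Hypothetical POPS-style frame for one item category in one area.
spending = {
    "Supermarket A": 900.0,
    "Discount store B": 300.0,
    "Corner store C": 100.0,
}
print(select_outlets(spending, 2, seed=1))
```

Under this scheme an outlet that captures most reported spending is almost always in the sample, while small outlets appear only occasionally, which is the property the POPS frame is meant to deliver.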

The Commodities and Services Survey: Outlet Pricing

The number of price quotes that are collected is determined at the ELI and index-area level in a process called sample allocation. The stated objective of sample allocation is to produce the most accurate national-level all-item index possible, given the budget constraints. Through this process, item strata in each area are assigned a minimum number of price observations. In practice, this means that sampling rates are dictated, and will be higher, for ELIs that represent a large expenditure weight or display high price variability, as is the case with such items as apples and bananas (Lane, 1996).

The CPI's C&S is a longitudinal survey that tracks changes in price quotes for most CPI-sampled consumer items over time.[7] A few price quotes come from other sources: for instance, the CPI housing survey performs the same function for the shelter category. As described in Chapter 5, the specific items for which (and outlets from which) the C&S samples price quotes are rotated simultaneously. The POPS provides the sampling frames for outlets by producing estimates of expenditures for items in specific POPS categories (corresponding roughly with strata) at specific outlets. Based on POPS results, specific ELIs are assigned to each sample outlet. Each ELI has a checklist of product specifications so that a BLS field agent can identify specific items from the ELI category that are sold at the selected sample outlet. Field agents select a unique item (from within the preselected ELI category) for pricing based on a probability distribution of sales, with high-expenditure items (within that outlet) being more likely to be selected than low-expenditure items. The process whereby an agent narrows down the list of potential items from the ELI group to a specific item is called disaggregation (see Lane, 2000:9, for details).
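The selection of a single item with probability proportional to its sales, as described above, can be illustrated with a minimal Python sketch. This is not BLS's actual disaggregation procedure, which works through a checklist of product specifications in several stages; the item names, dollar figures, and the function `select_item` are hypothetical and serve only to show the sampling principle.

```python
import random

def select_item(sales_by_item, rng):
    """Pick one item with probability proportional to its share of sales."""
    items = list(sales_by_item)
    total = sum(sales_by_item.values())
    if total <= 0:
        raise ValueError("total sales must be positive")
    weights = [sales_by_item[item] / total for item in items]
    return rng.choices(items, weights=weights, k=1)[0]

# Hypothetical dollar sales within one ELI category at one sample outlet.
sales = {"brand_a_12oz": 600.0, "brand_b_12oz": 300.0, "store_brand_16oz": 100.0}

rng = random.Random(0)  # fixed seed so the illustration is repeatable
draws = [select_item(sales, rng) for _ in range(10_000)]

# Over many draws, the selection frequency approaches the sales share (0.6 here).
share_a = draws.count("brand_a_12oz") / len(draws)
```

In practice only one draw is made per outlet and ELI; the repeated draws here merely show that high-expenditure items are selected more often than low-expenditure ones.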
After a unique item is selected, the agent returns to the same outlet every month (or, in some cases, every 2 months) to record the price change. This process is repeated as long as the outlet continues to sell the item or until the outlet is rotated out of the sample. If the item is permanently discontinued, the agent consults a "characteristics" checklist and determines the most comparable replacement to price. As discussed elsewhere in the report, problems may arise with this pricing system: for instance, if an item is first priced when it is on a special sale or when a specific item remains on store shelves long after a large reduction in its market share.

BLS continues to explore methods for improving the quality of price data. The most visible experimental activities involve expanding the use of electronic data, which may offer such advantages as larger samples, reduced variances, more accurate determination of in-store sales shares, more timely publication of superlative indexes, and the potential to use unit pricing.

[7] This paragraph summarizes the description of the C&S survey from Lane (2000).

ALTERNATIVE DATA COLLECTION APPROACHES

Since most options for improving CPI input data, particularly those involving the household surveys, are expensive, and because there is methodological inflexibility under the current system, it is worth considering entirely new data alternatives. Of course, any net benefit of these alternatives hinges on exactly what types of indexes are desired: COLI or fixed-basket, national or regional, plutocratic or democratic, aggregate or subgroup. Other than the PCE-based expenditure weighting possibility, the two most obvious options for breaking from the current data system involve (1) combining POPS and CEX into an integrated survey that contains expenditure and outlet-use data at detailed product levels, along with household demographic information needed for subgroup indexes; and (2) moving toward scanner-based collection systems, which could be used to improve the existing surveys or as a component of an alternative. Current experimentation by BLS using scanner data illustrates its potential within the existing framework. Integrating scanner data as part of a POPS/CEX combined survey, or into a comprehensive household-based pricing system, would entail more radical shifts in CPI methodology.

One advantage of restructuring the entire data support apparatus would be that it could be designed to fulfill current indexing needs. However, as the environment and uses of the index change, even such an optimal data system moves toward obsolescence unless it is much more flexible than current systems. In this section, we examine some approaches to improving the data support system under the assumption that radical changes are one option.

An Integrated CEX/POPS Survey

The CEX and POPS were introduced at different times and evolved out of different needs in an uncoordinated way. The CEX was developed to provide detailed data on household-level expenditure patterns.
BLS has been producing expenditure surveys in one form or another since the late nineteenth century; however, their production was sporadic (usually not more often than every 10-20 years) in the early part of the century and was motivated by a range of different needs. The 1960-1961 survey was constructed with the primary purpose of revising weights for the CPI and was not limited to urban wage earners, as had typically been the case with previous surveys. The 1972-1973 survey was the first to use the modern interview and diary components, and the sample was selected on a probability basis (Jacobs and Shipp, 1990).

The POPS was introduced to provide information about where consumers shop, information that is available neither from the CEX nor from existing sources of business sales-level data. Also, existing lists were typically based on the Standard Industrial Classification (SIC) system, which is not concordant with BLS-defined ELIs.

Because both CEX and POPS are household-based surveys, it is natural to

consider the possibility of merging the two into a single survey. Intuitively, it seems there should be economies of scale in combining them, as well as advantages to having more complete records (both expenditure and shopping pattern data) for each household. While we do think this possibility is worth investigating, there are many complicating factors.

To begin with, the reference periods are now different for the two surveys. The quantity weights from the CEX require updating over a longer periodic cycle (formerly every 10 years, but now moving to every 2 years, without necessarily implying a change in the item structure every 2 years); outlet rotation weighting, based on POPS, is done every 4-5 years on average and, since POPS is a continuously rotating survey, a subset of items and areas is considered for change every quarter. Whether or not these are optimal frequencies has not been determined. It is possible that adequate rotation and weighting schemes could be produced from a single survey, but at present the issue remains largely unexplored.

The level of item detail needed to obtain CPI item strata weights and to select outlets and ELI samples is also different in the two surveys. Since POPS asks about product expenditure in greater categorical detail, it is generally believed that it requires a larger sample size to produce accurate probability schedules. It is possible that a unified survey could partition respondents into two or more groups, with some being asked more detail than others (something akin to the census short and long forms). Respondent burden could also be reduced if each household continues to be asked only about a subset of CPI items.

Defenders of the current system could also point out that a combined survey that generates expenditure, demographic, and outlet information concentrates respondent burden unnecessarily.
Detailed demographic information is missing from the current POPS; outlet usage information and adequate sample size are missing from the CEX. A combined survey would likely entail greater demands on any given respondent, and the CEX is already considered one of the most burdensome government surveys.

There is also a range of data quality issues that would require investigation. The CEX sample may be more representative of the population since it is based on samples drawn from census household files, not on random-digit telephone sampling as is the POPS. Each CEX household also reports on a larger share of total household expenditures than does a POPS respondent. Further complicating the issue is the fact that the CEX is used for research and policy purposes other than the CPI.

The most obvious advantage of the multisurvey data system now in place is that, relative to the size of expensive consumer surveys, a large number of price quotes can be generated (and at a reasonable cost) for each specific item that is ultimately tracked by the CPI. This is because price data are not linked to specific households. Households provide just enough information for BLS to assign weights to broad item categories and to identify high-use outlets. If prices had to be gathered from households in the manner laid out in Chapter 8, the survey would presumably have to be much larger (than either the current POPS or

CEX) to ensure an adequate sample of prices for each ELI area cell for the CPI. Yet the real advantage of a survey that links prices paid for specific items to the purchasing households is that, in principle, from such data one could calculate average prices paid for specific items by different household types. The big question is what size household sample would be required to support such an index or, more realistically, how big a sample would be needed to make an experimental pilot project work. This question is discussed in Chapter 8.

Scanner Data

In this section we outline how scanner data work and identify some potential operational and measurement benefits that may be gained by increasing their use; we also point out limitations. However, reflecting the panel's charge, the primary emphasis is on how the use of scanner data (and electronic data in general) might allow greater conceptual flexibility when constructing price or cost-of-living indexes. The discussion comments on the extent to which current BLS research and experimental programs may affect CPI pricing procedures. The panel also assesses the value of incorporating scanner-based pricing methods within the context of its more general recommendations concerning the feasibility and advisability of pursuing a COLI approach. We first look at the potential of point-of-sale scanner data and how they could be used to improve data accuracy and price collection procedures. We then look at the more futuristic idea of household-based scanner data.

Point-of-Sale Scanner Data

The most obvious way in which scanner data could be used to support the CPI would be as a replacement for or supplement to the C&S survey of outlets. Scanners in retail outlet checkout counters record Universal Product Codes that identify specific products and their manufacturers. These data are collected, collated, and sold by two major producers of scanner data: ACNielsen and Information Resources, Inc. (IRI).
A growing literature on the topic is beginning to provide an indication of the feasibility, as well as the benefits and drawbacks, of using scanner data in the production of price indexes. While academic researchers in both the United States and Europe have begun exploring how scanner data could be used to improve the statistical properties of price indexes, BLS has moved to the forefront on work in the area.[8] Reinsdorf (1996) successfully constructed a basic item-level index for coffee using scanner data. Currently, the BLS's ScanData initiative is producing indexes for breakfast cereal in the New York City area from data provided by Nielsen. To date, Laspeyres, Tornqvist, and geomean indexes have been produced; Paasche and Fisher indexes are under consideration. The BLS team is moving to construct the index for broader geographic areas as well. As additional areas are added, they will use current CPI aggregation weights and the Laspeyres formula (Richardson, 2000:11).

[8] See, for example, Richardson (2000), Bradley et al. (1997), and Reinsdorf (1996).

Scanner data offer several potential advantages. First, such data could streamline item pricing procedures. Using computer-captured scanner data could reduce the number of manual steps in the C&S survey required to produce subaggregate indexes. Scanner price data may replace or reduce the need to visit stores to price items.

Second, scanner data could generate a more representative selection of items for pricing. Scanner data include the universe of products sold (at outlets that have scanner technology), whereas the current quote sampling method only records prices for a small fraction of items on store shelves. CPI price quotes are drawn from items at outlets made eligible by selection in the most recent POPS sample. Scanner quotes are available if the item has been sold during the pricing period. For the CPI, BLS collects prices for selected items whether or not they have been sold at the POPS-identified outlet. In contrast, transactions scanner data pick up volume of sales. Some stores also maintain files that drive the price identification system and indicate the shelf price of all items for some period, such as a week, whether or not they were sold. CPI outlet and item samples are rotated periodically, every 4 years under current practice. In contrast, since scanner data can include the universe of transacted prices at covered outlets, samples are refreshed continuously and new items appear in the data much more frequently. For the BLS's ScanData geomean and Laspeyres test indexes, weights and item samples are updated each year on the basis of the previous year's expenditure patterns (Richardson, 2000).

Third, scanner data could improve sampling accuracy. Scanner data have introduced new capacity to calculate highly accurate average prices for specific commodities. The large number of outlets and item price points associated with scanner data offer the potential to greatly decrease sample variance and improve data precision. As pointed out by both the Boskin commission (Boskin et al., 1996) and the Conference Board (1999), the high volume of scanner data would allow for production of indexes at finer levels of product detail. Additionally, scanners record actual transaction prices, not shelf prices at which transactions may or may not have taken place for the relevant period. These features may help certain data users, particularly those that perform industry studies or types of analyses where average price movement over fairly short periods is more relevant than shelf price at a given point in time. The tentative results of BLS's ScanData New York experiment, which provide some evidence as to how far these scanner data may improve underlying data quality, have been quite promising. The indexes produced from scanner data have displayed less variability than the CPI

sample price counterpart. For cereal in New York, the sample size of price quotes is more than 1,400 times the number in the traditional CPI data. Whatever the exact reduction in standard errors this implies, such an increase should create greater index precision. Though it was surmised (Richardson, 2000) that this would reduce the standard error of the cereal CPI by a factor of about 38 (a figure close to the square root of the increase in sample size), a careful study of these data by Leaver and Larson (2001) shows that the reduction in the standard error was by a factor of about 6.

Fourth, scanner data could expand geographic coverage for the CPI. Nielsen compiles scanner data from all states and regions (except for Alaska). Data from nonmetropolitan-area outlets are also available. In contrast, the CPI uses data from only 87 metropolitan areas.

Fifth, scanner data may allow more systematic data-cleaning procedures. Scanner data are more uniform and may be simpler to process for index use. However, data-cleaning rules used by ACNielsen or IRI are different from those at the BLS, particularly in how missing or erroneous prices are imputed. This would become an issue for any index that uses scanner data only for a subset of item categories, while traditional methods continue to be used for the remaining item categories.[9]

Finally, with scanner data, it will be possible to produce price averages (or unit valuations). Scanner data allow transaction prices to be averaged over the relevant period. Unlike BLS pricing methods, scanner datasets are typically produced using aggregated unit values, that is, a quantity-weighted average price of an item. The simplest version is calculated as sales revenue divided by number of units sold. Unit values are used in most basic item indexes in the world; however, this is not the case with the CPI, since a weight is assigned to each price quote (Richardson, 2000).
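The simplest unit value described above (sales revenue divided by units sold) can be written out as a short Python sketch. The function name `unit_value` and the weekly figures are hypothetical; the example only contrasts a quantity-weighted average with an unweighted average of the posted prices.

```python
def unit_value(transactions):
    """Quantity-weighted average price: total revenue divided by total units.

    `transactions` is a list of (price, quantity) pairs for one item
    over one period.
    """
    revenue = sum(price * qty for price, qty in transactions)
    units = sum(qty for _, qty in transactions)
    if units == 0:
        raise ValueError("no units sold in the period")
    return revenue / units

# Hypothetical period: 10 units sold at a shelf price of 2.00,
# then 40 units sold at a sale price of 1.00.
period = [(2.00, 10), (1.00, 40)]

uv = unit_value(period)                                   # (20 + 40) / 50
simple_mean = sum(p for p, _ in period) / len(period)     # (2.00 + 1.00) / 2
```

Because most units were bought at the sale price, the unit value (1.20) lies well below the unweighted mean of the two prices (1.50), and it is a price at which no individual transaction took place.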
This last issue, unit pricing, requires further discussion, since it is not clear that it is conceptually superior to the current practice of pricing items on store shelves at a point in time. The main criticism of unit pricing is that it produces a price at which no single item may actually have been sold.[10] On the other hand, the ScanData team argues that "the unit value index more accurately reflects the preferences of the shopper who searches out the lowest prices each week, and also the consumer who stockpiles during a particularly good special, but then purchases nothing until the next special" (Richardson, 2000:12). In some instances, few consumers purchase at the shelf price that the BLS agent happens to observe. How many people buy Chicken-of-the-Sea tuna fish when the Bumblebee next to it is on sale for half price? Feenstra and Shapiro (2001) cite marketing literature indicating that there is substantial consumer substitution across weeks in response to price changes and advertising. Also, their own data on canned tuna show a high degree of price variation and substantial response of consumer demand to that variation (Feenstra and Shapiro, 2001). Using shelf prices assumes rigidity in consumer shopping behavior, since items in each week of pricing are treated independently and the elasticity of substitution among them is assumed to be zero (Richardson, 2000). Proponents of unit value pricing argue that it is better to consider purchases in different weeks of a month as purchases of the same good in the context of consumers' utility. It is certainly worth noting as well that, at some level, price averaging must take place to construct any price index.

[9] See Richardson (2000) for a summary of how scanner data were cleaned for use in the ScanData indexes.

[10] This is the case because stores sell the same item at different prices, which then are averaged. Unit values may be the average of prices over a time period, across some set of outlets (like an outlet chain), or even across product codes that have only minor differences in characteristics. Multistore unit pricing implicitly accepts the assumption that consumers switch easily between outlets in response to price changes. One practical advantage is that chain-level (in contrast to individual store-level) data are less expensive to produce. Commenters (such as Diewert, 1995) have expressed reservations about this pricing approach. In the ScanData cereal experiment, outlet- and "organizational"-level data have been very similar.

Whatever the outcome of these specific questions, it is clear that scanner data open up a wide range of research possibilities. They facilitate comparisons of series that combine price data in different ways, including alternative index formulas, such as short time-lag superlatives. The ScanData team, for instance, was able to compute several indexes contemporaneously (using a Paasche construction as the lower bound with which to test other indexes). Additionally, the sheer volume and detail of scanner data also facilitate hedonic analyses of quality change (such as Ioannidis and Silver, 1999).
Even when scanner data are ultimately not used to construct an index, availability of the data can only advance the pace of research that leads to improvements in the index generally.

Early results for the ScanData cereal test indicate that introduction of scanner data may have a significant effect on index performance. For the February 1998 through June 2000 period, the CPI for cereal in the New York metropolitan statistical area rose from a re-based 100 to 101.1. The geomean scanner index completed the series at 104.9. This difference of 3.8 percentage points may have been attributable to several factors. First, the universe of outlets for the two indexes was not identical; ScanData was missing data from a wholesale club. There was also a sharp decrease in the regular CPI for cereal in October 1999 that did not appear in the scanner data and is difficult to explain. Also, the Tornqvist index rose more rapidly than did the geomean, indicating that, at least for cereal in New York, the elasticity of substitution is less than the value of 1.0 assumed under the geomean method (Richardson, 2000).

It is also important to assess the extent of practical advantages of scanner data that might add to the viability of their regular use. The ScanData experiment has produced favorable results in a number of areas showing, specifically, that:

• Scanner indexes can be produced on the current CPI schedule. Regarding quote timing, CPI and scanner data cover similar periods within the month; scanner data have the advantage of covering weekends and holidays, which CPI data do not.
• For many cases, scanner data cover the entire domain of products within any given item stratum and area cell, which is important for methodological consistency.
• The scanner indexes can be produced in a manner generally consistent with BLS sampling procedures.
• The sample is rotated and can be refreshed at least as often as under current CPI practices.
• Indexes work with both standard geomean and superlative formulas.

The cost implications of introducing scanner data and reducing field price observations have yet to be fully evaluated by BLS.

Limitations of Store-Based Scanner Data

Despite the numerous potential advantages described above, issues remain to be sorted out before BLS can proceed toward systematic integration of point-of-sale scanner data into the CPI; these issues relate to pricing, coverage (both geographic and item-specific), cost, integration of scanner data with other data sources, and reliance on private-sector data.

In addition to unit valuation (already discussed), pricing issues include treatment of taxes and comparability between private-sector scanner data and Census Bureau/BLS data. The CPI collects prices without sales taxes; then a calculated tax is applied separately using secondary data. Scanner data also do not include taxes. However, since ACNielsen does not disclose the exact location of outlets, it is not always clear what tax rate should be added to item prices. For the cereal experiment, it was not a problem since New York has no tax on most food items. However, in general, a solution to this problem needs to be found by vendors or BLS.
One possibility would be to calculate a population-weighted average sales tax each month for each item based on the outlet usage patterns of consumers in each geostratum (Richardson, 2000).

Coverage issues include geographic definitions and saturation of scanner equipment. Geographic-area definitions for the CPI and for currently produced scanner datasets do not match. Scanner data markets are generally smaller than the census-defined metropolitan areas on which the CPI is based. ACNielsen is currently working to map most of the United States into CPI geographic areas, though when the project is complete, there will still be some gaps (e.g., ACNielsen does not cover Anchorage). Even for the covered areas, scanner price data are not available for all outlets at which items from any given CPI stratum are sold. In the

cereal experiment, there were CPI quotes that were not included in the scanner universe (in this case, they were from mass merchandisers). Small mom-and-pop stores also frequently do not use scanner technology. Efforts are currently under way at ACNielsen to expand the depth of outlets covered in its datasets.

Also, "migrating" quotes come into play when purchases are made across CPI areas. The POPS sample covers purchases in adjacent areas, but these patterns cannot be inferred from scanner data. In other words, the POPS covers purchases of consumers from a certain area while the scanner datasets cover purchases made by any household in a particular area, which is not the CPI objective. It may be possible to construct a scanner index as a weighted index from the areas in which consumers of a given area shop (Richardson, 2000), but this certainly adds complication back into the system.

Scanner data coverage is most broad based for items sold in supermarket outlets, while there is virtually no coverage in service sectors. Hawkes and Piotrowski (2000:1) of ACNielsen report that 43 of the 211 CPI item categories can "in large measure, be represented through scanning data obtained from Supermarkets, Mass Merchandisers, and Drug Stores." These categories account for about 10 percent of all consumer expenditures and about 24.2 percent of expenditures for goods (excluding services such as rent). Item coverage constraints alone severely limit the impact that use of store-based scanners can have on the overall CPI.

In terms of cost, the budget tradeoff between purchasing data from private vendors and traditional price data collection must be evaluated, as BLS is in the process of doing.
Another issue concerns integration of scanner-based subindexes (possibly superlative) with traditional sampling-based item indexes: What are the statistical and index performance ramifications when subindexes are compiled using different types of data?

Finally, BLS currently does not have to rely on private outside sources for fundamental pricing data. The ramifications on CPI production of changing this must be explored. For instance, ACNielsen and IRI buy their data from chains, and at times chains decide to no longer sell these data. This means that, while a given store in such a chain has a positive probability of being in the traditional CPI sample, its probability of being in the scanner dataset is zero. Thus, problems of continuity with the scanner data universe could arise.

Household-Based Scanner Technology

Household scanner technology could be adopted in one of three ways: it could be used to improve the accuracy and coverage of the current household surveys, particularly the CEX; it could also be used in a combined CEX/POPS survey; or, more ambitiously, it could be the technical centerpiece of a household-based panel survey that produces both expenditure share and price information that would be used to produce household or subgroup indexes. Any plan to

augment the CEX would require members of sampled households to use handheld scanners to report UPC items and quantities. These data could be enhanced by having the household members key-enter prices as well. BLS could develop scannable menu codes for non-UPC items, which the sampled household could then use to help enter quantities and prices. In addition to this information, household members could be asked to report the store name and address associated with each purchase. One might bypass some of the household recording of prices if the reported store is one from which prices for UPC items can be obtained directly.

Potential to Enhance Accuracy of the CEX and POPS

Even before considering price issues, household scanner devices could increase the quality of current surveys by improving the accuracy of households' documentation of purchases. They could produce more accurate and detailed weighting from the item strata to sub-ELI levels. Household scanning technology could help reduce errors associated with improper identification of products and prices and reduce recall bias and biases from incomplete information (about location of purchases, for instance).

The technology creates greater breadth and depth of information by tracking product and buyer characteristics and offering more uniform geographic coverage (rural areas, all age groups, etc.), thereby expanding the potential to develop subgroup indexes. It could cover purchases made at outlets that do not use point-of-sale scanning, and even in sectors that do not, if supplemental code sheets could be developed for respondents to scan. Lastly, household scanner technology may reduce respondent burden.

The possibility also exists that new types of errors (e.g., keying) could be introduced; this possibility would have to be examined in pilot projects.
Pilot projects would also be important for determining whether introduction of this technology into the survey affects the demographic composition of the sample (e.g., bias it away from inclusion of the elderly).

Scanner Technology as a Tool for Moving the CPI Toward a COLI

Independent of whether or not the CPI should be based on a COLI framework, scanner data may be used to help overcome a few of the obstacles that now preclude calculation of anything like a COLI. By providing simultaneous information on prices and quantities, scanner data may reduce the lag in the production of superlative indexes and also enable Paasche indexes to be produced. Under current practices, price and quantity data are produced from different samples and at different frequencies.

Much caution is in order here, though. Feenstra and Shapiro (2001:21) found, in their construction of superlatives using scanner data for tuna fish, that "the

calculation of conventional price indexes ... shows substantial pitfalls of mechanically applying price indexes to such data." The superlative index is intended to capture reductions in the cost of living as consumers substitute goods that have decreased in price for those that have increased. However, the superlative index calculated by the authors fails to produce this result (the superlative index grew faster). Feenstra and Shapiro (2001:22) concluded:

    The consumer behavior that generates these data cannot correspond to the static utility maximization that provides the foundation for superlative index numbers. Our tabulations suggest that the index numbers do not properly account for consumer behavior in response to sales. In particular, the chained Tornqvist gives too much weight to price increases that follow the end of sales.

The authors go on to explain that their findings reflect purchases made for storage rather than immediate consumption. In other words, purchases and consumption do not track in a parallel fashion, particularly for items that can be stored. As such, the consumer does not face as large an increase in price (after sales) as the raw data imply. In addition, advertising contributes to the breakdown of the law of demand that is assumed under the superlative index approach: "If advertisements cause consumers to purchase a [larger] quantity than would be consistent with static maximization of a time-invariant utility function, superlative index numbers will not accurately measure the cost of living" (Feenstra and Shapiro, 2001:22). On the basis of their findings, they conclude that unit values might provide a good approximation for construction of a COLI but should be adjusted to reflect consumption and to account for storage costs.[11]

Many of the general advantages of scanner data noted above may also help to address other CPI biases.
For instance, scanner data allow for quicker and more accurate identification of both new goods and item attrition (and, as such, could have the capability to reduce new goods bias), as well as of outlet substitution patterns. Furthermore, scanner technology generates more detailed data for he- donic regression and other quality adjustment methods (although quality change bias is probably less of an issue for food itemsâthe potential may be greater in areas such as consumer electronics) and also produces empirical evidence that may allow researchers to estimate the impact of quantity (and other types of) discount pricing on index growth. 11Triplett (1998) provides a simple demonstration of several other problems with using high- frequency data to produce a chained superlative index.
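The end-of-sale weighting problem described by Feenstra and Shapiro can be illustrated with a small numerical sketch. The weekly prices and quantities below are invented for illustration and are not taken from the tuna data; the function name `tornqvist_link` is likewise hypothetical. One good goes on sale for two weeks, purchases are heaviest in the last sale week (stockpiling), and both prices and quantities eventually return to their starting values.

```python
import math

def tornqvist_link(p0, q0, p1, q1):
    """Tornqvist price index between two periods (parallel lists, one entry per good)."""
    e0 = [p * q for p, q in zip(p0, q0)]
    e1 = [p * q for p, q in zip(p1, q1)]
    s0 = [e / sum(e0) for e in e0]  # expenditure shares in the earlier period
    s1 = [e / sum(e1) for e in e1]  # expenditure shares in the later period
    log_index = sum(0.5 * (a + b) * math.log(y / x)
                    for a, b, x, y in zip(s0, s1, p0, p1))
    return math.exp(log_index)

# Hypothetical data: good 0 is on sale in weeks 1-2; good 1 never changes.
prices = [[1.0, 1.0], [0.5, 1.0], [0.5, 1.0], [1.0, 1.0], [1.0, 1.0]]
quantities = [[10, 10], [20, 10], [60, 10], [4, 10], [10, 10]]

# Chained index: product of week-to-week Tornqvist links.
chained = 1.0
for t in range(1, len(prices)):
    chained *= tornqvist_link(prices[t - 1], quantities[t - 1],
                              prices[t], quantities[t])

# Direct index: compare the last week with the first week in one step.
direct = tornqvist_link(prices[0], quantities[0], prices[-1], quantities[-1])

print(f"chained = {chained:.4f}, direct = {direct:.4f}")
```

In this made-up example the direct comparison shows no price change, while the chained index ends above 1 because the large expenditure share in the final sale week is applied to the subsequent price increase. This is only a stylized instance of the pattern the authors report, not a reproduction of their calculation.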

SUMMARY AND RECOMMENDATIONS

Without the benefit of extensive research on each of the areas raised in this chapter, the panel cannot make many definitive recommendations with respect to the data inputs to the CPI. We recognize that the BLS has undertaken research projects in these areas, and so BLS's inclusion in our discussion should not be taken as an indication that it has been negligent in its research efforts. It merely means that the panel recognizes the importance of these areas of research and hopes that they will continue systematically and thoroughly.

Research into the accuracy and sample size of the CEX should be a high priority among topic areas relating to the data collection process for the CPI. The panel concluded that it is likely that CEX estimates of consumer expenditure shares are biased, perhaps seriously. There is no obvious benefit to increasing the survey sample size if nonsampling error dominates sampling error; one would simply be achieving more precise estimates of the wrong thing.

Recommendation 9-1: Before additional resources are directed toward increasing its sample size (beyond the current plan), the accuracy of the CEX should be carefully evaluated.

Assessing the net advantages of using the BEA's PCE to produce the upper-level weights for the national CPI should be part of this evaluation. At the very least, research by BLS (and BEA) into the sources of divergence between PCE- and CEX-derived expenditure weights needs to be extended so that these differences can be more fully understood. Even if the current system is ultimately maintained, the effort will produce additional guidance about how the CEX might be improved.

Recommendation 9-2: If categories can be reasonably well matched between the CPI and PCE, so that comparable item strata indexes can be created, a program should be set up to produce an experimental CPI that uses PCE-generated weights at the upper (218-item) level but that is otherwise no different from the CPI.

If full item-by-item mapping turns out to be too problematic, it might still be possible to use PCE estimates for major item categories where the PCE and CEX have comparable coverage. For such categories, estimated totals from the CEX could be forced to equal the PCE estimates, which might allow the PCE to correct for undercoverage in the CEX in much the way that demographic projections are used to correct for undercoverage in household surveys such as the Current Population Survey. The distribution among lower-level aggregations would be determined by the CEX distribution. Investigating how well such experimental indexes perform seems especially sensible given the high cost of revamping the CEX survey or increasing its sample size. We would very much like to see a thorough defense of the choice of CEX-generated upper-level weights, relative to the alternatives, for the national-level CPI.

The CPI data collection process would also benefit from research in several other key areas discussed in this chapter:

• Frequency of the CEX: a combined theoretical-empirical study of the impact on the CPI of the frequency of updating weights.
• Sample size for the CEX: a combined theoretical-empirical study of the effect of CEX sample size on the variance of the CPI and on any subindexes that are desired.
• CEX and POPS survey design: a comprehensive reexamination of the design of each of these surveys.
• Integration of CEX and POPS: a study of the feasibility and requirements of a for-CPI-use-only single survey encompassing both the CEX and POPS.

Recognizing that scanner technology has the potential to improve the entire process of data collection for the CPI computation, the panel also identified the following key study areas:

• Point-of-sale scanner data and item selection: continuation of research on how these data can be used both to select items for pricing and to replace the C&S Survey and a quantification of the improvement in the CPI based on their use.12
• Point-of-sale data and outlet selection: initiation of research on how to use store sales information based on scanning to determine the stores to be sampled in the C&S.
• Household scanner data: initiation of research on the use of handheld scanners to record UPC items and quantities along with key-entering prices and/or store names and addresses.
• Integration of UPCs into BLS ELI framework: development of a concordance between UPCs and the ELIs.
• Integration of non-UPCs into BLS ELI framework: development of BLS assignments of UPCs for items which otherwise do not have UPCs for use in household handheld scanning.
12 Assessment of current BLS scanner data experiments (ScanData cereal index for New York, next year's expansion throughout New England; use of scanner data/hedonics for audio components (using NPD data), computers, and consumer electronics) to test impact on statistical properties of price data. We note that in 2002 BLS will consider ScanData recommendations about solving geography issues and about funding requests needed to expand the project and incorporate scanner-based subindexes into the CPI.

• Experimental development of subgroup indexes: performance of the household-based price data experiment, likely involving household scanner technology, to produce subgroup indexes that capture variation in both expenditure weights and prices paid.

TECHNICAL NOTE: ADDITIONAL DESCRIPTION OF CPI DATA INPUTS

The Modified Laspeyres CPI

The "Technical Note" to Chapter 2 sets forth the mathematical derivations underlying the development of the recommendations in this report. Equation (1) of that section sets forth the Laspeyres price index $P_L^t$, namely

$$P_L^t = \frac{\sum_{n=1}^{N} q_n^0 p_n^t}{\sum_{n=1}^{N} q_n^0 p_n^0},$$

relating base period quantities $q_n^0$, base period prices $p_n^0$, and current period prices $p_n^t$, for each of $N$ goods (where the superscripts 0 and $t$ refer to the base and current periods).

The actual BLS-reported CPI differs from this in a few respects. First, the index is reported relative to a period in which it was set equal to 100. This period has, since 1987, been July-August 1983; prior to that, it was January 1967. Second, and of more critical importance, the above equation is based on the assumption that both prices and quantities are collected simultaneously in the base period, but this is not the case for the BLS-reported CPI. For the CPI, the base period quantities are based on data from a household expenditure survey, while the base period prices are based on data from the monthly pricing surveys. Since the quantity data take longer to compile than do the price data, what is instead calculated is a "modified Laspeyres index," namely

$$P_L^{t,a} = \frac{\sum_{n=1}^{N} q_n^0 p_n^t}{\sum_{n=1}^{N} q_n^0 p_n^a} = \frac{\sum_{n=1}^{N} q_n^0 p_n^a \,[\,p_n^t / p_n^a\,]}{\sum_{n=1}^{N} q_n^0 p_n^a},$$

where, as before, $n$ indexes the $N$ goods and the superscript $t$ denotes the current period, but where the superscript 0 refers to the quantity-base period (sometimes called the expenditure-base period) and the superscript $a$ refers to the price-reference period. Since January 1998, the quantity-base period has been 1993-1995; prior to that it was 1982-1984.13 It is planned that, as of January 2002, the quantity-base period will be 1999-2000, and that it will be updated at 2-year intervals subsequently using information from the Consumer Expenditure Survey (CEX) ending 2 years prior to the update.

Finally, the $q_n^0$ are themselves not directly observed in the household expenditure survey. Rather, the survey provides quantity-base period expenditures $e_n^0$ for item $n$, and the quantities $q_n^0$ are calculated by dividing $e_n^0$ by $p_n^0$, where the quantity-base period prices are obtained from the monthly pricing survey.

The CPI can be expressed as the product of a Laspeyres index and the reciprocal of a modified Laspeyres index based on the quantity-base period and price-reference period (denoted $s$ here, $a$ above), namely

$$P_L^{t,s} = \frac{\sum_{n=1}^{N} q_n^0 p_n^t}{\sum_{n=1}^{N} q_n^0 p_n^s} = \frac{\sum_{n=1}^{N} q_n^0 p_n^t}{\sum_{n=1}^{N} q_n^0 p_n^0} \cdot \frac{\sum_{n=1}^{N} q_n^0 p_n^0}{\sum_{n=1}^{N} q_n^0 p_n^s} = \frac{P_L^t}{P_L^{s,0}}.$$

As seen above, $P_L^{s,0}$ is a constant that relates the quantity-base period to the price-reference period. The critical element of the index is indeed $P_L^t$, which can be rewritten as

$$P_L^t = \frac{\sum_{n=1}^{N} q_n^0 p_n^t}{\sum_{n=1}^{N} q_n^0 p_n^0} = \frac{\sum_{n=1}^{N} q_n^0 p_n^{t-1} \,[\,p_n^t / p_n^{t-1}\,]}{\sum_{n=1}^{N} q_n^0 p_n^0} = P_L^{t-1} \, \frac{\sum_{n=1}^{N} q_n^0 p_n^{t-1} \,[\,p_n^t / p_n^{t-1}\,]}{\sum_{n=1}^{N} q_n^0 p_n^{t-1}}.$$

This index can be characterized as a "chained" index, where the previous period's index $P_L^{t-1}$ is multiplied by a dollar-weighted average of price relatives, with the dollar expenditure weights being those of the quantity-base period quantities priced at the previous period's prices and the price relative taken with respect to the price in the previous period. One should note that what is reported monthly by BLS is the period-to-period index, namely $P_L^t / P_L^{t-1}$.

13 The quantity-base period differs across items so that, strictly speaking, 0 should be subscripted as $0_n$, with the specific month for item $n$ depending on the month $0_n$ in which the $q_n^0$ are determined from the Consumer Expenditure Survey.
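The relationships among the Laspeyres index, the modified Laspeyres index, and the chained form can be checked numerically. The following is a minimal sketch with two goods; all expenditures and prices are invented for illustration, and the function name `laspeyres` is an assumption rather than anything used by BLS. Quantities are backed out of base-period expenditures, as described above.

```python
import math

def laspeyres(quantities, prices_ref, prices_cur):
    """Fixed-quantity index: cost of the basket at prices_cur over its cost at prices_ref."""
    num = sum(q * p for q, p in zip(quantities, prices_cur))
    den = sum(q * p for q, p in zip(quantities, prices_ref))
    return num / den

# Hypothetical quantity-base period expenditures and prices.
e0 = [200.0, 300.0]
p0 = [2.0, 3.0]
q0 = [e / p for e, p in zip(e0, p0)]  # implied base-period quantities

p_a = [2.2, 3.0]    # price-reference period prices (hypothetical)
p_t1 = [2.4, 3.3]   # prices in period t-1 (hypothetical)
p_t = [2.5, 3.6]    # prices in period t (hypothetical)

P_t = laspeyres(q0, p0, p_t)      # Laspeyres index, base period 0
P_ta = laspeyres(q0, p_a, p_t)    # modified Laspeyres, price-reference period a
P_a0 = laspeyres(q0, p0, p_a)     # constant linking period 0 to period a
P_t1 = laspeyres(q0, p0, p_t1)    # Laspeyres index for period t-1
link = laspeyres(q0, p_t1, p_t)   # period-to-period relative

# Modified index equals the Laspeyres index divided by the linking constant.
print(math.isclose(P_ta, P_t / P_a0))
# Chained form: previous index times the period-to-period relative.
print(math.isclose(P_t, P_t1 * link))
```

Both checks hold because the same base-period quantities appear in every numerator and denominator, so the intermediate basket costs cancel.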

Elements of the Index and Subindexes

The "goods" used in the CPI are organized into expenditure classes (ECs); as of 1999, there were 68 ECs. These are in turn subdivided into item strata; as of 1998, there were 218 strata. Finally, the item strata are subdivided into entry-level items (ELIs); as of 2000, there were 282 ELIs.14 The following is an example of this hierarchy of goods (Bureau of Labor Statistics, 1997a):

Expenditure Class 24: Maintenance and repair commodities
  Item stratum 2401: Materials, supplies, equipment for home repairs
    Entry-level items:
      24011: Paint, wallpaper, and supplies
      24012: Tools and equipment for painting
      24013: Lumber, paneling, wall and ceiling tile; awnings, glass
      24014: Blacktop and masonry materials
      24015: Plumbing supplies and equipment
      24016: Electrical supplies, heating and cooling equipment
  Item stratum 2404: Other property maintenance commodities
    Entry-level items:
      24041: Miscellaneous supplies and equipment
      24042: Hard surface floor covering
      24043: Landscaping items

Subsequently, the ECs and their components were redesignated; in the 26 March 1999 list of ELIs, EC24 has been restructured as:

Expenditure Class HM: Tools, hardware, outdoor equipment and supplies
  Item stratum HM01: Tools, hardware, and supplies
    Entry-level items:
      HM011: Paint, wallpaper tools, and supplies
      HM012: Power tools
      HM013: Miscellaneous hardware, supplies, and equipment
      HM014: Nonpower hand tools
  Item stratum HM02: Outdoor equipment and supplies
    Entry-level items:
      HM021: Powered lawn and garden equipment and other outdoor items
      HM022: Lawn and garden supplies and insecticides

14 Data from BLS dictionaries and Dennis Fixler (BLS, personal communication, 2000).

The data used in the CPI are collected in 87 primary sampling units (PSUs; see Williams, 1996). The data are aggregated into 54 basic areas: 34 self-representing areas (e.g., Kansas City, MO-KS) and 20 region- and population-size cross classifications (e.g., Midwest Size A).15 The basic areas and item strata combine to form (218 × 54) = 11,772 basic CPI strata. Note that each of these basic CPI strata may comprise more than one ELI and more than one PSU.

Let $h$ index the basic areas ($h = 1, \ldots, 54$) and $z$ index the item strata ($z = 1, \ldots, 218$). Until January 1999, BLS calculated $R_{hz}^t$ (an estimate of the relative price change in basic area $h$, item stratum $z$, from period $t-1$ to period $t$) using the formula

$$R_{hz}^t = \frac{\sum_{i \in z} w_{hi}\, p_{hi}^t}{\sum_{i \in z} w_{hi}\, p_{hi}^{t-1}}$$

when the samples of items within the item strata are selected with each unit having a probability proportional to quantity, or the formula

$$R_{hz}^t = \frac{\sum_{i \in z} w_{hi}\, p_{hi}^t / p_{hi}^a}{\sum_{i \in z} w_{hi}\, p_{hi}^{t-1} / p_{hi}^a}$$

when the samples of items within the item strata are selected with each unit having a probability proportional to expenditure. In both forms the weights $w_{hi}$ reflect the probability that item $i$ in item stratum $z$ is selected to be priced in basic area $h$: in the first, the weights $w_{hi}$ are essentially $q_{hi}^a / \pi_{hi}$; in the second, the weights $w_{hi}$ are essentially $p_{hi}^a q_{hi}^a / \pi_{hi}$, where $\pi_{hi}$ is the probability that item $i$ in item stratum $z$ is selected to be priced in basic area $h$. Since January 1999, BLS has replaced this computation for most indexes (the housing index being the most notable exception) with a weighted geometric mean, namely

$$R_{hz}^t = \prod_{i \in z} \left( p_{hi}^t / p_{hi}^{t-1} \right)^{w_{hi}}.$$

When one can obtain prices in basic area $h$ for the universe of items in item stratum $z$, for both time periods $t-1$ and $t$, then $R_{hz}^t$ is given by the weighted average

$$R_{hz}^t = \sum_{i \in z} w_{hi} \left( p_{hi}^t / p_{hi}^{t-1} \right)$$

or, if the geometric mean computation is used, is given by

$$R_{hz}^t = \prod_{i \in z} \left( p_{hi}^t / p_{hi}^{t-1} \right)^{w_{hi}},$$

where $w_{hi}$ is the ratio of the expenditure in basic area $h$ on item $i$ of item stratum $z$ to the expenditure in basic area $h$ on all items of item stratum $z$.
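The arithmetic and geometric forms of the relative price change can be compared on a small example. This is a sketch only: the three items, their prices, and their quantities are hypothetical, the function names are invented, and the expenditure shares are assumed here to be measured at period $t-1$ prices (the text does not specify the period). Under that assumption the quantity-weighted ratio and the share-weighted average of price relatives coincide, and the geometric mean is never larger than the arithmetic one.

```python
import math

def ratio_relative(weights, p_prev, p_cur):
    """Ratio of weighted price sums (quantity-type weights)."""
    num = sum(w * p for w, p in zip(weights, p_cur))
    den = sum(w * p for w, p in zip(weights, p_prev))
    return num / den

def arithmetic_relative(shares, p_prev, p_cur):
    """Share-weighted arithmetic mean of price relatives."""
    return sum(s * c / p for s, p, c in zip(shares, p_prev, p_cur))

def geometric_relative(shares, p_prev, p_cur):
    """Share-weighted geometric mean of price relatives."""
    return math.exp(sum(s * math.log(c / p)
                        for s, p, c in zip(shares, p_prev, p_cur)))

# Hypothetical universe of three items in one item stratum and basic area.
p_prev = [2.0, 4.0, 5.0]
p_cur = [2.2, 3.8, 5.1]
quantities = [5.0, 1.5, 0.8]

# Expenditure shares at period t-1 prices (an assumption of this sketch).
spend = [q * p for q, p in zip(quantities, p_prev)]
shares = [e / sum(spend) for e in spend]

r_ratio = ratio_relative(quantities, p_prev, p_cur)
r_arith = arithmetic_relative(shares, p_prev, p_cur)
r_geo = geometric_relative(shares, p_prev, p_cur)

print(f"ratio = {r_ratio:.4f}, arithmetic = {r_arith:.4f}, geometric = {r_geo:.4f}")
```

The gap between the arithmetic and geometric results grows with the dispersion of the price relatives, which is why the choice of formula matters most in strata where prices move unevenly.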
Since a census of the prices for the universe of items in item stratum $z$ is impractical, BLS estimates the $R_{hz}^t$. An oversimplified version of the BLS procedure is the following: Let a sample of $N$ items be drawn from the universe of items in item stratum $z$ (with replacement), with the probability of selection of item $i$ equal to $w_{hi}$. Then

$$R_{hz}^t = \frac{1}{N} \sum_{j=1}^{N} \left( p_{hj}^t / p_{hj}^{t-1} \right)$$

is an unbiased estimate of the weighted average version of $R_{hz}^t$, and

$$R_{hz}^t = \prod_{j=1}^{N} \left( p_{hj}^t / p_{hj}^{t-1} \right)^{1/N}$$

is a consistent estimator of the geometric mean version of $R_{hz}^t$. The BLS then updates its index $I_{hz}^t$ for this basic stratum by the chaining formula described earlier, namely,

$$I_{hz}^t = I_{hz}^{t-1} R_{hz}^t.$$

These indexes are aggregated to form indexes for aggregate areas (e.g., U.S. cities), aggregate items (e.g., expenditure classes), or both. Let $H$ denote the aggregate area and $Z$ the aggregate item for which an index is to be formed. The index for this aggregate area and item is calculated as

$$I_{HZ}^t = \frac{\sum_{h \in H} \sum_{z \in Z} A_{hz} I_{hz}^t}{A_{HZ}},$$

where

$$A_{hz} = \sum_{j \in z} p_{hj}^0 q_{hj}^a / I_{hz}^0$$

and

$$A_{HZ} = \sum_{h \in H} \sum_{z \in Z} \sum_{j \in z} p_{hj}^0 q_{hj}^a / I_{hz}^0,$$

where $j \in z$ denotes the items drawn from the universe of items in item stratum $z$.

15 This count is based on the 1998 CPI item strata spreadsheet provided by Dennis Fixler (personal communication, 2000).

Consumer Expenditure Survey

The CEX, sponsored by BLS and conducted by the Bureau of the Census, is a national probability sample of household units. It comprises two parts, a Quarterly Interview Panel Survey and a Diary Survey. Each "consumer unit" in the household selected for the Quarterly Interview Panel Survey is interviewed for 5 consecutive quarters about relatively large expenditure items (e.g., major appliances) and expenditures that occur at regular intervals (e.g., utility bills). A sample of 8,910 addresses is contacted for the Quarterly Interview Panel Survey in each of the calendar quarters, and the number of completed interviews per quarter is targeted at 6,160. Each consumer unit selected for the Diary Survey completes a diary on expenditure information on frequently purchased items and relatively small expenditure items for 2 consecutive weeks. A sample of 8,020 addresses is contacted each year to participate in the Diary Survey, so the effective annual sample size participating in this survey is 5,870 households, spaced across the 52 weeks in the year.

The CEX has many uses in the governmental statistical framework. Its primary use in the CPI computation is to construct the quantities $q_{hz}^0$ which underlie the CPI computation. It has also been used "to select new market baskets of goods and services for the index, to determine the relative importance of components, and to derive new cost weights for the baskets" (U.S. Department of Labor, 2000).

Point of Purchase Survey and Commodities and Services Survey

The goal of the Point of Purchase Survey (POPS) is to determine the prices to be used in the CPI computation. The first stage of this survey is a national probability sample of household units, conducted by the Census Bureau, whose primary aim is to define the outlets to be sampled to obtain price data. The survey began in 1978 as a personal interview (and was referred to as CPOPS, for Continuing Point of Purchase Survey). In 1999 BLS revised this survey as a telephone interview, referred to as TPOPS (for Telephone Point of Purchase Survey). CPOPS was conducted annually over a period of 4 to 6 weeks, usually beginning in April; TPOPS interviews households every quarter. In CPOPS approximately one-fifth of the PSUs were sampled each year; the goal in TPOPS is to increase this sampling rate so that one-fourth of the PSUs are sampled each year. All consumer units in the selected household are asked to recall whether or not they purchased categories of goods and services within a specified recall period (varying from 1 week to 5 years, depending on the purchase cycle of the category) and, if so, the expenditure amounts and the names and locations of all places of purchase.
Based on the responses to this survey of household units, a frame of outlets is defined for outlet selection. Since approximately one-fourth of the PSUs are currently sampled each year, after the survey of household units the frame of outlets determined by the survey is unchanged for 4 years.

The commodities and services are grouped into POPS categories, consisting of combinations of some of the ELIs; there were 174 POPS categories in 1997 (Bureau of Labor Statistics, 1997a). For example, POPS category 127, materials and supplies for major home repairs, consists of two of the ELIs of item stratum 2401, ELIs 24013 and 24014. POPS category 129, hardware items, hand tools, and other materials for minor home repairs, contains the other four ELIs of item stratum 2401 (24011, 24012, 24015, and 24016); it also contains ELI 24041, miscellaneous supplies and equipment; ELI 32043, other hardware; and ELI 32044, nonpowered hand tools.

For the purpose of outlet selection, the BLS has aggregated the POPS categories into eight categories and the PSUs into ten groups (see Bureau of Labor Statistics, 1997a:). After a PSU group has been surveyed, the ELIs to be priced for the C&S Survey are selected with a systematic sampling procedure, with probability of selection proportional to the amount of expenditure in that PSU group and its item stratum. This systematic sampling procedure guarantees that over the 4-year period each of the ELIs will be selected for pricing.

The outlets actually sampled from each frame are selected independently for each PSU group and POPS category, with probability of selection proportional to the amount of expenditure in that PSU group and POPS category. To give readers a sense of the number of outlets selected, the largest number is nine, in the POPS foods and beverages category, PSU group Philadelphia.

At a selected outlet a BLS field representative uses a multistage probability selection procedure for selecting the specific item to be priced among those that the outlet sells that fall within the designated-for-pricing ELI definition. The probability of selection is, if the information is available, proportional to the sales of the items in the ELI groups. Otherwise it is based on the proportion of shelf space or, as a last resort, on an equal probability assigned to each item. Once the item is selected, its price is recorded. These are the prices that are weighted and used in the computation of the $R_{hz}^t$ used in the CPI computation.
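The sampling, estimation, chaining, and aggregation steps described in this technical note can be sketched end to end. The sketch below follows the oversimplified procedure in the note, not the actual BLS production system: items are drawn with replacement with probability equal to their expenditure share, the sampled price relatives are averaged without further weighting, the stratum index is updated by chaining, and stratum indexes are combined with fixed cost weights. All numbers, names, and the two "areas" are hypothetical.

```python
import math
import random

def draw_sample(relatives, shares, n, rng):
    """Draw n price relatives with replacement; item i is chosen with probability shares[i]."""
    return rng.choices(relatives, weights=shares, k=n)

def arithmetic_estimate(sample):
    """Unweighted mean of sampled relatives (estimates the share-weighted average)."""
    return sum(sample) / len(sample)

def geometric_estimate(sample):
    """Unweighted geometric mean of sampled relatives (estimates the share-weighted geometric mean)."""
    return math.exp(sum(math.log(r) for r in sample) / len(sample))

def aggregate_index(cost_weights, indexes):
    """Weighted average of basic-stratum indexes, keyed by (area, item stratum)."""
    total = sum(cost_weights.values())
    return sum(cost_weights[k] * indexes[k] for k in cost_weights) / total

# Hypothetical universe of price relatives and expenditure shares for one stratum.
relatives = [1.10, 0.95, 1.02]
shares = [0.5, 0.3, 0.2]
target_arith = sum(s * r for s, r in zip(shares, relatives))
target_geo = math.exp(sum(s * math.log(r) for s, r in zip(shares, relatives)))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = draw_sample(relatives, shares, 20000, rng)
r_arith = arithmetic_estimate(sample)
r_geo = geometric_estimate(sample)

# Chaining: the previous stratum index times the estimated relative.
index_prev = 100.0
index_now = index_prev * r_geo

# Aggregation across two hypothetical basic strata with fixed cost weights.
cost_weights = {("area A", "stratum 1"): 60.0, ("area B", "stratum 1"): 40.0}
indexes = {("area A", "stratum 1"): 104.0, ("area B", "stratum 1"): 101.0}
aggregate = aggregate_index(cost_weights, indexes)

print(f"arithmetic: {r_arith:.4f} (target {target_arith:.4f})")
print(f"geometric:  {r_geo:.4f} (target {target_geo:.4f})")
print(f"aggregate index: {aggregate:.1f}")
```

The sample size here is far larger than anything used in practice; it is chosen only so that the estimates settle visibly near their targets. With realistic sample sizes the sampling variability of each stratum relative would be much larger.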
