Appendix

Statistical Definition and Estimation of Price Indexes

This report addresses foundational economic concepts for cost-of-living or price indexes. In the panel's view, the concepts must reflect the reality of the marketplace; they must capture the change in real prices paid by real consumers. The concepts must be measurable in the context of a system of surveys and other data collection activities that the Bureau of Labor Statistics (BLS) can feasibly implement.

An important step in assessing the measurability and reality of a particular price index concept is to express the concept statistically in the form of a population parameter to be estimated. If one can write down the parameter, one can examine the feasibility of surveys and other data collection activities necessary to produce accurate statistical estimators of the parameters. One can also examine whether the parameter is defined in terms of the prices actually paid by consumers. In what follows, we translate our concepts into explicit population parameters. We define the price indexes motivated by our concepts and demonstrate briefly the survey data required to estimate the indexes.

To begin, consider a simple world in which there is only one good and two time periods, base and comparison, and a static universe of households (HH), denoted by the set $H$. For cases in which it would be better to work in terms of subgroups within HHs called consumer units (CU), let $H$ denote the universe of CUs.
Next, let us introduce the bulk of the requisite notation. Let

$i$ signify the HH ($i = 1, \ldots, N$),
$j$ the purchase occasion,
$J_{i0}$ the set of purchase occasions by the $i$th HH in base period 0,
$J_{it}$ the set of purchase occasions by the $i$th HH in comparison period $t$,
$Q_{gij0}$ the number of units of good $g$ purchased by the $i$th HH, $j$th purchase occasion, during base period 0,
$Q_{gijt}$ the number of units of good $g$ purchased by the $i$th HH, $j$th purchase occasion, during comparison period $t$,
$N$ the number of households in the universe,
$p_{gij0}$ the price per unit (of good $g$) paid by the $i$th HH, $j$th purchase occasion, during base period 0, and
$p_{gijt}$ the price per unit (of good $g$) paid by the $i$th HH, $j$th purchase occasion, during comparison period $t$.

In these definitions, we use the convention $\sum_{j \in J_{i0}} = 0$ for nonbuyers in the base period and $\sum_{j \in J_{it}} = 0$ for nonbuyers in the comparison period. We assume there is at least one buyer in each period, so that $Q_0 > 0$ and $Q_t > 0$. Average unit volumes, $Q_0$ and $Q_t$, and average prices per unit, $p_0$ and $p_t$, are defined in the obvious way.

The decomposition of the period-to-period trend in total dollar volume is now given by

$$T_y = \frac{Y_t}{Y_0} = \frac{N_t Q_t p_t}{N_0 Q_0 p_0} = T_N T_q T_p,$$

where $T_N$ is the trend in the total HH count, $T_q$ is again the trend in average units per HH, and $T_p$ is again the trend in average price per unit. As above, $T_p$ may be called the price index and $T_q$ the unit volume index.

We can next extend the work to a still more realistic world in which a static set of goods is available in the market at both time periods. Let subscript $g$ signify a good, and to simplify the notation let $G$ represent both the set and the number of goods. Total dollar volumes are now defined by
$$Y_0 = \sum_{g \in G} \sum_{i \in H_0} \sum_{j \in J_{gi0}} Q_{gij0}\, p_{gij0}$$

$$Y_t = \sum_{g \in G} \sum_{i \in H_t} \sum_{j \in J_{git}} Q_{gijt}\, p_{gijt}$$

for base and comparison periods, respectively. Average unit volumes, $Q_{g0}$ and $Q_{gt}$, and average prices per unit, $p_{g0}$ and $p_{gt}$, are defined in the obvious way. Also, define the $G \times 1$ vectors of average unit volumes and average prices per unit

$$\mathbf{Q}_0 = (Q_{10}, Q_{20}, \ldots, Q_{G0})', \quad \mathbf{Q}_t = (Q_{1t}, Q_{2t}, \ldots, Q_{Gt})', \quad \mathbf{p}_0 = (p_{10}, p_{20}, \ldots, p_{G0})', \quad \text{and} \quad \mathbf{p}_t = (p_{1t}, p_{2t}, \ldots, p_{Gt})'.$$

Then, the period-to-period trend in total dollar volume is given by

$$T_y = \frac{Y_t}{Y_0} = \frac{N_t\, \mathbf{Q}_t' \mathbf{p}_t}{N_0\, \mathbf{Q}_0' \mathbf{p}_0} = \frac{N_t}{N_0} \left( \frac{\mathbf{Q}_t' \mathbf{p}_t}{\mathbf{Q}_0' \mathbf{p}_t} \right) \left( \frac{\mathbf{Q}_0' \mathbf{p}_t}{\mathbf{Q}_0' \mathbf{p}_0} \right) = T_N T_{Pq} T_{Lp},$$

where $T_N$ is again the trend in the total HH count, $T_{Pq}$ is the trend in average units per HH, and $T_{Lp}$ is the trend in average price per unit. The trend in average units is weighted by comparison prices, and thus one might view $T_{Pq}$ as a Paasche index of unit volume. Since the trend in average prices is weighted by base unit volumes, one might view $T_{Lp}$ as a Laspeyres price index.

An alternative decomposition of the trend is

$$T_y = \frac{N_t}{N_0} \left( \frac{\mathbf{Q}_t' \mathbf{p}_0}{\mathbf{Q}_0' \mathbf{p}_0} \right) \left( \frac{\mathbf{Q}_t' \mathbf{p}_t}{\mathbf{Q}_t' \mathbf{p}_0} \right) = T_N T_{Lq} T_{Pp},$$

where $T_{Lq}$ is a Laspeyres index of unit volume and $T_{Pp}$ is a Paasche price index. A second alternative decomposition of the trend is
$$T_y = T_N \left( T_{Lq} T_{Pq} \right)^{1/2} \left( T_{Pp} T_{Lp} \right)^{1/2} = T_N T_{Fq} T_{Fp},$$

where $T_{Fq}$ and $T_{Fp}$ are Fisher indexes of unit volume and prices, respectively.

Finally, we reach the real world in which both the sets of goods marketed, $G_0$ and $G_t$, and the sets of households, $H_0$ and $H_t$, vary by period. Partition the set of goods marketed at the base period by

$$G_0 = G_0^Q \cup G$$

and partition the set of goods marketed at the comparison period by

$$G_t = G \cup G_t^E,$$

where $G_0^Q$ denotes exiting goods, $G_t^E$ denotes entering goods, and $G$ denotes both continuing and linkable goods. Continuing goods are marketed in both time periods, while exiting goods appear in the base period but not in the comparison, and entering goods appear in the comparison but not in the base. There is a gray area we have called linkable goods. These are goods for which there is no exact match between the predecessor good and the successor good, but for which economic theory nevertheless accepts the link for purposes of index number construction. BLS has some linkage rules or criteria that it currently uses in producing the monthly CPI.

The period-to-period trend in total dollar volume is now

$$T_y = \frac{Y_t}{Y_0} = \frac{\sum_{g \in G_t} N_t Q_{gt} p_{gt}}{\sum_{g \in G_0} N_0 Q_{g0} p_{g0}} = \frac{\sum_{g \in G} N_t Q_{gt} p_{gt}}{\sum_{g \in G} N_0 Q_{g0} p_{g0}} \cdot \frac{R_0}{R_t},$$

where

$$R_0 = \frac{\sum_{g \in G} N_0 Q_{g0} p_{g0}}{Y_0}$$
is the continuing and linkable volume as a proportion of the total base volume, and

$$R_t = \frac{\sum_{g \in G} N_t Q_{gt} p_{gt}}{Y_t}$$

is the continuing and linkable volume as a proportion of the total comparison volume. Let

$$T_R = \frac{R_t}{R_0}$$

be the trend in the proportion continuing or linkable. Then, building on the above, the trend in total dollar volume can be decomposed as

$$T_y = T_N T_{Pq} T_{Lp} T_R^{-1} = T_N T_{Lq} T_{Pp} T_R^{-1} = T_N T_{Fq} T_{Fp} T_R^{-1}.$$

Alternative price indexes based on continuing and linkable goods are given by $T_{Lp}$, $T_{Pp}$, and $T_{Fp}$, where $T_{Lp}$ is the Laspeyres price index, $T_{Pp}$ is the Paasche price index, and $T_{Fp}$ is the Fisher price index. All price indexes discussed here extend naturally to a time series of comparison periods.

In the first two formulations, we faced a simple world with only one good. In this world, the price index

$$T_p = \frac{p_t}{p_0}$$

is both plutocratic and democratic. In the third formulation, we faced a limited world in which a static set of goods is available in the marketplace. In this world, the plutocratic Laspeyres price index can be rewritten as

$$T_{Lp} = \sum_{g \in G} \frac{p_{gt}}{p_{g0}}\, S_{g0},$$

where the plutocratic weight applied to the simple trend in average price

$$S_{g0} = \frac{\sum_{i \in H_0} \sum_{j \in J_{gi0}} Q_{gij0}\, p_{gij0}}{\sum_{i \in H_0} \sum_{g' \in G} \sum_{j \in J_{g'i0}} Q_{g'ij0}\, p_{g'ij0}}$$
is simply market share expressed in dollars calculated across all HHs in the base period population with respect to the total market basket, $G$. Given the same assumptions, the democratic Laspeyres price index is defined by

$$I_{Lp} = \sum_{g \in G} \frac{p_{gt}}{p_{g0}}\, D_{g+0},$$

where the democratic weight applied to the simple trend in average price

$$D_{g+0} = \frac{1}{N_0} \sum_{i \in H_0} \frac{\sum_{j \in J_{gi0}} Q_{gij0}\, p_{gij0}}{\sum_{g' \in G} \sum_{j \in J_{g'i0}} Q_{g'ij0}\, p_{g'ij0}}$$

is the unweighted population mean across all HHs in the base period population of the HH-specific market shares. Thus, plutocratic weights are ratios of means and democratic weights are means of ratios. Similar weighting yields $I_{Pp}$, a democratic Paasche price index, and $I_{Fp}$, a democratic Fisher price index.

It is straightforward to establish the following relationship between plutocratic and democratic weights:

$$S_{g0} = D_{g+0} + \frac{\mathrm{Cov}\{D_{gi0}, Y_{+i0}\}}{Y_{++0}} = D_{g+0} \left( 1 + \frac{\mathrm{Cov}\{D_{gi0}, Y_{+i0}\}}{D_{g+0}\, Y_{++0}} \right),$$

where

$$D_{gi0} = \frac{\sum_{j \in J_{gi0}} Q_{gij0}\, p_{gij0}}{\sum_{g' \in G} \sum_{j \in J_{g'i0}} Q_{g'ij0}\, p_{g'ij0}}$$

is market share within the $i$th household,

$$Y_{+i0} = \sum_{g' \in G} \sum_{j \in J_{g'i0}} Q_{g'ij0}\, p_{g'ij0}$$

is total consumption volume by the $i$th HH in the base period, and

$$Y_{++0} = \frac{1}{N_0} \sum_{i \in H_0} Y_{+i0}$$

is the population mean per HH of total consumption volume. Thus, the plutocratic weight exceeds (is exceeded by) the democratic weight for any good that displays a positive (negative) correlation between total HH consumption and HH market share. The weights are equal in the event of zero correlation. For example,
let $g$ = automobiles. If there is positive correlation between total HH consumption and the share of HH consumption on automobiles, the plutocratic weight will exceed the democratic weight.

Across all goods, one can now conclude the following relationship between price indexes:

$$T_{Lp} = I_{Lp} + \sum_{g \in G} \frac{p_{gt}}{p_{g0}} \cdot \frac{\mathrm{Cov}\{D_{gi0}, Y_{+i0}\}}{Y_{++0}}.$$

The difference between the price indexes is determined by the pattern of covariances and price trends across goods. If goods for which the covariance is positive experience relatively large increases in average price, plutocratic price indexes may exceed their democratic counterparts. In general, however, the direction of the difference between the price indexes is far from certain for a given comparison period, $t$, let alone across periods. This matter is ripe for empirical investigation.

The democratic price indexes, and the relationship between plutocratic and democratic price indexes just discussed, extend naturally to the real-world situation described above where the domain of goods varies by period.

There are at least two approaches to estimating the price indexes: household (HH) survey data and store survey data. In this section, we explore the first approach; in the next section we look at the second.

Let $s_0$ and $s_t$ denote probability samples drawn from the universe of HHs at times 0 and $t$, respectively. At each time period, assume that BLS collects unit volume and prices for all buying occasions for all goods from each HH, $i$, in the sample. Comprehensive data of this kind are not currently collected by any BLS survey. It might be feasible, using scanning technology or other approaches, to design surveys to collect such data.

Let $W_{it}$ and $W_{i0}$ denote survey weights such that

$$\hat{Q}_{gt} = \sum_{i \in s_t} W_{it} \sum_{j \in J_{git}} Q_{gijt}$$

and

$$\hat{Q}_{g0} = \sum_{i \in s_0} W_{i0} \sum_{j \in J_{gi0}} Q_{gij0}$$

are essentially design-unbiased estimators of the totals $Q_{gt}$ and $Q_{g0}$, respectively.
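Similarly, define the estimated totals

$$\hat{N}_t = \sum_{i \in s_t} W_{it},$$

$$\hat{N}_0 = \sum_{i \in s_0} W_{i0},$$

$$\hat{Y}_{g+t} = \sum_{i \in s_t} W_{it} \sum_{j \in J_{git}} Q_{gijt}\, p_{gijt},$$

$$\hat{Y}_{g+0} = \sum_{i \in s_0} W_{i0} \sum_{j \in J_{gi0}} Q_{gij0}\, p_{gij0},$$

$$\hat{Y}_t = \sum_{g \in G_t} \hat{Y}_{g+t},$$

$$\hat{Y}_0 = \sum_{g \in G_0} \hat{Y}_{g+0},$$

$$\hat{Y}_t^G = \sum_{g \in G} \hat{Y}_{g+t},$$

and

$$\hat{Y}_0^G = \sum_{g \in G} \hat{Y}_{g+0}.$$

The latter two estimators reflect total dollar volume across all continuing and linkable goods. From these basic estimated totals, one can consistently estimate the ratios

$$\hat{R}_t = \frac{\hat{Y}_t^G}{\hat{Y}_t},$$

$$\hat{R}_0 = \frac{\hat{Y}_0^G}{\hat{Y}_0},$$

$$\hat{p}_{gt} = \frac{\hat{Y}_{g+t}}{\hat{Q}_{gt}},$$

$$\hat{p}_{g0} = \frac{\hat{Y}_{g+0}}{\hat{Q}_{g0}},$$

$$\hat{S}_{g0} = \frac{\hat{Y}_{g+0}}{\hat{Y}_0},$$

and

$$\hat{D}_{g+0} = \frac{\sum_{i \in s_0} W_{i0}\, D_{gi0}}{\hat{N}_0}.$$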
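The estimated totals and ratios above can be illustrated with a minimal Python sketch. The purchase records, household identifiers, weights, and the function name `estimate_base_period` are all hypothetical and invented for illustration; they are not BLS data or BLS procedures. The sketch assumes a static set of goods (so $G_0 = G$), that every sampled household appears in the records with positive total spending, and that each household carries a single survey weight.

```python
from collections import defaultdict

# Hypothetical base-period sample, one record per purchase occasion:
# (household id, survey weight W_i0, good g, units Q_gij0, unit price p_gij0).
purchases = [
    ("hh1", 100.0, "bread", 2, 3.00),
    ("hh1", 100.0, "milk", 1, 4.00),
    ("hh2", 150.0, "bread", 1, 2.00),
    ("hh2", 150.0, "milk", 9, 2.00),
]


def estimate_base_period(records):
    """Weighted estimates of N, Q_g, Y_g, p_g, S_g and D_g for one period."""
    weights = {}                                         # W_i0 per household
    hh_spend = defaultdict(lambda: defaultdict(float))   # dollars by HH and good
    q_hat = defaultdict(float)                           # estimated unit totals
    y_hat = defaultdict(float)                           # estimated dollar totals
    for hh, w, good, units, price in records:
        weights[hh] = w
        hh_spend[hh][good] += units * price
        q_hat[good] += w * units
        y_hat[good] += w * units * price

    n_hat = sum(weights.values())                        # estimated HH count
    y_total = sum(y_hat.values())                        # estimated total dollars
    p_hat = {g: y_hat[g] / q_hat[g] for g in y_hat}      # average price per unit
    s_hat = {g: y_hat[g] / y_total for g in y_hat}       # plutocratic weights
    # Democratic weights: weighted mean of within-household shares D_gi0.
    d_hat = {
        g: sum(
            w * hh_spend[hh].get(g, 0.0) / sum(hh_spend[hh].values())
            for hh, w in weights.items()
        ) / n_hat
        for g in y_hat
    }
    return {"N": n_hat, "Q": dict(q_hat), "Y": dict(y_hat),
            "p": p_hat, "S": s_hat, "D": d_hat}


est = estimate_base_period(purchases)
print(est["S"], est["D"])
```

In this invented sample the household with the larger budget spends relatively less on bread, so the estimated plutocratic weight for bread (0.225) falls below the estimated democratic weight (0.30).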
One can estimate the price indexes:

$$\hat{T}_{Lp} = \frac{\sum_{g \in G} \dfrac{\hat{Q}_{g0}}{\hat{N}_0}\, \hat{p}_{gt}}{\sum_{g \in G} \dfrac{\hat{Q}_{g0}}{\hat{N}_0}\, \hat{p}_{g0}} = \sum_{g \in G} \frac{\hat{p}_{gt}}{\hat{p}_{g0}}\, \hat{S}_{g0},$$

and

$$\hat{I}_{Lp} = \sum_{g \in G} \frac{\hat{p}_{gt}}{\hat{p}_{g0}}\, \hat{D}_{g+0}.$$

Estimators for the other trends and indexes ($\hat{T}_y$, $\hat{T}_N$, $\hat{T}_{Pp}$, $\hat{T}_{Fp}$, $\hat{T}_{Lq}$, $\hat{T}_{Pq}$, and $\hat{T}_{Fq}$) are defined in the obvious way.

Next, consider the possibility of estimating the price indexes exclusively using store-level data. Let $s_0$ and $s_t$ denote probability samples of stores, let the subscript $k$ index the store, and let $W_{k0}$ and $W_{kt}$ denote survey weights corresponding to the unbiased estimator of a population total at times 0 and $t$, respectively. It is easy to imagine estimators

$$\hat{Y}_{gt} = \sum_{k \in s_t} W_{kt}\, Y_{gkt}$$

$$\hat{Y}_{g0} = \sum_{k \in s_0} W_{k0}\, Y_{gk0}$$

of total dollar volume and

$$\hat{Q}_{gt} = \sum_{k \in s_t} W_{kt}\, Q_{gkt}$$

$$\hat{Q}_{g0} = \sum_{k \in s_0} W_{k0}\, Q_{gk0}$$

of total unit volume. These estimators obviously require data on prices and unit volume by good at the store level. Current BLS surveys do not collect such data, but surveys based upon scanning technology could produce these data, at least for a subset of goods in a subdomain of stores.

Given $\hat{Y}_{gt}$, $\hat{Y}_{g0}$, $\hat{Q}_{gt}$, $\hat{Q}_{g0}$, $\hat{N}_t$, and $\hat{N}_0$, it is possible to estimate the plutocratic price indexes. The question is whether the price indexes estimated on the basis of store data really estimate $T_{Lp}$, $T_{Pp}$, and $T_{Fp}$. One would anticipate some biases due to such factors as

• goods purchased from stores for business use, not for home consumption;
• shrinkage due to breakage and pilferage (this component of bias would depend on the mode of data collection); and
• coverage errors in the store sampling frame (i.e., missing stores in which consumers shop and including stores in which they do not).

Regrettably, it is not possible to estimate the democratic index $I_{Lp}$ exclusively from store-level data, at least not without additional assumptions. The democratic weights, $D_{g+0}$, are population means per HH, and HH data are necessary to estimate the means unbiasedly; such data are not usually available from stores (some store chains have adopted ID card programs that allow tracking of purchases by consumer).

It may be possible to approximate the democratic index from store-level data with periodic adjustment of the weights. This possibility exploits the relationship between plutocratic and democratic weights set forth above. From store-level data, one can construct an estimator of the plutocratic weights

$$\hat{S}_{g0} = \frac{\hat{Y}_{g0}}{\hat{Y}_0}.$$

Then we define the estimator of the democratic weights as

$$\hat{D}_{g+0} = \hat{S}_{g0} - \frac{c(D_{gi0}, Y_{+i0})}{\hat{Y}_{++0}},$$

where the adjustment factor is the second term on the right side, developed from an independent HH survey, such as the Consumer Expenditure Survey. In this factor, $c(D_{gi0}, Y_{+i0})$ is an estimator of the covariance between HH share and total HH consumption, and $\hat{Y}_{++0}$ is an estimator of mean total consumption per HH in the base period. It does not seem necessary to estimate the adjustment factor for each time period (month) the price index is produced. It may be acceptable to update the adjustment factor only infrequently.

Without question, one can imagine other hybrid schemes for estimating plutocratic or democratic price indexes. BLS's current method is an outstanding example, with quantity weights coming from one survey and monthly prices from another.
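The covariance adjustment described above can be checked numerically with a short Python sketch. The household spending figures, price trends, and function names below are hypothetical and invented for illustration. The sketch uses complete data for a tiny three-household population and the population covariance (divisor $N_0$), under which the adjusted plutocratic weight reproduces the democratic weight exactly; with store-based weights and a covariance estimated from a separate HH survey, the result would only be an approximation.

```python
# Hypothetical base-period spending by household and good (dollars).
spending = {
    "hh1": {"autos": 10.0, "food": 30.0},
    "hh2": {"autos": 40.0, "food": 40.0},
    "hh3": {"autos": 90.0, "food": 30.0},
}
# Hypothetical simple price trends p_gt / p_g0 for each good.
price_trend = {"autos": 1.10, "food": 1.02}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def weights_and_adjustment(spending):
    """Return plutocratic weights, democratic weights, and Cov/Y_bar per good."""
    goods = sorted({g for basket in spending.values() for g in basket})
    totals = {hh: sum(b.values()) for hh, b in spending.items()}  # Y_{+i0}
    y_bar = mean(totals.values())                                 # Y_{++0}
    grand_total = sum(totals.values())
    pluto, demo, adjust = {}, {}, {}
    for g in goods:
        # Within-household market shares D_{gi0}.
        shares = {hh: spending[hh].get(g, 0.0) / totals[hh] for hh in spending}
        pluto[g] = sum(spending[hh].get(g, 0.0) for hh in spending) / grand_total
        demo[g] = mean(shares.values())
        # Population covariance between HH share and HH total consumption.
        cov = mean((shares[hh] - demo[g]) * (totals[hh] - y_bar)
                   for hh in spending)
        adjust[g] = cov / y_bar
    return pluto, demo, adjust


pluto, demo, adjust = weights_and_adjustment(spending)
# Democratic weights recovered from plutocratic weights and the adjustment.
approx_demo = {g: pluto[g] - adjust[g] for g in pluto}

t_lp = sum(price_trend[g] * pluto[g] for g in pluto)   # plutocratic Laspeyres
i_lp = sum(price_trend[g] * demo[g] for g in demo)     # democratic Laspeyres
gap = sum(price_trend[g] * adjust[g] for g in adjust)  # covariance term
print(t_lp, i_lp, gap)
```

Here the automobile share rises with total household consumption, so the plutocratic weight for automobiles exceeds the democratic weight, and because automobiles also have the larger price trend in this invented example, the plutocratic index comes out above the democratic one.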