Appendix A

SIPP Data Quality

John L. Czajka

This appendix provides brief summaries of what is known about the quality of data from the Survey of Income and Program Participation (SIPP) in areas that are central to the survey's principal purposes or major uses. Topics include the following:

• Income
• Program Participation
• Income Receipt from Multiple Sources
• Wealth
• Health Insurance Coverage Transitions
• Attrition
• Representation of the Population Over Time
• Seam Bias
• Imputation
• Wave 1 Bias

These topics are discussed in the order that they are listed.

INCOME

In comparison with the Annual Social and Economic Supplement (ASEC) to the Current Population Survey (CPS), the official source of income and poverty statistics for the United States, SIPP captures nearly as much transfer income and substantially more self-employment income but
less wage and salary income and substantially less property income. These last two sources dominate earned and unearned income, respectively; as a result, SIPP underestimates total CPS income by 11 percent according to a recent comparison based on calendar year 2002 (Czajka and Denmead, 2008). This underestimation reflects a deterioration in the relative quality of SIPP income data since the survey's inception.

Early SIPP

Comparisons of income estimates from the first SIPP panel with the CPS and independent benchmarks were quite favorable to SIPP. In its estimate of aggregate income for calendar year 1984, SIPP captured 99.9 percent as much regular money income (that is, excluding lump sums) as the CPS (Vaughan, 1993). SIPP captured nearly 12 percent more transfer income, a major focus of the survey, and 3 percent more property income. Relative to independent estimates from program administrative data, SIPP captured 101 percent of aggregate Social Security income, 98 percent of Supplemental Security Income (SSI), 82 percent of Aid to Families with Dependent Children benefits, 96 percent of general assistance benefits, 77 percent of veterans' compensation or pension income, and 87 percent of unemployment compensation. SIPP estimates of aggregate pension dollars by type were between 95 and 103 percent of independent estimates. However, SIPP's estimate of total earnings, the largest component of total income by far, was 1.8 percentage points below the CPS. Furthermore, SIPP's shortfall on earned income was the net result of differential performance for wage and salary employment and self-employment. SIPP's estimate of self-employment income exceeded the CPS estimate by 45 percent, but for wage and salary income SIPP captured 5.3 percent fewer total dollars than the CPS.
Relative to an independent estimate from the national income and product accounts (NIPAs), the CPS captured 98 percent of total wage and salary income and SIPP captured 92.6 percent.

SIPP's success with self-employment income was the result of a nonconventional measurement approach that rejected the traditional definition of such income as revenue less expenses (or profit/loss). The SIPP approach grew out of efforts to translate the conventional approach to a subannual reference period, during which revenues and expenses might fluctuate widely, if they were known at all. SIPP staff sought a better measure of the income that business owners obtained from their businesses on a month-to-month basis. Rather than asking about profit and loss, SIPP asks respondents how much they withdrew from each business during each month of the reference period. One consequence of this approach is that
self-employment income cannot be negative in SIPP. In the CPS in the mid-1980s, roughly a fifth of the self-employed reported net losses from their businesses.

With respect to wage and salary income, SIPP's shortfall occurred despite the survey's finding 1.3 percent more workers than the CPS. The composition of the workers identified by SIPP may have contributed to the difference in aggregate dollars. Compared with the CPS, SIPP found 13 percent more workers who were employed less than full-time, full-year, but 7 percent fewer full-time, full-year workers. SIPP's success in finding part-time and part-year workers seemed to be a direct result of the survey's more frequent interviews and shorter reference periods relative to the annual interviews and annual reference period of the CPS. The smaller number of full-time, full-year workers in SIPP could also have reflected a more accurate reporting of hours and weeks worked. If that were the case, however, the lower aggregate income obtained in SIPP would have been due entirely to workers reporting lower income from their employment than workers responding to the CPS.

SIPP Income Over Time

Between 1984 and 1990, the SIPP estimate of total income slipped below 98 percent of the CPS aggregate according to analyses reported by Coder and Scoon-Rogers (1996) and Roemer (2000). This reduction was distributed across a large number of income sources, with no single source or small number of sources being primarily responsible for the change.

[Footnote: In the 2004 panel, SIPP started to ask separately for the amount of profit or loss over the 4-month reference period and to include this amount in monthly income totals. Net negative income from self-employment, not previously provided in the SIPP public-use files, will now be provided.]
[Footnote: To estimate aggregate annual income with SIPP, one must sum the monthly amounts reported by respondents who may not have been present (in the sample or even in the population) for the entire calendar year. There are different ways to do this, and they vary with respect to which months are counted for which persons and what weights are applied to them. Coder and Scoon-Rogers (1996) describe three methods and provide SIPP estimates for all three. None of the three methods is inherently more valid than the others; they just represent different ways of looking at the income data collected by SIPP, although two of the methods are more consistent with the way that SIPP collects income data. The third method, which is designed to resemble the CPS, requires an adjustment for missing months. The first method, which sums the monthly aggregates for all respondents present each month, makes the fullest use of the income data reported for a calendar year, but it yields slightly lower annual income estimates than the other two methods for 1990. Coder and Scoon-Rogers used the third method, and Vaughan (1993) and Roemer (2000) used the first method.]
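The first aggregation method described in the note above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the Census Bureau's procedure: the record layout (`person_id`, `month`, `income`, `weight`), the function name, and the sample values are all hypothetical, and the sketch assumes that each person-month record carries a weight appropriate to that month.

```python
from collections import defaultdict


def annual_aggregate_method1(person_months):
    """Sum weighted monthly income over everyone present in each month.

    person_months: iterable of dicts with hypothetical keys
      'month'  (1-12),
      'income' (dollars reported for that month),
      'weight' (assumed monthly weight for that person).
    A person who is absent in a month simply has no record for it.
    """
    monthly_totals = defaultdict(float)
    for rec in person_months:
        monthly_totals[rec["month"]] += rec["income"] * rec["weight"]
    # The annual aggregate is the sum of the monthly aggregates.
    return sum(monthly_totals.values()), dict(monthly_totals)


# Hypothetical records: person A is present all year, person B only in months 1-6.
records = (
    [{"person_id": "A", "month": m, "income": 2000.0, "weight": 1000.0}
     for m in range(1, 13)]
    + [{"person_id": "B", "month": m, "income": 1500.0, "weight": 1200.0}
       for m in range(1, 7)]
)
total, by_month = annual_aggregate_method1(records)
```

Because part-year respondents contribute only the months in which they are observed, no adjustment for missing months is needed here, in contrast to the CPS-like third method, which the note says requires one.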
Between 1990 and 1996, a period that saw the introduction of computer-assisted interviewing to both surveys and a major redesign of SIPP, Roemer's (2000) detailed analysis shows that SIPP fell further behind the CPS. Relative to independent NIPA benchmarks, SIPP estimates of total income dropped only slightly, from 87.1 percent in 1990 to 85.7 percent in 1996 (see Table A-1). More substantial reductions were recorded for property income (9 percentage points) and transfers (6 percentage points). Estimates of pension income increased by a percentage point, as did wages and salaries, but income from self-employment fell from 85 percent of the benchmark in 1990 to 69 percent by 1996. Given that the SIPP concept of self-employment income differs from the conventional concept, the decline should probably be attributed to a growing gap between the two concepts rather than anything in the survey. Finally, it may reflect favorably on some aspects of the SIPP redesign that the estimate of SIPP total income relative to the benchmark rose by a percentage point between 1995 and 1996 after having declined from 87.1 percent in 1990 to 84.8 percent in 1994 and 1995.

Over the same period, however, CPS total income increased by 3 percentage points relative to the benchmark (see Table A-2). CPS estimates of wages and salaries increased from 95.9 to 101.9 percent of the NIPA estimate; property income rose from 62.8 to 70.0 percent; and transfer income increased from 87.6 to 88.3 percent. Pensions declined, however, from 88.9 to 76.6 percent, and self-employment income dropped from 68.5 to 52.6 percent. The biggest increase occurred between the 1992 and 1993 reference years, which coincided with the introduction of computer-assisted interviewing in the CPS.
One element of the switch from a paper and pencil instrument was clearly related to the increased amount of income collected: the maximum amount of wage and salary income that could be reported was increased from $499,997 to $2,099,999. Roemer determined that this change alone added 2 percentage points to the CPS income total relative to the NIPA total.

[Footnote: There does not appear to have been a similar issue with respect to the collection of wage and salary income in SIPP, given that annual earnings are constructed from monthly earnings.]

The combined impact of the SIPP and CPS changes over this period was to reduce the ratio of SIPP to CPS total income to 92.5 percent (see Table A-3). Wages and salaries in SIPP dropped from 94.0 to 89.3 percent of the CPS estimate, although self-employment income increased by 7 percentage points relative to the CPS. SIPP property income fell substantially, going from 104.0 to 80.9 percent of the CPS. Even transfer income dropped from 105.0 to 97.7 percent of the CPS estimate, but this could be attributed primarily to Social Security income, which fell from 105.6 to 98.4 percent of the CPS estimate between 1993 and 1994. The shift between those two years was due to an increase in the amount reported in the CPS rather
than a decline in what was reported in the SIPP. However, later analyses of SIPP data matched to Social Security administrative records uncovered a tendency for respondents to report their Social Security payments net of their Medicare Part B premiums, which are deducted from their monthly benefit checks or automated payments (Huynh, Rupp, and Sears, 2001). In an apparent concession to respondents, the SIPP instrument was changed after the first wave of the 1993 panel to explicitly request that Social Security benefits be reported net of the Medicare premiums. The SIPP instrument was revised again for the 2004 panel to collect the amount of the Medicare premium as a separate quantity, which the Census Bureau could then add to the reported net payment to obtain the gross amount. Finally, SIPP pension income increased from 95.2 to 111.9 percent of the CPS estimate due to the decline in pension dollars collected in the CPS.

TABLE A-1 Survey Income as a Percentage of Independent (NIPA) Benchmarks: SIPP, 1990 to 1996

                                             Survey Reference Year
Income Source                                1990   1991   1992   1993   1994   1995   1996
Total Income                                 87.1   87.9   84.9   86.9   84.8   84.8   85.7
Earnings                                     89.6   90.9   86.9   87.4   86.4   86.7   88.4
  Wages and salaries                         90.1   90.5   88.1   89.0   88.5   88.3   91.0
  Self-employment                            85.1   94.6   77.7   76.2   70.5   75.0   69.1
Property Income                              65.3   60.2   60.5   77.0   60.1   58.9   56.6
  Interest                                   56.7   56.6   56.5   62.1   51.3   51.3   50.2
  Dividends                                  65.8   53.3   50.5   95.9   62.5   65.8   51.0
  Rent and royalties                        113.1   90.7   90.8   91.2   81.0   69.2   82.0
Transfers                                    92.0   90.5   89.0   89.4   87.8   87.0   86.3
  Social Security and Railroad Retirement    97.1   95.0   93.6   92.7   90.8   90.9   87.9
  Supplemental Security Income               83.1   88.6   84.9   82.9   86.0   86.2  101.4
  Family assistance                          75.6   76.4   69.9   89.1   87.3   85.8   76.3
  Other cash welfare                         81.9  100.9   81.3   96.6   79.2   95.9  114.0
  Unemployment compensation                  77.5   83.5   82.4   86.3   84.3   75.7   69.4
  Workers' compensation                      67.8   61.5   68.6   59.2   57.8   51.2   71.7
  Veterans' payments                         83.1   78.8   79.5   77.5   75.6   72.7   72.9
Pensions                                     84.6   87.9   84.9   86.9   84.8   84.8   85.7
  Private pensions                           91.8   85.7   86.7   96.9  103.8   99.5   98.1
  Federal employee pensions                  75.9   89.8   84.6   86.3   89.0   88.5   75.6
  Military retirement                        87.4   92.0   83.4   87.3   87.1   85.4  101.6
  State and local employee pensions          76.8   84.2   80.1   76.6   77.0   74.3   67.8

NOTE: Survey estimates are based on the Census Bureau's internal data, without top-coding; however, there are limits on the amount of income that can be reported, which vary by source.
SOURCE: Roemer (2000:Table 3b); data from the 1990, 1991, 1993, and 1996 SIPP panels.

TABLE A-2 Survey Income as a Percentage of Independent (NIPA) Benchmarks: March CPS, 1990 to 1996

                                             Survey Reference Year
Income Source                                1990   1991   1992   1993   1994   1995   1996
Total Income                                 89.3   89.4   88.0   91.7   92.9   92.2   92.6
Earnings                                     93.0   93.0   91.3   94.8   96.4   95.1   96.1
  Wages and salaries                         95.9   96.4   95.6   99.7  101.9  101.4  101.9
  Self-employment                            68.5   65.3   58.6   58.9   54.8   48.5   52.6
Property Income                              62.8   63.6   63.2   69.8   65.7   72.9   70.0
  Interest                                   67.1   68.3   67.6   79.7   72.3   83.9   83.8
  Dividends                                  40.9   45.7   49.2   54.3   54.6   62.6   59.4
  Rent and royalties                         85.0   74.1   69.8   65.2   64.8   58.7   58.6
Transfers                                    87.6   86.8   83.6   85.6   89.5   89.2   88.3
  Social Security and Railroad Retirement    90.6   88.6   87.1   87.8   92.3   92.0   91.7
  Supplemental Security Income               78.9   84.6   75.5   84.2   78.0   77.1   84.2
  Family assistance                          74.4   74.4   72.2   76.4   73.1   70.5   67.7
  Other cash welfare                         85.6   77.5   81.6  101.3  105.2   95.8   80.5
  Unemployment compensation                  79.9   82.5   72.8   77.6   90.0   91.3   81.6
  Workers' compensation                      89.5   89.1   82.5   77.0   77.7   69.3   62.7
  Veterans' payments                         73.9   82.9   77.7   85.5   84.7   94.9   89.6
Pensions                                     88.9   85.5   83.1   83.6   83.1   78.2   76.6
  Private pensions                           98.3   96.3   96.4   98.8  102.7   93.9   93.1
  Federal employee pensions                  82.7   82.6   84.5   82.7   80.9   77.9   80.8
  Military retirement                        85.6   84.6   74.3   71.7   76.4   70.6   58.2
  State and local employee pensions          78.7   68.5   64.2   66.7   59.6   59.0   57.3

SOURCE: Roemer (2000:Table 2b); data from the 1991 through 1997 ASEC supplements to the CPS.

TABLE A-3 SIPP Aggregate Income as a Percentage of March CPS Aggregate Income, 1990 to 1996

                                             Survey Reference Year
Income Source                                1990   1991   1992   1993   1994   1995   1996
Total Income                                 97.5   98.3   96.5   94.8   91.3   92.0   92.5
Earnings                                     96.3   97.7   95.2   92.2   89.6   91.2   92.0
  Wages and salaries                         94.0   93.9   92.2   89.3   86.8   87.1   89.3
  Self-employment                           124.2  144.9  132.6  129.4  128.6  154.6  131.4
Property Income                             104.0   94.7   95.7  110.3   91.5   80.8   80.9
  Interest                                   84.5   82.9   83.6   77.9   71.0   61.1   59.9
  Dividends                                 160.9  116.6  102.6  176.6  114.5  105.1   85.9
  Rent and royalties                        133.1  122.4  130.1  139.9  125.0  117.9  139.9
Transfers                                   105.0  104.3  106.5  104.4   98.1   97.5   97.7
  Social Security and Railroad Retirement   107.2  107.2  107.5  105.6   98.4   98.8   95.9
  Supplemental Security Income              105.3  104.7  112.5   98.5  110.3  111.8  120.4
  Family assistance                         101.6  102.7   96.8  116.6  119.4  121.7  112.7
  Other cash welfare                         95.7  130.2   99.6   95.4   75.3  100.1  141.6
  Unemployment compensation                  97.0  101.2  113.2  111.2   93.7   82.9   85.0
  Workers' compensation                      75.8   69.0   83.2   76.9   74.4   73.9  114.4
  Veterans' payments                        112.4   95.1  102.3   90.6   89.3   76.6   81.4
Pensions                                     95.2  102.8  102.2  103.9  102.0  108.4  111.9
  Private pensions                           93.4   89.0   89.9   98.1  101.1  106.0  105.4
  Federal employee pensions                  91.8  108.7  100.1  104.4  110.0  113.6   93.6
  Military retirement                       102.1  108.7  112.2  121.8  114.0  121.0  174.6
  State and local employee pensions          97.6  122.9  124.8  114.8  129.2  125.9  118.3

SOURCE: Tables A-1 and A-2.

Quality of Wage and Salary Data

To gain a better understanding of the biggest source of the discrepancy between SIPP and CPS total income, Roemer (2002) compared both SIPP and CPS annual wages and salaries to the wages and salaries reported in the Social Security Administration's Detailed Earnings Records (DER) for 1990, 1993, and 1996. Unlike other Social Security wage records, the DER is not capped at the income subject to the Social Security tax, and unlike tax records it includes deferred compensation. Roemer's comparisons used survey records that had been matched to the DER based on the Social Security numbers reported by SIPP and CPS respondents, allowing an assessment of discrepancies between the survey and administrative records at the micro level. Key findings from Roemer's analysis include

• Distributions of DER wages for the two surveys were very similar, implying that differential sample selection bias was not a factor in SIPP's lower wage and salary income.
• Compared with the distribution of wages in the DER, SIPP had too many individuals with amounts below $30,000 and too few with amounts above $35,000; above $175,000, SIPP had only one-third to one-half as many earners as the DER.
• For 1996, the CPS had too few individuals with wages below $10,000, too many between $15,000 and $100,000, slightly too few between $100,000 and $200,000, and too many above $300,000.

[Footnote: The increased reporting of Social Security benefits in the CPS lagged by a year the introduction of computer-assisted interviewing; nevertheless, the sudden stepped-up reporting suggests an instrument change.]
• For sample members with both survey and DER wages, 57 percent of SIPP respondents and 49 percent of CPS respondents reported wages below their DER amounts; 3 percent of SIPP and 8 percent of CPS respondents reported wages equal to their DER amounts; and 40 percent of SIPP and 43 percent of CPS respondents reported wages above their DER amounts.
• The CPS appears to be superior to SIPP in capturing wages from the underground economy; in 1996, 3.6 percent of CPS wages and 1.8 percent of SIPP wages were reported by persons with no DER wages and no indication of self-employment; for the CPS this fraction grew from 2.5 percent in 1993.
• The CPS also appears to pick up more self-employment income misclassified as wages; in 1996, 3.0 percent of CPS wages and 1.5 percent of SIPP wages were reported by persons with no DER wages but with DER self-employment income; for the CPS this fraction grew from 2.2 percent in 1993.
• Both types of non-DER wages (underground wages and misclassified self-employment) occur at all income levels in both surveys, but the CPS has far more persons than SIPP with non-DER wages at upper income levels.

Thus, most of the difference between the SIPP and CPS wage and salary aggregates appears to be due to underreporting of legitimate wage income in SIPP, with misclassified self-employment income and the CPS's greater reporting of underground income accounting for less than a third of the gap between the two surveys.

Speculation about possible reasons for SIPP's underreporting of wage and salary income has focused on the possibility that the short reference period may lead SIPP respondents to report their take-home rather than gross pay despite the specificity of the questions. The short reference period, which is clearly helpful in capturing earnings from people with irregular employment, may also contribute to omissions of earned income.
Roemer (2002) notes that when SIPP asked annual income questions at the end of each year, Coder (1988) found that the 12 months of reported wages for respondents with a single employer totaled nearly 7 percent less than what the same respondents reported in the annual round-up.

Income by Quintile

For most SIPP users, the quality of the income data in the lower end of the income distribution is far more important than its quality across the entire distribution. Furthermore, estimates of aggregate income for many sources are affected disproportionately by the amount of income captured
in the upper tail of the distribution, in which the income holdings for those sources are concentrated. SIPP's superior capture of transfer income could reflect the survey's more complete capture of income in the lower end of the distribution generally.

To show how SIPP and CPS income estimates compare in different parts of the income distribution, Table A-4 presents estimates of aggregate income, by source, for quintiles of the population based on total family income, prepared for the panel. Estimates are presented for 3 calendar years: 1993, 1997, and 2002. The SIPP estimates are from the 1992, 1996, and 2001 panels and, for consistency, are derived from the second year of data in each panel. The CPS estimates are from the 1994, 1998, and 2003 supplements. The CPS data for all 3 years were collected with a computer-assisted instrument, whereas the SIPP data for 1993 were collected with a paper and pencil instrument. SIPP data for 2002 were the latest full calendar year available at the time the estimates in Table A-4 were prepared. By including comparative estimates for 2002, one can determine if the CPS gains during the first half of the 1990s persisted or whether the second new panel following the SIPP redesign was able to reverse the earlier trend.

Unlike Roemer's estimates in Table A-1, the estimates in Table A-4 are based on public-use microdata files rather than the Census Bureau's internal files, and the 1993 SIPP estimates are from the second year of the 1992 panel rather than the first year of the 1993 panel. Also, the SIPP estimates in Table A-4 were calculated with the same method of aggregation used by Coder and Scoon-Rogers (1996), which differs from the method used by Roemer (2000) and Vaughan (1993). Differences between the percentages in the total column for 1993 and those reported in Table A-3 for comparable sources are due to any or all of these factors.
Nevertheless, while there are differences by source, our estimate of SIPP aggregate income as a percentage of CPS aggregate income, at 94.5 percent, compares closely to Roemer's estimate of 94.8 percent.

The question of what happened to the ratio of SIPP to CPS income between 1997 and 2002 is answered by the estimates in the total column. While the ratio of SIPP to CPS total income declined from 94.5 to 89.0 percent between 1993 and 1997, the ratio rose slightly, to 89.4 percent, between 1997 and 2002. SIPP wages and salaries declined from 84.6 to 82.4 percent of the CPS aggregate, but this was offset by small improvements in every other source. On the whole, then, the relationships between income aggregates in the two surveys appear to have stabilized following the movement that occurred with the introduction of computer-assisted interviewing in the CPS and the redesign of SIPP.

[Footnote: The bottom or first quintile contains the 20 percent of persons with the lowest family incomes. The top or fifth quintile contains the 20 percent of persons with the highest family incomes.]

[Footnote: The 1996 panel started 2 months late and did not collect data for all 12 months of 1996 for two of the four rotation groups.]

If one excludes the top quintile in order to eliminate the impact of differential topcoding as well as the CPS's seemingly more effective capture of very high incomes, one finds that the ratio of SIPP to CPS aggregate income increases by 4 to 6 percentage points in every year. SIPP wages and salaries and property income remain well below their CPS counterparts, but their shares of CPS income increase in all years. SIPP self-employment income remains well above the corresponding CPS amount, but the margin declines. For all other sources, the differences in their shares change little or in an inconsistent way when the top income quintile is excluded.

Turning to the results by income quintile, one finds, first, that the ratio of SIPP to CPS total income declines progressively from the bottom to the top quintile and does so in every year. Second, in the bottom quintile but no other quintile, the SIPP estimate of aggregate income exceeds the CPS aggregate in every year. Third, also in the bottom quintile alone, the ratio of SIPP to CPS income declines by as much between 1997 and 2002 as it did between 1993 and 1997, dropping from 119.5 to 112.2 percent and then to 105.7 percent of the CPS aggregate. In other words, over a period of only 9 years, SIPP went from capturing 20 percent more income than the CPS in the bottom quintile to capturing only 6 percent more income than the CPS. The 20 percent more income in 1993 included 25 percent more wages and salaries, 157 percent more self-employment income, 22 percent more property income, 7 percent more Social Security and Railroad Retirement income, 12 percent more Supplemental Security Income (SSI), an equal amount of welfare income, 24 percent more income from other transfers, and 44 percent more pension income.
By 2002, SIPP was capturing only 9 percent more wages and salaries, 129 percent more self-employment income, 5 percent more property income, 12 percent less Social Security and Railroad Retirement income, 27 percent more SSI (an increase), 20 percent more welfare income (also an increase), 31 percent less income from other transfers, and 98 percent more pension income.

In the second income quintile, the SIPP captured 1.5 percent more aggregate income than the CPS in 1993, but this dropped to a 4 percent deficit by 1997. Unlike the first quintile, however, the SIPP held ground after that, gaining back a percentage point by 2002. The SIPP estimate of wages and salaries dropped from 100 percent of the CPS amount to 92 percent in 1997 but rose to 94 percent in 2002. Property income fell from 112 to 90 percent of the CPS amount, while Social Security and Railroad Retirement fell from 97 to 90 percent. Other transfers dropped from 90 to 59 percent of the CPS amount. Sizable improvements relative to the CPS were recorded for self-employment, SSI, welfare, and pensions, however.
TABLE A-4 SIPP Aggregate Income as a Percentage of CPS Aggregate Income by Source of Income and Family Income Quintile: 1993, 1997, and 2002

                                            Quintile of Family Income
Income Source                               Bottom   Second    Third   Fourth      Top    Total  Bottom Four

1993 Calendar Year
Total Income                                 119.5    101.5     95.9     93.0     88.9     94.5         98.2
Wages and salaries                           124.8    100.1     92.6     88.7     81.8     88.9         94.0
Self-employment                              257.4    133.3    128.8    139.3    178.0    160.3        139.0
Property income                              121.7    112.4    101.5     99.0     67.3     83.5        104.2
Social Security and Railroad Retirement      107.2     97.4     98.6    105.7    107.4    102.2        101.8
Supplemental Security Income                 111.6     77.8     75.6     89.5     91.5     98.2         98.5
Welfare                                       99.0     72.4    114.6    180.2    141.0     97.6         97.0
Other transfers                              123.5     89.5     84.6     81.2     72.9     87.9         92.3
Pensions                                     143.6    116.9    110.6    115.9    107.3    113.9        116.3

1997 Calendar Year
Total Income                                 112.2     96.0     93.8     91.8     80.7     89.0         95.3
Wages and salaries                           111.7     91.7     89.7     87.9     77.2     84.6         90.6
Self-employment                              214.7    152.2    153.0    160.5    169.3    165.6        159.5
Property income                              112.6     90.6     74.7     64.2     32.4     49.0         75.9
Social Security and Railroad Retirement      100.0     89.3     97.4     99.1     75.5     93.6         95.7
Supplemental Security Income                 122.0    126.1    121.7    166.7    149.0    126.5        125.7
Welfare                                      122.4    143.5    267.2    420.1    509.9    145.0        140.4
Other transfers                               71.6     58.7     57.7     52.5     41.4     54.7         59.3
Pensions                                     229.0    150.5    136.9    142.9    102.6    134.5        148.6

2002 Calendar Year
Total Income                                 105.7     97.3     92.8     90.7     83.0     89.4         94.2
Wages and salaries                           109.4     93.8     86.0     85.3     75.2     82.4         88.6
Self-employment                              219.5    162.4    160.5    158.1    208.6    188.3        163.2
Property income                              104.7     90.0     70.7     56.0     38.6     52.3         70.5
Social Security and Railroad Retirement       87.6     90.4    104.4    115.9     94.5     95.4         95.4
Supplemental Security Income                 126.6    121.3    139.4    152.6    222.9    131.5        128.5
Welfare                                      119.5    109.4    176.4    246.6  1,415.5    146.7        128.4
Other transfers                               70.9     58.8     64.2     57.4     42.8     58.0         62.4
Pensions                                     198.4    140.2    155.7    153.5    128.3    147.0        153.9

SOURCE: 1992, 1996, and 2001 SIPP panels and 1994, 1998, and 2003 CPS ASEC supplements.
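The Total and Bottom Four columns of Table A-4 are ratios of aggregates, so they are presumably obtained by summing dollars across quintiles before dividing rather than by averaging the quintile percentages. The sketch below illustrates that calculation and shows why dropping the top quintile can raise the overall ratio. The function name and the dollar amounts are hypothetical and are not taken from either survey.

```python
def ratios_by_quintile(sipp, cps):
    """Return SIPP aggregates as a percentage of CPS aggregates.

    sipp, cps: lists of five aggregate income amounts, ordered from the
    bottom to the top quintile of family income (hypothetical inputs).
    """
    def pct(numerator, denominator):
        return round(100.0 * numerator / denominator, 1)

    per_quintile = [pct(s, c) for s, c in zip(sipp, cps)]
    total = pct(sum(sipp), sum(cps))
    # "Bottom Four" excludes the top quintile from both surveys.
    bottom_four = pct(sum(sipp[:4]), sum(cps[:4]))
    return per_quintile, total, bottom_four


# Hypothetical aggregates in billions of dollars, bottom to top quintile.
sipp = [120.0, 300.0, 480.0, 700.0, 1600.0]
cps = [100.0, 300.0, 500.0, 750.0, 2000.0]
per_quintile, total, bottom_four = ratios_by_quintile(sipp, cps)
```

In this made-up example the top quintile holds most of the dollars and has the lowest ratio, so it pulls the total well below the ratio for the bottom four quintiles, the same direction of effect that the text reports for Table A-4.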
These basic patterns were repeated in the third and fourth quintiles, for which the SIPP estimates of aggregate income fell by 2 to 3 percentage points relative to the CPS, ending up at 93 percent in the third quintile and 91 percent in the fourth quintile. There was one notable exception to the patterns by income source; the capture of Social Security and Railroad Retirement in SIPP improved between 1997 and 2002 to the point at which SIPP captured relatively more of such income in comparison to the CPS in 2002 than in 1993. In both cases, the SIPP aggregates in 2002 exceeded the CPS aggregates. Elsewhere, SIPP fell further behind the CPS in wages and salaries, property income, and other transfers but improved in self-employment, SSI, welfare, and pensions. In both quintiles, SIPP captured 50 percent more pension income in 2002 than did the CPS.

In the top quintile, the SIPP estimate of aggregate Social Security and Railroad Retirement dropped relative to the CPS, as it did in the first and second quintiles, and the relative gain in the capture of pension income was more modest than in the lower quintiles. Otherwise, the different sources improved or declined in the SIPP, just as they did in the lower quintiles. Over all sources, the SIPP estimate of total income in the top quintile dropped from 89 percent of the CPS estimate in 1993 to 81 percent in 1997 but then rose to 83 percent in 2002.

On balance, then, while there was gradual erosion in the amount of income collected by SIPP relative to the CPS between 1990 and 1996, which was due largely to changes in the CPS, there was a much more substantial reduction in the relative amount of income that SIPP collected from the bottom quintile of the family income distribution. This change is significant because it detracts from what has been SIPP's greatest strength, historically, in the collection of income data.
PROGRAM PARTICIPATION

SIPP was designed to do a better job of capturing program participation and benefit amounts than other surveys, and from the beginning it has generally done so. SIPP still falls short of administrative totals for most programs, but in some cases SIPP estimates exceed the program totals.

A recent review of survey reporting of program participation and benefit receipt relative to administrative totals concludes that SIPP "typically has the highest reporting rate for government transfers, followed by the CPS" and the Panel Study of Income Dynamics (PSID), but that some programs, specifically unemployment insurance and workers' compensation, are reported more fully in the CPS than in SIPP (Meyer, Mok, and Sullivan, 2009). The study also finds the highest overall dollar reporting in SIPP and the American Community Survey (ACS). These are followed, in turn, by the CPS, PSID, and the Consumer Expenditure (CE) Survey. One other
conclusion of note is that while the reporting of most programs in the PSID, CPS, and CE experienced a significant decline over time, the decline was less pronounced in SIPP and the ACS actually showed improvement.

Table A-5 presents shares of administrative totals of average monthly participants and aggregate annual benefits estimated by SIPP and the CPS in 1987, 1996, and 2005 for the following programs:

• Food Stamp Program (FSP)
• Aid to Families with Dependent Children (AFDC)/Temporary Assistance for Needy Families (TANF)
• Old-Age and Survivors Insurance (OASI)
• Social Security Disability Insurance (SSDI)
• Supplemental Security Income (SSI)
• National School Lunch Program (NSLP)
• Special Supplemental Nutrition Program for Women, Infants and Children (WIC)

TABLE A-5 SIPP and CPS Estimates of Program Participants and Aggregate Benefits as a Percentage of Administrative Benchmarks, Selected Years

                  1987            1996            2005
Program         SIPP     CPS    SIPP     CPS    SIPP     CPS

Survey Estimate of Average Monthly Participants as a Percentage of Administrative Benchmark
FSP             88.1    73.2    84.2    66.3    82.9    56.5
AFDC/TANF       76.4    80.5    79.5    67.0    80.9    63.0
OASI            94.5    88.0    94.2    84.3    97.1    82.9
SSDI           101.3   101.6    90.6    89.7    93.8    78.8
SSI             90.3    72.7    94.4    65.8   102.7    58.0
NSLP           113.7    54.4   111.6    64.2   112.5    48.4
WIC: 66.0, 56.1, 58.2, 60.1 (four values only; their column positions are not recoverable from this text)

Survey Estimate of Aggregate Benefits as a Percentage of Administrative Benchmark
FSP             85.9    74.2    79.0    63.1    76.4    54.6
AFDC/TANF       73.0    74.4    77.0    66.8    62.2    48.7
OASI            95.0    89.0    88.7    90.5    97.4    89.7
SSDI            95.4   100.1    79.3    91.9    84.8    81.8
SSI             89.6    76.7    93.4    77.6   110.0    79.4

NOTE: FSP = Food Stamp Program; AFDC/TANF = Aid to Families with Dependent Children/Temporary Assistance for Needy Families; OASI = Old-Age and Survivors Insurance (Social Security); SSDI = Social Security Disability Insurance; SSI = Supplemental Security Income; NSLP = National School Lunch Program; WIC = Special Supplemental Nutrition Program for Women, Infants and Children.
SOURCE: Adapted from Meyer, Mok, and Sullivan (2009).

For every program but WIC, SIPP captures a substantially higher percentage of average monthly participants than the CPS in 2005, but was no better than the CPS for AFDC in 1987 and SSDI in both 1987 and 1996. Benefits, which are reported about as well as participants for some programs but less well for others, tell a similar story. In 2005, SIPP captured a higher share of aggregate annual benefits than the CPS for the FSP, AFDC/TANF, OASI, and SSI, but was only marginally better for SSDI. In 1987, SIPP was on a par with the CPS for AFDC/TANF and SSDI, which is consistent with the surveys' relative estimates of participants in these two programs in that year.

Whether because of poor recall or because respondents sometimes answer on the basis of their current situation, CPS estimates of persons who ever participated in a program sometimes line up with SIPP estimates of average monthly participants. Medicaid provides a good example. SIPP's estimate of Medicaid participants in a given month is comparable to the CPS's estimate of persons who were ever covered by Medicaid during the previous year. For instance, SIPP estimated that 11.8 percent of the population was covered by Medicaid in December 2002, whereas the CPS estimated that 11.6 percent of the population was ever enrolled in Medicaid in the 2002 calendar year (Czajka and Denmead, 2008). SIPP's estimate of persons ever enrolled in Medicaid was 17.1 percent of the population, substantially larger than the CPS estimate.

INCOME RECEIPT FROM MULTIPLE SOURCES

One of the early uses of SIPP data was to support estimates of multiple benefit receipt by participants in transfer programs and to help determine how often the receipt of benefits from more than one program was serial versus simultaneous (see, for example, Doyle and Long, 1988).
High-quality monthly data were critical to such research, and SIPP has remained unique in its ability to support the production of such estimates (see, more recently, Reese, 2007).

A related area of research involves determining the extent to which the beneficiaries of a given program are dependent on that program for their economic support. For example, the fraction of total income that retired persons derive from Social Security benefits is highly relevant to policy debates involving how to ensure the continued financial solvency of the Social Security system. Estimates from the CPS for calendar year 2001 show that 22 percent of retired workers relied on their Social Security payments for 100 percent of their income (Czajka, Mabli, and Cody, 2008). The corresponding figure from SIPP was only 8 percent. Similarly, the CPS finds 18 percent of retired workers receiving less than 25 percent of their income from Social Security payments, and SIPP finds nearly twice that share, or 30 percent. Retired workers in SIPP were more likely than their counterparts in the CPS to report receiving each of six additional sources: wages, self-employment, property income, pensions, SSI, and welfare. These differences highlight the impact of SIPP's short reference period and the survey's focus on income and benefit recipiency.

WEALTH

One of the key limitations of the CPS for modeling program eligibility is the absence of data on assets and liabilities. Many programs include in their eligibility criteria some limitations on asset holdings, and some programs have very explicit provisions about particular types of assets, such as vehicles. From its inception, SIPP has collected data on asset holdings and liabilities, and its focus on the types of assets and debts held by low-income families has been an important feature of the SIPP wealth data.

The standard against which all survey data on wealth are measured is the Survey of Consumer Finances (SCF), conducted by the Federal Reserve Board. In addition to hundreds of questions on detailed components of assets and liabilities, the SCF includes a high-income subsample drawn from tax records. Wealth is even more heavily concentrated than income, with about a third of all wealth held by the wealthiest 1 percent of families and two-thirds held by the wealthiest 10 percent, leaving one-third for the remaining 90 percent of the population (Kennickell, 2006). Accurate measurement of aggregate wealth holdings requires a sample design that reflects this distribution.
[Footnote] Fisher (2007) reports that for 1996 the survey estimates of full reliance on Social Security benefits were 17.9 percent for the CPS and 8.5 percent for SIPP, so the difference between the two surveys appears to have grown between 1996 and 2001. The 1996 difference was reduced only slightly when matched administrative records were substituted for survey data and used to assign beneficiary status.

[Footnote] Earlier, we documented that underreporting of SSI and AFDC/TANF is greater in the CPS than in SIPP. Matches to program administrative records show that while Social Security beneficiaries in both surveys nearly always report receiving benefits, SSI beneficiaries are significantly less likely to report receiving SSI in the CPS than in SIPP (Koenig, 2003).

Wolff (1999) compared SIPP, the SCF, and the PSID with respect to a number of measures of the size and distribution of wealth over the mid-1980s through the mid-1990s. His findings suggest that, for the lowest two income quintiles, SIPP did as well as the SCF in capturing asset holdings. Furthermore, SIPP's comparative performance did not deteriorate a great deal through the next two quintiles, that is, through the lower 80 percent of the income distribution. SIPP also did particularly well in capturing the major types of wealth held by the middle class, such as homes, vehicles, and savings bonds, but it did not do as well in capturing the types of assets held by the wealthiest families.

A comparison of the three surveys' estimates of wealth in late 1998 and early 1999 showed that SIPP's estimate of aggregate net worth, defined as assets minus liabilities, of $14.4 trillion was just under half of the SCF estimate of $29.1 trillion and 60 percent of the PSID estimate (Czajka, Jacobson, and Cody, 2003). The SIPP estimate of median net worth, $48,000, was two-thirds of the SCF median of $71,800 and 74 percent of the PSID median.

SIPP is much more effective in capturing liabilities than assets. SIPP's estimate of aggregate assets was 55 percent of the SCF estimate of $34.1 trillion, but its estimate of aggregate liabilities was 90 percent of the SCF estimate of $5.0 trillion. SIPP's estimate of median assets was 83 percent of the SCF median of $116,500, while its estimate of median liabilities was 97 percent of the SCF median of $11,900. SIPP's weaker performance in measuring net worth than either assets or liabilities reflects the imbalance in the survey's estimates of these two components. By estimating the negative side of the balance sheet more fully than the positive side, SIPP adds to its underestimate of net worth.

As a proportion of the corresponding SCF estimate, SIPP's estimates of aggregate assets exhibit wide variation by type. SIPP's estimate of the value of the home was 91 percent of the SCF estimate, but SIPP captured only 41 percent of the SCF valuation of other real estate. SIPP captured 76 percent of the SCF estimate of motor vehicles but only 17 percent of SCF business equity.
Among financial assets, SIPP's estimate of 401(k) and thrift accounts was 99 percent of the SCF estimate, but the next best component, other financial assets, was only 71 percent of the SCF estimate. For assets held at financial institutions, the SIPP estimate was 63 percent of the SCF estimate. For stocks and mutual funds, the largest financial asset, the SIPP estimate was only 59 percent of the SCF estimate, whereas the SIPP estimate of IRA and Keogh accounts was 55 percent of the SCF estimate. Finally, the SIPP estimate of other interest earning assets was only 33 percent of the SCF amount.

In contrast to Wolff's findings from the mid-1980s and early 1990s that SIPP matched the SCF in capturing the asset holdings of the bottom two income quintiles, the estimates from the 1996 panel showed that SIPP did not fare appreciably better with the assets of low-income families than with higher income families. Between the early and late 1990s, SIPP families with negative or zero net worth grew from 13 to 17 percent of the population while SCF families with no net worth remained at 13 percent.
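The aggregate asset and liability ratios reported above can be combined to see why SIPP's net worth ratio falls below its asset ratio. The following is a minimal sketch that uses only the rounded figures quoted in this section; the variable and function names are illustrative, and because the inputs are rounded the implied SIPP net worth differs slightly from the published $14.4 trillion.

```python
# Aggregate figures quoted in the text (trillions of dollars, late 1998 / early 1999).
SCF_ASSETS = 34.1
SCF_LIABILITIES = 5.0
SIPP_ASSET_SHARE = 0.55      # SIPP aggregate assets as a share of the SCF estimate
SIPP_LIABILITY_SHARE = 0.90  # SIPP aggregate liabilities as a share of the SCF estimate


def net_worth(assets: float, liabilities: float) -> float:
    """Net worth as defined in the text: assets minus liabilities."""
    return assets - liabilities


scf_net_worth = net_worth(SCF_ASSETS, SCF_LIABILITIES)

# Implied SIPP aggregates, reconstructed from the rounded shares.
sipp_assets = SIPP_ASSET_SHARE * SCF_ASSETS
sipp_liabilities = SIPP_LIABILITY_SHARE * SCF_LIABILITIES
sipp_net_worth = net_worth(sipp_assets, sipp_liabilities)

# Capturing liabilities more fully than assets pushes the net worth
# ratio below the asset ratio.
net_worth_ratio = sipp_net_worth / scf_net_worth

print(f"SCF net worth:           {scf_net_worth:.1f} trillion")
print(f"Implied SIPP net worth:  {sipp_net_worth:.2f} trillion")
print(f"SIPP/SCF net worth ratio: {net_worth_ratio:.3f}")
```

The point of the sketch is the ordering of the ratios: whenever a survey captures a larger share of liabilities than of assets, its share of net worth must be smaller than its share of assets.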
A more telling sign of the reduction in the quality of the SIPP wealth data is that the correlation between assets and liabilities dropped from .49 in the 1992 and 1993 SIPP panels to between .06 and .19 in the 1996 panel. Over the same period the correlation in the SCF dropped only moderately, from .50 to .40.

HEALTH INSURANCE COVERAGE TRANSITIONS

Average monthly estimates of health insurance coverage from SIPP compare closely to estimates of health insurance coverage obtained in the National Health Interview Survey (NHIS), which measures coverage at the time of the interview and therefore is free of recall bias (Czajka and Denmead, 2008; Davern et al., 2007). Both sets of estimates are also relatively close to the estimate of persons with health insurance coverage from the CPS. The CPS estimate is intended to measure any amount of coverage over the prior calendar year and therefore ought to be higher than both the SIPP and NHIS estimates, but it clearly suffers from some combination of underreporting and a tendency for respondents to answer in terms of their current coverage. To its strong cross-sectional estimates of health insurance coverage, SIPP adds a longitudinal dimension, which enables the survey to provide information on changes in coverage over time and what coverage people ever had over an extended period of time.

One of the objectives reflected in the design of SIPP is to obtain more reliable data on short-term dynamics by interviewing respondents three times a year and asking them to recall events as recently as the prior month and no more than 4 months earlier. While the potential benefits of a shortened reference period are obvious, frequent interviews also create more opportunities for erroneous reports. For example, respondents who are uncertain about the type of health insurance coverage they have may give different responses in different waves.
In addition, a sample member may self-report in one wave but have his or her data reported by a proxy respondent in the next wave. A misreported status in one wave creates two false transitions, compounding the effects of a single error.

Lacking a good benchmark for assessing the reliability of reported transitions, researchers have produced very little evidence regarding the quality of transitions in health insurance coverage in the SIPP. It is true, for example, that SIPP finds a higher proportion of the population who were ever without health insurance coverage during a 12-month period than does the NHIS, which relies on a retrospective question (Czajka and Denmead, 2008). Does this suggest that SIPP may be overestimating periods without coverage, or does the difference between the two surveys simply reflect SIPP's better design for estimating incidence over time? The shorter recall required in SIPP suggests the latter, but SIPP also finds a higher
proportion of people ever uninsured during a year than the longitudinal Medical Expenditure Panel Survey (MEPS), which, like SIPP, conducts multiple interviews over a year. MEPS differs from SIPP in having a variable reference period.

As yet, there is no definitive answer to the question of which survey is more correct. However, the high rate at which SIPP sample members, particularly children, transition between the uninsured and insured does raise questions about the SIPP estimates. In the 2001 SIPP panel, 44 percent of uninsured children gained coverage between one wave and the next. This compares with 23 percent for uninsured adults ages 19 to 39 and 19 percent for uninsured adults ages 40 to 59 (Czajka and Mabli, 2009). A review of coverage transitions measured in SIPP found instances of improbable transitions that appeared likely to be reporting errors: for example, children losing and regaining employer-sponsored coverage through a parent who reported continuous coverage over the same period. Edits to remove improbable transitions such as these reduced the estimated number of one-wave uninsured spells among children by 52 percent (Czajka and Sykes, 2006). The reductions among adults were smaller: 31 percent for adults ages 19 to 39 at the start of the wave and 22 percent among adults ages 40 to 64.

Clearly, the frequency of brief uninsured spells and reported transitions into and out of the uninsured status in SIPP should be a matter of concern among users. However, data sources that can provide accurate reports of monthly status and be linked to the SIPP are few in number, and only the Census Bureau is legally able to produce the linkages that will support such research.

ATTRITION

In addition to nonresponse at the initial interview, which has tended to be quite low in comparison to other household surveys, SIPP as a longitudinal survey is subject to attrition of sample members over time.
The bias that attrition may introduce into survey estimates makes the level of attrition a serious concern.

[Footnote] Such data sources include Medicaid administrative files from the Medicaid Statistical Information System, which have been linked to the CPS and NHIS but not SIPP, and Internal Revenue Service Forms 5500 filed by employers and processed by the U.S. Department of Labor's Employee Benefits Security Administration (regarding the latter, see Decressin, Hill, and Lane, 2006).

Sample Loss

Table A-6 documents both incremental and cumulative sample loss due to nonresponse by wave in each of the four SIPP panels that started between 1992 and 2001. The estimates of sample loss apply to eligible households.

TABLE A-6 Incremental and Cumulative Household Sample Loss Rates by Wave: 1992, 1993, 1996, and 2001 SIPP Panels, Unweighted (percentage)

    Incremental Sample Loss Rate   Cumulative Sample Loss Rate
Wave   1992   1993   1996   2001      1992   1993   1996   2001
1       9.3    8.9    8.4   13.3       9.3    8.9    8.4   13.3
2       5.3    5.3    6.1    8.6      14.6   14.2   14.5   21.9
3       1.8    2.0    3.3    2.8      16.4   16.2   17.8   24.7
4       1.6    2.0    3.1    1.2      18.0   18.2   20.9   25.9
5       2.3    2.0    3.7    1.6      20.3   20.2   24.6   27.5
6       1.3    2.0    2.8    0.7      21.6   22.2   27.4   28.2
7       1.4    2.1    2.5    0.7      23.0   24.3   29.9   28.9
8       1.7    1.2    1.4    1.4      24.7   25.5   31.3   30.3
9       1.5    1.4    1.5    1.6      26.2   26.9   32.8   31.9
10      0.4           1.2             26.6          34.0
11                    1.1                           35.1
12                    0.4                           35.5

NOTE: The household sample loss rate expresses the number of noninterviews among eligible households in a given wave as a percentage of the total eligible households in that wave. Eligible households include those that the Census Bureau continues to attempt to interview as well as those that have been dropped from further interview attempts in keeping with SIPP field procedures but remain within the SIPP universe. Households dropped from further interview attempts include nonrespondents to the Wave 1 interview as well as households that were interviewed in Wave 1 but missed two or three consecutive interviews (depending on the reason) or moved too far from a SIPP primary sampling unit. All noninterviewed households (except those known to have left the survey universe) are multiplied by a growth factor to reflect a crude estimate of households splitting to form multiple households less those leaving the SIPP universe. Beginning with Wave 4 of the 2001 panel, households are no longer dropped from further interview attempts because they missed consecutive interviews.
SOURCE: Eargle (2004).
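The cumulative rates in Table A-6 are running sums of the incremental rates within each panel. The sketch below, offered only as an illustration, rebuilds the cumulative series from the incremental values transcribed from the table; the dictionary and function names are assumptions introduced here, not part of any Census Bureau product.

```python
from itertools import accumulate

# Incremental household sample loss rates (percent) by wave, from Table A-6.
# Panels differ in length: 1993 and 2001 ran 9 waves, 1992 ran 10, 1996 ran 12.
INCREMENTAL = {
    "1992": [9.3, 5.3, 1.8, 1.6, 2.3, 1.3, 1.4, 1.7, 1.5, 0.4],
    "1993": [8.9, 5.3, 2.0, 2.0, 2.0, 2.0, 2.1, 1.2, 1.4],
    "1996": [8.4, 6.1, 3.3, 3.1, 3.7, 2.8, 2.5, 1.4, 1.5, 1.2, 1.1, 0.4],
    "2001": [13.3, 8.6, 2.8, 1.2, 1.6, 0.7, 0.7, 1.4, 1.6],
}


def cumulative_loss(incremental):
    """Running sum of incremental loss rates, rounded to one decimal place."""
    return [round(total, 1) for total in accumulate(incremental)]


CUMULATIVE = {panel: cumulative_loss(rates) for panel, rates in INCREMENTAL.items()}

for panel, rates in CUMULATIVE.items():
    print(panel, rates)
```

Comparing `CUMULATIVE["2001"]` with `CUMULATIVE["1996"]` wave by wave shows the crossover between the two panels at Wave 7 that is visible in the table.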
If at least one member of an eligible household responds during a given wave, the Census Bureau collects or imputes data for every other household member. An eligible household contributes to the estimate of sample loss in a given wave if no interview is conducted with any member of that household.

While initial nonresponse declined slightly over the 1992, 1993, and 1996 panels, it jumped nearly 5 percentage points between the 1996 and 2001 panels, rising from 8.4 to 13.3 percent.[10] This increase in household nonresponse did not begin with the 2001 panel, however. The incremental sample loss rate for every wave after the first rose between the 1993 and 1996 panels. At the end of Wave 9, the cumulative sample loss rate for the 1996 panel stood at 32.8 percent versus 26.9 percent in the 1993 panel. The 1996 panel ran three additional waves, but the cumulative sample loss grew by less than 3 percentage points (to 35.5 percent) over those three waves.

[10] An incentive experiment that paid $10 or $20 to about half the Wave 1 sample households in the 1996 panel contributed to the low sample loss in Wave 1 and subsequent waves (see James, 1997).

For comparison purposes, Table A-7 reports nonresponse rates to the CPS ASEC supplement and the labor force survey conducted in the same month.[11] Some households that complete the monthly labor force survey do not respond to the supplement. Historically, nonresponse to the monthly labor force survey has been very low. Noninterview rates deviated little from 4 to 5 percent of eligible households between 1960 and 1994 but then began a gradual rise that coincided with the introduction of a redesigned survey instrument using computer-assisted interviewing (U.S. Census Bureau, 2002). By March 1997, the first data point in Table A-7, the noninterview rate had reached 7 percent, but it rose by just another percentage point over the next 7 years. Over this same period, nonresponse to the ASEC supplement among respondents to the labor force survey ranged between 8 and 9 percent, with no distinct trend, yielding a combined sample loss that varied between 14 and 16 percent of the eligible households. In other words, the initial nonresponse to the 2001 SIPP panel is still 2 to 3 percentage points lower than the nonresponse to the ASEC supplement. But as a measure of how much the SIPP response rates have declined, it took two waves of cumulative sample loss in the 1996 panel to match the nonresponse to the ASEC supplement.

A SIPP practice dating back to the start of the survey bears some responsibility for the amount of sample loss after Wave 3 in panels prior to 2001.
Households that missed two or three consecutive interviews (depending on the circumstances) were dropped from further attempts. The principal purpose, initially, was to ensure that all missing waves would be bounded by complete waves, so that the missing waves could be imputed from the information collected in the surrounding waves. Missing wave imputations were performed for the first time in the early 1990s but were discontinued with the 1996 redesign. With rising attrition and the removal of the principal rationale for dropping respondents after two missing waves, the Census Bureau revised this practice during the 2001 panel. Respondents are no longer dropped after missing two consecutive interviews. The impact of the new policy is evident in the incremental sample loss rate between Waves 3 and 4, which dropped to 1.2 percent from a level of 3.1 percent in the 1996 panel. By Wave 7 the cumulative sample loss had fallen below that of the 1996 panel, which meant that the survey had retained enough additional sample members to offset both the 5 percentage point higher Wave 1 nonresponse rate and higher attrition between Waves 1 and 2. The 2001 panel maintained a lower cumulative sample loss through the remaining two waves. Interestingly, the incremental sample loss rates between Waves 8 and 9 were essentially identical across the four panels at about 1.5 percent.

[11] Until 2001 the CPS supplement that collects annual income was conducted solely in March of each year, but as part of a significant sample expansion, the Census Bureau began to administer the supplement to CPS sample households in February and April that were not interviewed in March.

TABLE A-7 Nonresponse to the CPS Labor Force Survey and ASEC Supplement, 1997 to 2004

Column (1): Percentage of Eligible Households Not Responding to the Labor Force Questionnaire
Column (2): Percentage of Labor Force Respondents Not Responding to the Supplement
Column (3): Percentage of All Eligible Households Not Responding to the Supplement

Sample Year     (1)     (2)     (3)
1997            7.2     9.2    15.7
1998            7.8     7.2    14.4
1999            7.9     8.9    16.1
2000            7.0     8.0    14.4
2001            8.0     8.5    15.9
2002            8.3     8.6    16.2
2003            7.7     8.0    15.0
2004            8.5     8.2    16.0

NOTE: March 1997 is the first supplement for which the CPS technical documentation reports rates of nonresponse. The nonresponse rate in column 3 is the sum of the nonresponse rate in column 1 and the product of the nonresponse rate in column 2 (divided by 100) and 100 minus the nonresponse rate in column 1.
SOURCE: Current Population Survey Technical Documentation, various years.

Attrition Bias

Numerous studies with SIPP and other panel surveys have documented that attriters differ from continuers in a number of ways (see, for example, Fitzgerald, Gottschalk, and Moffitt, 1998; Zabel, 1998). Most studies of attrition bias have been limited to comparing attriters and continuers with respect to characteristics measured at the beginning of the survey, before any attrition has occurred.
Such studies cannot say how much the attriters and continuers differ on characteristics subsequent to attrition, which is critical to knowing how longitudinal analyses may be affected by attrition. Another limitation of such studies that is rarely noted is that they
assume that the quality of the data provided by those who will later leave is comparable to that provided by those who remain in the panel through its conclusion. For many characteristics, this assumption is probably valid. But for sensitive characteristics or those that respondents might view as onerous to provide, the validity of the assumption is questionable. Yet another limitation of many attrition studies is that they fail to separate nonrespondents who left the survey universe from those who remained eligible to be interviewed. Persons who leave the survey universe (by dying, joining the military, becoming institutionalized (including incarceration), or moving outside the country) have distinctly different characteristics than those who remain in the universe.

Administrative records linked to survey data can overcome these limitations. Administrative records can provide data on postattrition and even presurvey characteristics, and the values of the characteristics are recorded with very little error, generally. Moreover, any measurement error in the characteristics obtained from administrative records will be independent of attrition status. Finally, most nonrespondents who left the survey universe are identified in SIPP and can be removed from the sample of attriters. Some who cannot be identified in the survey data may drop out of analyses automatically because their administrative records terminate at some point after they have left the survey universe.

Vaughan and Scheuren (2002) used Social Security Administration Summary Earnings Records matched to SIPP panel data to compare attriters and continuers with respect to earnings and program benefits over time.[12] Even after removing those who left the survey universe, they found that attriters and nonattriters differed markedly with respect to earnings and receipt of program benefits at the beginning of a panel, that is, before any attrition had occurred.
Over time, however, these differences attenuated. With enough passing years (longer than a typical SIPP panel, however), the characteristics of those who left and those who continued to respond to the survey converged. This trend suggests that compensating for the impact of attrition on cross-sectional estimates becomes both easier and less important over time. But the fact that the differences are large to begin with and then diminish over time also implies that attriters experience greater change than nonattriters. Vaughan and Scheuren (2002) concluded that compensating for the attrition bias in estimates of gross change is both important and much more difficult than compensating for differences in net change.

[12] Vaughan and Scheuren (2002) examined attrition in the Survey of Program Dynamics, which was selected from the 1992 and 1993 SIPP panels, and continued to interview respondents through 2002.

To evaluate the effectiveness of the Census Bureau's nonresponse adjustments, Czajka, Mabli, and Cody (2008) used administrative data from the same sources as Vaughan and Scheuren but compared the full sample (using a Wave 1 cross-sectional weight) with the subsample of continuers weighted by the full panel weight, which incorporates adjustments for differential nonresponse.[13] They found little evidence of bias in estimates of earnings, Social Security beneficiary status and benefit amounts, or SSI beneficiary status and benefit amounts at different points in time. Nor did they find significant bias in selected estimates of change in these characteristics. The implication is that attrition bias in these characteristics is being addressed in the longitudinal weights. It is not possible to evaluate the Census Bureau's adjustments to the cross-sectional weights in the same manner as the longitudinal weights, as there is no attrition-free cross-sectional sample after the first wave.

Furthermore, other, lesser known biases due to attrition are not addressed by the weights. For example, Czajka and Sykes (2006) documented attrition bias among new mothers, which contributes to a severe underestimate of the number of infants if the weights of mothers are assigned to their newborn children.[14] Attrition by new mothers has been documented in the National Longitudinal Survey of Youth 1997 as well, although, in that survey, becoming a parent was found to be very highly related to returning to the survey after missing an interview (Aughinbaugh and Gardecki, 2007).

REPRESENTATION OF THE POPULATION OVER TIME

Although SIPP is fundamentally a panel survey, cross-sectional applications (including analysis of repeated cross-sections) abound and may in fact be more common than true longitudinal uses of the data. For this reason, it is important that users understand the limits to the survey's representation of the population over time.

While the U.S.
population is currently growing at a rate of less than 1 percent a year, this net growth is the difference between substantially larger inflows and outflows. SIPP panel members who leave the sample by dying, entering institutions, moving abroad, or moving into military barracks represent the outflows from the population. A priori, there is no reason to think that SIPP underrepresents, overrepresents, or otherwise misrepresents the gross outflows from the population, although one could certainly speculate that respondents who know that they are moving abroad or entering institutions may leave before being identified as leaving the survey universe.

[13] Persons who leave the SIPP universe are assigned panel weights if they missed no prior interviews. Such persons will have contributed to both the full sample and the panel estimates.

[14] SIPP longitudinal weights are not assigned to persons entering the sample after the calendar month to which the weights are calibrated. It is common among users to assign infants the weights of their mothers.
To maintain full cross-sectional representativeness over time, however, a panel survey must also obtain (periodically if not continuously) a representative sample of new entrants to the population. New entrants include births, immigrants, and persons returning from abroad. Because SIPP excludes residents of specific types of group quarters (prisons, nursing homes, and military barracks, primarily), new entrants also include persons moving from such quarters into households. SIPP captures births to panel members and, through this mechanism, represents most births to the population over the length of a panel, but its capture of other new entrants is limited to persons moving into households with original sample members. That is, SIPP represents those additional new entrants who join households containing persons who were in the SIPP universe at the start of a panel. SIPP does not represent people who enter or reenter the U.S. civilian noninstitutionalized population if they form new households or join households populated by people who have also joined the population since the start of the panel.

What fraction of new entrants other than births is represented in SIPP is unknown and not readily discernible. New entrants are not identified explicitly in the SIPP public-use data files, and, even if they were, none of the SIPP weights is designed to properly reflect their contribution to the population. An estimate of the total new entrant population, exclusive of births, near the end of the 1996 SIPP panel placed it at about 10 million, or more than 3 percent of the total population (Czajka and Sykes, 2006). This estimate represents how many persons, other than those born to panel members, were in the civilian noninstitutionalized population at the end of the 1996 panel but had not been in the population at the start of the panel.
To facilitate cross-sectional uses of SIPP data, the Census Bureau provides monthly cross-sectional weights. These weights include an adjustment for differential attrition and a separate "mover adjustment," which offsets the weights assigned to persons who join SIPP households. In addition, the cross-sectional weights are poststratified to monthly estimates of the civilian noninstitutionalized population by age, gender, race, and Hispanic origin. This poststratification to demographic controls is a limited attempt to make the SIPP sample consistent with changes in the size and composition of the civilian noninstitutionalized population over time. Poststratification ensures that the monthly SIPP cross-sectional weights will sum to the Census Bureau's estimates of monthly population totals by age, gender, race, and Hispanic origin. It does not ensure that the broader characteristics of the weighted sample will remain consistent with the population over time if the net effect of the gross inflows and outflows is to change the characteristics of the population.

The implications of these population flows for the representativeness of the SIPP cross-sectional sample over time are unknown, and the issue
has attracted very little interest. But analysis of the characteristics of SIPP sample members who move out of the population over time indicates that these people differ dramatically from nonmovers with similar demographic characteristics (particularly those of Hispanic origin). This implies a potential for persons moving into the population to differ dramatically as well (Czajka, 2007). Within-panel trends that have been attributed to attrition could very well be owing to the panel's increasingly less complete representation of the national population over time as the new entrants omitted from the SIPP grow from zero to as much as 3 percent of the total population. If so, then a new strategy for weighting SIPP that takes account of the new entrants who are not represented by the survey could improve the quality of inferences supported by the data.[15]

SEAM BIAS

Seam bias describes a tendency for transitions to be reported at the seam between survey waves (that is, between month 4 of one wave and month 1 of the next wave) rather than within waves. Evidence of seam bias was first identified in analyses of the Income Survey Development Program research panels that preceded the SIPP (Callegaro, 2008). Several causes have been suggested, and more than one appears to be at work. The extent of seam bias varies markedly across items, which may reflect different mixes of causes.

SIPP users have adapted their analytical strategies. It is common for those examining behavior over time to take only one data point per wave: either the one calendar month that is common to all four rotation groups or the fourth reference month, which is widely viewed as the most reliable because of its proximity to the interview month. The inference is that there is not enough independent information in the other three months to make them analytically useful or that analysts do not know how to use the limited additional information that they provide.
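The definition of a seam given above can be turned into a simple diagnostic: the share of reported month-to-month transitions that fall between month 4 of one wave and month 1 of the next. The sketch below is illustrative only. It assumes each person's record is a list of monthly statuses ordered by reference month with four months per wave, and the function names and the small set of status histories are made up for the example. In a long panel roughly one in four adjacent month pairs straddles a seam, so a share well above one quarter is the pattern that seam bias describes.

```python
WAVE_LENGTH = 4  # reference months per wave


def transition_months(status):
    """1-based months m at which the status changes between month m and m + 1."""
    return [m for m in range(1, len(status)) if status[m - 1] != status[m]]


def is_seam(month, wave_length=WAVE_LENGTH):
    """True if the change between `month` and `month + 1` straddles two waves."""
    return month % wave_length == 0


def seam_share(histories, wave_length=WAVE_LENGTH):
    """Share of all observed transitions located at a seam (None if there are none)."""
    total = 0
    at_seam = 0
    for status in histories:
        for month in transition_months(status):
            total += 1
            at_seam += is_seam(month, wave_length)
    return at_seam / total if total else None


# Hypothetical 12-month program participation histories (1 = participating).
histories = [
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],  # exit reported at the wave 1/2 seam
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],  # entry reported at the wave 2/3 seam
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # exit reported within wave 1
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # no transition
]

print(seam_share(histories))
```

Taking only one data point per wave, as described above, sidesteps this diagnostic entirely, because every transition then appears to occur between waves.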
The Census Bureau has tried two alternative approaches to dealing with seam bias: (1) collecting selected data for the interview month as a fifth reference month, which will overlap the first reference month of the next wave, and (2) dependent interviewing. It remains unclear what the Census Bureau has learned from collecting the additional month of data. These data are not released on the public-use file, and it is not apparent that the Census Bureau has made use of this information in editing responses, which might have moved the seam by one month but not reduced it. However, the Census Bureau appears to have had some success with dependent interviewing, in which respondents who reported participation in a program at the end of the previous wave are informed of their prior wave response and asked if they were still participating 4 months earlier. Specifically, dependent interviewing has helped to lower the frequency of transitions at the seam by reducing the number of reported transitions rather than shifting their location (Moore et al., 2009). However, dependent interviewing has given rise to other problems during its application to the 2004 panel, and the Census Bureau has suspended its use in SIPP.

15 If the survey with its current cross-sectional weights underestimates poverty in the full population, for example, because it underrepresents 10 million people with a very high poverty rate, then one strategy would be to exclude the 10 million from the weighted population total so that the poverty rate estimated from the survey provides a better reflection of the population to which the weights sum. An alternative strategy, if the characteristics of the 10 million can be known sufficiently well to be replicated within the existing survey sample, is to revise the cross-sectional weighting of the sample to better reflect the characteristics of the total population.

IMPUTATION

Item nonresponse is higher on income questions than on most other types of questions.16 Since the start of SIPP, item nonresponse to income questions in surveys has increased dramatically. This is reflected in the proportion of total income that is imputed.

Growth of Imputation Over Time

In 1984, just 11.4 percent of total money income in SIPP was imputed (Vaughan, 1993). Even then, however, imputation rates varied widely across income sources. Income imputation was lowest for public assistance (7.5 percent) and highest for property income (23.9 percent). The single highest imputation rate occurred for dividends (46.8 percent), a component of property income. The imputation rate for wage and salary income was among the lowest at 8.8 percent. Imputation rates in the CPS were higher, in large part because the Census Bureau imputes the entire ASEC supplement for respondents who complete only the brief monthly labor force survey that precedes the supplement. In March 1985, 20.1 percent of total CPS ASEC income for 1984 was imputed, including 17.9 percent of wage and salary income.
Between 1984 and 1993, imputation rates for SIPP income increased substantially, growing to 20.8 percent for total income and 17.7 percent for wages and salaries, or double the rate in 1984 (see Table A-8). The imputation rate for property income, 42.4 percent, approached the very high level recorded by dividends in 1984. The low imputation rate for public assistance as a whole grew to more than 13 percent for SSI and welfare.

16 Item nonresponse on asset questions is even higher.
TABLE A-8 Proportion of Income Imputed (percent), by Source: SIPP and CPS, Selected Years

                                              Survey Reference Year
Income Source                                 1993     1997     2002

SIPP
  Total income                                20.8     24.0     28.6
  Wages and salaries                          17.7     20.5     24.9
  Self-employment                             29.3     32.7     36.4
  Property income                             42.4     42.9     49.7
  Social Security and Railroad Retirement     22.6     22.7     28.8
  Supplemental Security Income                13.2     16.4     22.6
  Welfare income                              13.8     31.2     32.8
  Other transfers                             20.8     33.0     33.6
  Pensions                                    23.7     37.3     47.3

CPS
  Total income                                23.8     27.8     34.2
  Wages and salaries                          21.5     24.8     32.0
  Self-employment                             34.6     39.5     44.7
  Property income                             42.4     52.8     62.6
  Social Security and Railroad Retirement     24.1     27.9     35.5
  Supplemental Security Income                22.9     19.7     28.0
  Welfare income                              19.8     18.1     29.2
  Other transfers                             23.3     23.9     31.4
  Pensions                                    24.2     27.0     35.4

SOURCE: The 1992, 1996, and 2001 SIPP panels and the 1994, 1998, and 2003 CPS ASEC supplements.

Between 1993 and 2002, the proportion of total income that was imputed increased by 8 percentage points. The increase in imputation rates by income source was very uneven. The income imputation rates for welfare, other transfers, and pensions surged between 1993 and 1997. For welfare, the imputation rate more than doubled, rising from 14 to 31 percent. For other transfers and pensions, the imputation rates increased by more than half, reaching 33 percent for other transfers and 37 percent for pensions. Yet there was no increase in the already high imputation rate for property income, and the imputation rates for wages and salaries and self-employment income increased by only 3 percentage points. Between 1997 and 2002, the imputation rate for pension income grew another 10 percentage points, taking it very near the imputation rate for property income, which grew by 7 percentage points to nearly 50 percent. Imputation rates for both wages and salaries and self-employment income grew by an additional 4 percentage points.
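The changes quoted in the text are rounded to whole percentage points. For readers who want the unrounded figures, the arithmetic can be reproduced with a short Python sketch. The rates are copied from Table A-8; the dictionary and function names are illustrative only and do not come from the source.

```python
# Percent of income imputed in reference years 1993, 1997, and 2002,
# copied from Table A-8.
SIPP_RATES = {
    "Total income": (20.8, 24.0, 28.6),
    "Wages and salaries": (17.7, 20.5, 24.9),
    "Self-employment": (29.3, 32.7, 36.4),
    "Property income": (42.4, 42.9, 49.7),
    "Social Security and Railroad Retirement": (22.6, 22.7, 28.8),
    "Supplemental Security Income": (13.2, 16.4, 22.6),
    "Welfare income": (13.8, 31.2, 32.8),
    "Other transfers": (20.8, 33.0, 33.6),
    "Pensions": (23.7, 37.3, 47.3),
}

CPS_RATES = {
    "Total income": (23.8, 27.8, 34.2),
    "Wages and salaries": (21.5, 24.8, 32.0),
    "Self-employment": (34.6, 39.5, 44.7),
    "Property income": (42.4, 52.8, 62.6),
    "Social Security and Railroad Retirement": (24.1, 27.9, 35.5),
    "Supplemental Security Income": (22.9, 19.7, 28.0),
    "Welfare income": (19.8, 18.1, 29.2),
    "Other transfers": (23.3, 23.9, 31.4),
    "Pensions": (24.2, 27.0, 35.4),
}


def point_changes(rates):
    """Return percentage point changes per source.

    Each value is a tuple: (1993 to 1997, 1997 to 2002, 1993 to 2002).
    """
    changes = {}
    for source, (r1993, r1997, r2002) in rates.items():
        changes[source] = (
            round(r1997 - r1993, 1),
            round(r2002 - r1997, 1),
            round(r2002 - r1993, 1),
        )
    return changes


sipp_changes = point_changes(SIPP_RATES)
cps_changes = point_changes(CPS_RATES)

for label, changes in (("SIPP", sipp_changes), ("CPS", cps_changes)):
    print(label)
    for source, (early, late, overall) in changes.items():
        print(f"  {source:<42}{early:>7.1f}{late:>7.1f}{overall:>7.1f}")
```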
Income imputation rates in the CPS grew more modestly than those in SIPP between 1984 and 1993 but then increased by 11 percentage points between 1993 and 2002. Imputation rates for all but two sources increased by about the same amount. The exceptions were property income, for which the imputation rate increased by 22 percentage points to 62.6 percent, and SSI, for which the increase was only 5 percentage points.

Quality of Imputation

The growing share of income that is imputed in these surveys makes it increasingly important that the imputations be done well. Both SIPP and the CPS have relied heavily on flexible hot-deck imputation procedures to impute missing items. Hot-deck imputation procedures replace missing values with values selected from other records, called donors, that are matched on a prespecified set of characteristics that form a large table. Flexible hot-deck procedures can combine the cells of a table, as necessary, to find donors when many of the cells are empty. Nevertheless, when item nonresponse is high, as it is for income and assets, the amount of collapsing that may be required to achieve matches reduces the quality of the imputations.

While the hot-deck algorithms that the Census Bureau employs can incorporate a large number of potentially relevant variables, the variables used to match donors to the records being imputed are not tailored, generally, to the items being imputed. For example, Doyle and Dalrymple (1987) demonstrated that by not taking into account reported Food Stamp Program (FSP) benefits when imputing major components of income or by not taking account of income eligibility limits when imputing FSP benefits, the Census Bureau was imputing FSP benefits to households with incomes well beyond the eligibility limits or imputing high incomes to households that reported the receipt of FSP benefits.
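The mechanics of a hot deck with cell collapsing, and the way collapsing can produce implausible imputations, can be illustrated with a small Python sketch. This is a simplified random-within-cell hot deck under stated assumptions, not the Census Bureau's algorithm: the function `hot_deck`, the field names, and the toy records are all hypothetical, and collapsing is done by dropping match variables from the end of the list until a donor is found.

```python
import random
from collections import defaultdict


def hot_deck(records, item, match_vars, rng):
    """Fill missing values of `item` from donors matched on `match_vars`.

    When a record's cell has no donors, the cell is collapsed by dropping
    the last match variable, then the next to last, and so on.
    """
    # Build one donor pool per level of collapsing: all match variables
    # first, then progressively fewer, ending with a pool of all donors.
    levels = []
    for k in range(len(match_vars), -1, -1):
        keys = tuple(match_vars[:k])
        pool = defaultdict(list)
        for rec in records:
            if rec[item] is not None:
                pool[tuple(rec[v] for v in keys)].append(rec[item])
        levels.append((keys, pool))

    completed = []
    for rec in records:
        new = dict(rec, imputed=False, match_level=None)
        if rec[item] is None:
            for keys, pool in levels:
                donors = pool.get(tuple(rec[v] for v in keys))
                if donors:
                    new[item] = rng.choice(donors)
                    new["imputed"] = True
                    # Number of variables actually matched on.
                    new["match_level"] = len(keys)
                    break
        completed.append(new)
    return completed


# Hypothetical records: age group, reported FSP receipt, annual income.
records = [
    {"age": "25-44", "fsp": True, "income": 9000},
    {"age": "25-44", "fsp": False, "income": 60000},
    {"age": "45-64", "fsp": False, "income": 75000},
    {"age": "25-44", "fsp": True, "income": None},  # donor exists in full cell
    {"age": "45-64", "fsp": True, "income": None},  # full cell is empty
]

filled = hot_deck(records, "income", ["age", "fsp"], random.Random(0))
for rec in filled:
    print(rec)
```

In this toy example the last record has no donor in its full cell, so the match on FSP receipt is dropped and a high income is imputed to a reported FSP recipient. This is the kind of inconsistency the paragraph above describes. Because variables are dropped from the end of the list, listing FSP receipt first would preserve it longest during collapsing.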
In response, the Census Bureau made improvements to address this particular problem as well as other related problems. With the 1996 redesign and the need to rewrite numerous programs to run on the expanded, reformatted file, some of these enhancements appear to have been lost. In January 2003, for example, SIPP estimated that more than 400,000 adult FSP participants were in families with incomes four times the poverty level. FSP receipt was imputed to 62 percent of these persons compared with less than 7 percent of the estimated 6.3 million FSP participants with family incomes below poverty (Beebout and Czajka, 2005). This suggests that the Census Bureau is not taking sufficient account of income when imputing FSP receipt. Similarly, $1.1 billion in welfare income was imputed in SIPP to families in the top income quintile in 2002 (Czajka, Mabli, and Cody, 2008). More than a third of all imputed welfare dollars went to families in the top income quintile in that year. This compares with only $10 million in welfare income imputed to the top income quintile in the CPS in the same year, or less than 1 percent of total imputed welfare dollars. In the years immediately preceding the 1996 redesign, the amounts of welfare income imputed to families in the top quintile were similar between SIPP and the CPS.

WAVE 1 BIAS

Since the redesign, each new SIPP panel (1996, 2001, and 2004) has started with a monthly poverty rate that was at least 2 percentage points higher than the poverty rate in the final wave of the preceding panel (Czajka, Mabli, and Cody, 2008). Undoubtedly, a number of factors contribute to this result, but one that has emerged with the most recent panels involves a possible understatement of income in Wave 1. Both the 1996 and 2001 panels showed a percentage point decline in the poverty rate between the first and second waves. In the 1996 panel, poverty continued to decline in the presence of an expanding economy, but in the 2001 panel there was no further decline in the poverty rate after the second wave. In the 2004 panel the Wave 1 to Wave 2 reduction was nearly 2 percentage points. Seasonal swings in income provide an obvious explanation, but the 1996 panel started 2 months later in the year than the 2001 and 2004 panels.

Panel surveys may be subject to a "time-in-sample" bias. Through repeated interviews, respondents may become better respondents as they learn what is expected of them. They may also become bored or learn how to avoid lengthy segments of the interview. Prior to the 1996 redesign, the Census Bureau compared data from overlapping waves in successive panels in a search for evidence of a time-in-sample bias in the reporting of income and benefit receipt in the SIPP. The research yielded no evidence of time-in-sample bias in SIPP (Lepkowski et al., 1992). With the elimination of overlapping panels, it is not possible to replicate this research on more recent SIPP data.
While there may be no evidence of a time-in-sample bias in earlier SIPP panels, there is a strong suggestion of some type of change in the reporting or perhaps processing of income data between the first two waves of more recent panels. Czajka, Mabli, and Cody (2008) compared poverty status between the first two waves of the 2004 panel in an effort to determine what role attrition and other sample loss might have played in the 1.8 percentage point decline in poverty. They found that changes in recorded poverty among persons present in both waves accounted for 87 percent of the net reduction in the number of poor between the two waves. Between Waves 2 and 3, a much smaller reduction in the number of poor (0.3 percentage points) could be attributed in large part to fewer gross exits from poverty, that is, fewer sample families reporting increased incomes.
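The logic of attributing a net change in the number of poor to persons present in both waves, as opposed to sample loss or gain, can be sketched as follows. This is an illustrative Python example under simplifying assumptions, not the method of Czajka, Mabli, and Cody (2008): the function name and the person records are hypothetical, each person carries a single weight across both waves, and a status of `None` means the person was not in the sample in that wave.

```python
def decompose_poverty_change(people):
    """Split the net change in the weighted number of poor between waves.

    `people` is an iterable of (weight, poor_wave1, poor_wave2) tuples.
    A status of None means the person was absent from that wave.
    Returns (net_change, change_among_continuers, continuer_share).
    """
    net_change = 0.0
    continuer_change = 0.0
    for weight, poor_w1, poor_w2 in people:
        # Weighted count of poor contributed by this person in each wave;
        # an absent person contributes nothing.
        count_w1 = weight if poor_w1 else 0.0
        count_w2 = weight if poor_w2 else 0.0
        change = count_w2 - count_w1
        net_change += change
        if poor_w1 is not None and poor_w2 is not None:
            continuer_change += change
    if net_change == 0:
        raise ValueError("no net change to decompose")
    return net_change, continuer_change, continuer_change / net_change


# Hypothetical weighted person records (weight, poor in Wave 1, poor in Wave 2).
people = [
    (1000, True, False),   # present in both waves, exits poverty
    (1000, True, False),   # present in both waves, exits poverty
    (1000, False, True),   # present in both waves, enters poverty
    (1000, False, False),  # present in both waves, never poor
    (1000, True, None),    # poor in Wave 1, lost to the sample in Wave 2
    (1000, True, None),    # poor in Wave 1, lost to the sample in Wave 2
    (1000, None, True),    # joins the sample in Wave 2 and is poor
]

net, among_continuers, share = decompose_poverty_change(people)
print(f"net change in number of poor: {net:.0f}")
print(f"change among persons in both waves: {among_continuers:.0f}")
print(f"share attributable to persons in both waves: {share:.0%}")
```

In these made-up data, half of the net reduction comes from persons observed in both waves. The 87 percent figure reported in the text is the analogous share for the first two waves of the 2004 panel.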