Because of the unique aspects of the data it collects, the Survey of Income and Program Participation (SIPP) has occupied several research niches over its history. The features most reflected in its uses include
- detailed monthly data,
- representation of short-term dynamics, and
- topical modules (TMs).
In addition, SIPP was intended to serve as a frame for the linkage of program administrative records. To serve this end, SIPP initially collected Social Security numbers (SSNs) from respondents. The collection of SSNs was eliminated in the early 2000s in favor of a more effective approach that the U.S. Census Bureau implemented across all of its household surveys. The enhanced linkage capabilities made possible by these Census Bureau innovations made the collection of SSNs unnecessary and improved the availability of a growing number of administrative files through the Federal Research Data Centers. Over SIPP’s history, a significant number of SIPP researchers have made use of the survey’s linkage capabilities, and the number of studies using SIPP data linked to administrative records continues to grow.
For each month of its reference period, SIPP captures hundreds of variables for each adult member (age 15 and older) of its sampled households. This detailed monthly information enables SIPP to support both cross-sectional applications (in particular, modeling of program eligibility, which is typically determined on the basis of characteristics measured in a given month) and longitudinal applications (specifically, month-to-month dynamics) over periods of 1 to 5 years. SIPP’s TMs, some of which occurred annually while others appeared only once per panel, have been used in two principal ways: (1) to support detailed analyses of specific topics, such as disability or child care; and (2) to provide key variables, such as assets and child care expenses, for monthly simulations of program eligibility.
In this chapter, the report reviews the major uses of SIPP data as reflected in the recent research literature and in applications of SIPP data by federal agencies.
To assess SIPP’s recent contributions to the research literature, the panel commissioned a literature search by the National Academies of Sciences, Engineering, and Medicine’s Research Center. The search was restricted to works with publication dates of 2000 through the time of the search (mid-2014). Six databases were searched: Scopus and Web of Science (both containing science citations), the National Bureau of Economic Research, ProQuest (covering the social sciences), IDEAS/RePEc (the largest bibliographic database dedicated to economics), and WorldCat (a global library catalog). The search identified 638 unique works, consisting of journal articles, books and book sections, working papers, conference proceedings papers, reports, and doctoral dissertations. Because the panel’s purpose is to show usage of the data outside of the Census Bureau, the list does not include papers in the Census Bureau’s SIPP Working Papers series, which numbered 263 dating back to 1984, nor does it include SIPP-based reports published by the Census Bureau.
Reflecting the breadth of research that SIPP has supported, Table 3-1 provides counts of citations for 26 general topic areas. Most citations covered more than one topic, but the count reflects only one topic per citation from the 638 references. Only two of the topics (income dynamics and transitions) explicitly invoke SIPP’s longitudinal design, but many of the topics include examples of longitudinal analysis that take advantage of SIPP’s unique monthly structure. These include Cutler and Gelber (2009) and Hill and Shaefer (2011) (listed under State Children’s Health Insurance Program [SCHIP or CHIP]) on health insurance; Baldwin and Schumacher (2002) and Powers (2003) on disability; Berger and colleagues (2004) and Ribas and colleagues (2012) on employment; Glick and van Hook (2011) and London (2000) on household structure; Lovell and Oh (2005) on material hardship; Blumberg and colleagues (2000) and Hwang and colleagues (2012) on Medicaid; Hisnanick (2007) and McKernan and Ratcliffe (2005) on poverty; Acs and colleagues (2005) and Cody and colleagues (2007) on program participation; Noonan and Heflin (2005) on earnings; Haverstick and colleagues (2010) on retirement; Buchmueller and colleagues (2014) and Coburn and colleagues (2002) on SCHIP or CHIP; Slud and Bailey (2010) and Rips and colleagues (2003) on survey methodology; Mabli and Ohls (2012) and Newman (2006) on the Supplemental Nutrition Assistance Program (SNAP, also known as food stamps); Duggan and Kearney (2007) on Supplemental Security Income (SSI); Cawley and colleagues (2006) and Fitzgerald and Ribar (2004) on welfare reform; and Gundersen (2005) on the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC). Still, the vast majority of works identified in the literature search use cross-sectional (including repeated cross-sections) rather than longitudinal features of SIPP.
|Topic||Number of Citations|
|Health, Health Care, and Insurance||48|
NOTES: AFDC = Aid to Families with Dependent Children, CHIP = Children’s Health Insurance Program, SCHIP = State Children’s Health Insurance Program, SNAP = Supplemental Nutrition Assistance Program, SSI = Supplemental Security Income, TANF = Temporary Assistance for Needy Families, WIC = Special Supplemental Nutrition Program for Women, Infants, and Children.
The study panel did not attempt to compare the contributions of SIPP to academic research with contributions from other important longitudinal surveys such as the National Longitudinal Surveys, the Panel Study of Income Dynamics, the Health and Retirement Study, and the Wisconsin Longitudinal Study. The report therefore has no findings in this area. The study panel believes, however, that the literature review captured only a narrow slice of the total academic work based on SIPP data. A simple Google Scholar search conducted on August 8, 2017, for the exact survey name (“Survey of Income and Program Participation”) anywhere in the article, restricted to 2000 through 2014, yielded 9,560 hits.
To obtain a more complete and detailed picture of the uses of SIPP by the federal government, the study panel interviewed staff in four federal agencies. The panel supplemented this information with members’ own direct knowledge of SIPP applications. This section highlights the diversity of issues that are addressed by SIPP data and that are critical to administering federal programs. It also illustrates how SIPP data are used in complex integrations with multiple survey and administrative data sources to address these issues.
U.S. Social Security Administration
The Social Security Administration (SSA) is an extensive user of SIPP, so the report discusses its uses in considerable detail. This detail illustrates how SIPP data are used in complex combinations with other data sources, combinations that would be difficult to replicate if critical SIPP data were no longer available. The Office of Research, Evaluation, and Statistics in the SSA uses SIPP in three main ways: (1) to describe the agency’s beneficiary population; (2) to conduct policy analysis based on microsimulation of the Old-Age, Survivors, and Disability Insurance (OASDI) and SSI programs; and (3) to conduct other research.
Describing the Beneficiary Population
To supplement the limited data on beneficiary characteristics captured in SSA administrative records, SSA researchers produce tabulations of SSA program beneficiaries from the Current Population Survey’s Annual Social and Economic Supplement (CPS ASEC) and SIPP. In the 1990s, SIPP and CPS ASEC tabulations were published in the Annual Statistical Supplement to the Social Security Bulletin. Since then, each set of periodic tabulations from SIPP has been published as a Research and Statistics Note. The most recent tabulations use SIPP data from the 2008 panel to describe the characteristics of noninstitutionalized participants in the Disability Insurance (DI) and SSI programs in 2013 (Bailey and Hemmeter, 2015). The SIPP data are from wave 15, the final 2008 panel wave in which all four rotation groups participated. The next round of SIPP tabulations for DI and SSI beneficiaries will have to be based on the 2014 panel.
Beneficiary status was determined in month 4 of the 4-month reference period for each rotation group (that is, in April through July of 2013). To identify DI and SSI participants most accurately, beneficiary status was obtained from matched SSA administrative records. Where matches to administrative records were not possible (because sample members could not be linked to SSNs through the Census Bureau’s probabilistic record linkage system), program beneficiaries were identified from survey reports (either self-reports or proxy reports).
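The fallback rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SSA’s actual code: the record layout and every field name (`admin_matched`, `admin_di`, `survey_di`, and so on) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonMonth:
    """One sample member in month 4 of the wave reference period (hypothetical layout)."""
    person_id: str
    admin_matched: bool        # True if the person was linked to SSA administrative records
    admin_di: Optional[bool]   # DI receipt per administrative record (None if unmatched)
    admin_ssi: Optional[bool]  # SSI receipt per administrative record (None if unmatched)
    survey_di: bool            # DI receipt per self-report or proxy report
    survey_ssi: bool           # SSI receipt per self-report or proxy report


def beneficiary_status(p: PersonMonth) -> dict:
    """Prefer the administrative record; fall back to the survey report if there is no match."""
    if p.admin_matched:
        return {"di": bool(p.admin_di), "ssi": bool(p.admin_ssi), "source": "admin"}
    return {"di": p.survey_di, "ssi": p.survey_ssi, "source": "survey"}


# Illustrative records only.
matched = PersonMonth("A", True, True, False, False, False)
unmatched = PersonMonth("B", False, None, None, False, True)

print(beneficiary_status(matched))
print(beneficiary_status(unmatched))
```

Recording the source alongside the status, as in this sketch, would let a tabulation report how many beneficiaries were identified from each kind of record.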
Most characteristics were tabulated in the same month as the beneficiary designation. Income, however, was calculated over all 4 months of the wave 15 reference period. The full set of characteristics used in the tabulations included
- age, sex, race, and ethnicity;
- marital status, living arrangements, household type, household and family size, and the presence of children under 18;
- education, veteran status, and health insurance coverage;
- household receipt of assistance by type;
- number of Social Security recipients in the household;
- home ownership;
- personal, family, and household income;
- family income as a percent of poverty;
- the percentage of personal income due to Social Security (as reported in SIPP);
- the receipt of individual sources of personal income; and
- the distribution of family income by source.
Of these variables, only age, sex, and living arrangement (i.e., living in one’s own household, another’s household, or an institution) are replicated in the administrative records. The administrative records also include citizenship.
Microsimulation Modeling with SIPP Data
For its policy simulations, the SSA maintains two microsimulation models, which are used to assess the impact of hypothetical reforms to SSA programs.1 Both models use SIPP data matched to SSA administrative records to derive their forecasts, which include estimates of program eligibility and participation and measures of economic well-being under baseline and reform scenarios. Modeling Income in the Near Term (MINT) focuses on the retirement behavior of the baby boom and later cohorts. The Financial Eligibility Model (FEM) focuses on the low-income and disabled populations served by SSI, Medicaid, and Medicare. Both models provide critical insights into the impact of proposed reforms to address future shortfalls in funding for the SSA and also for Centers for Medicare & Medicaid Services programs.
MINT (Modeling Income in the Near Term). This is a dynamic microsimulation model that projects into the future a database consisting of SIPP data matched to Social Security administrative records (Butrica et al., 2003/2004). The latest version of the MINT model, MINT7, uses data from the 2004 and 2008 SIPP panels matched to administrative records through 2010 (Smith and Favreault, 2013).2 SIPP data include the information collected through wave 7 of each panel for persons born between 1926 and 1979. In addition to core data from each wave, MINT7 uses data from a number of TMs: employment history, marriage history, fertility history, migration history, disability history, health conditions and work limitations, retirement and pension plan coverage, assets and liabilities, annual income and retirement accounts, medical expenses and utilization of health care, and employer-provided health benefits. The SSA administrative files that are matched to SIPP records include the Detailed Earnings Record; the Summary Earnings Record (SER); the Master Beneficiary Record, which contains information on the receipt of Social Security benefits (e.g., OASDI); the Supplemental Security Record (SSR), which contains information on the receipt of SSI benefits; and the Numident, which includes the place of birth, date of death, and the year of entry and legal status at entry for the foreign-born who obtained SSNs. To project the database forward, MINT uses data from several federal surveys, including the Panel Study of Income Dynamics, the Health and Retirement Study, the Survey of Consumer Finances, and the Medical Expenditure Panel Survey. Projections are calibrated to the 2012 Social Security Trustees’ intermediate assumptions for price and wage growth and key parameters of Social Security benefits and are also informed by the intermediate assumptions for employment rates, life expectancy, net immigration, and DI receipts.
2 The Division of Policy Evaluation within the Office of Research, Evaluation, and Statistics in the SSA developed MINT with the assistance of contractors (the Urban Institute, the Brookings Institution, and RAND Corporation) and contributions from a number of additional individuals, as well as several advisory boards.
Using an earlier version of MINT, Butrica and colleagues (2003/2004) compared the projected characteristics at age 67 of four 10-year birth cohorts: current retirees (born between 1926 and 1935), near-term retirees (1936 through 1945), early baby boomers (1946 through 1955), and late baby boomers (1956 through 1965). Measures of economic well-being were calculated for the median 10 percent of individuals in each group of cohorts. These measures included per capita family income, poverty rates, income replacement rates (post-retirement income as a proportion of preretirement income), and the contributions of different sources of retirement income and nonretirement income to retirees’ overall replacement rates. This type of analysis could be repeated with simulations of various reforms to the Social Security program and with a wide variety of outcome measures, including effects on the Social Security Trust Fund (see, e.g., Springstead, 2011).
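The replacement-rate measure defined above, and its decomposition by income source, can be illustrated with a short Python sketch. The function names and all dollar amounts are hypothetical and chosen only for illustration; they are not taken from MINT or from Butrica and colleagues.

```python
def replacement_rate(post_retirement_income: float, pre_retirement_income: float) -> float:
    """Post-retirement income as a proportion of preretirement income."""
    if pre_retirement_income <= 0:
        raise ValueError("preretirement income must be positive")
    return post_retirement_income / pre_retirement_income


def source_contributions(income_by_source: dict, pre_retirement_income: float) -> dict:
    """Each source's contribution, in replacement-rate terms; these sum to the overall rate."""
    return {
        source: replacement_rate(amount, pre_retirement_income)
        for source, amount in income_by_source.items()
    }


# Hypothetical annual amounts for one retiree at age 67.
income_at_67 = {
    "social_security": 16_000.0,
    "pensions": 8_000.0,
    "asset_income": 4_000.0,
    "earnings": 2_000.0,
}
pre_retirement = 40_000.0

overall = replacement_rate(sum(income_at_67.values()), pre_retirement)
by_source = source_contributions(income_at_67, pre_retirement)

print(overall)
print(by_source)
```

Because every source is divided by the same preretirement income, the contributions add up to the overall rate, which is what allows sources to be compared across cohorts.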
FEM (Financial Eligibility Model). The SSI program provides income support to low-income individuals who are 65 and older or severely disabled. Microsimulation modeling of SSI eligibility is a useful strategy for estimating the impact of actual or hypothetical changes to the program rules. Potential impacts include changes in the number of eligible and participating individuals and their distribution of benefits. Building on earlier SSI simulation modeling at the SSA, FEM is a static microsimulation model that uses monthly SIPP data that have been matched to SSA administrative data from the SSR, the SER, and the Master Beneficiary Record (Davies et al., 2001/2002; Rupp et al., 2007). The initial version of the model used SIPP data from the 1990 panel. Updates added data from the 1993 and 1996 panels. The current version of FEM uses SIPP data solely from the 2008 panel to take advantage of recent enhancements to the administrative records for the SSI program. Monthly income and demographic data from the core interview are used to identify the SSI family unit, estimate countable income, determine income eligibility for SSI benefits, and describe the eligible population. Asset data from a SIPP TM are used to assess resource eligibility. TM data on disability and work limitations are used to assess categorical eligibility among the nonelderly. The date-of-birth field from the SER is used to identify and remove incorrect matches, while participation and benefit status from the SSR are substituted for the less-accurate self-reported SIPP data. As a static microsimulation model, FEM simulates the impact of a policy change while holding constant current demographic and economic conditions, and the policy change and its consequences are assumed to occur instantaneously.
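The logic of a static eligibility simulation of this kind can be sketched as follows. This is a deliberately simplified illustration, not FEM itself: the class names, the three-part eligibility test, and every parameter value are assumptions made for the example, the parameter values are placeholders rather than actual program rules, and the counts are unweighted, whereas a real model would apply survey weights.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rules:
    """Hypothetical program parameters (placeholders, not actual SSI rules)."""
    max_benefit: float       # monthly maximum benefit
    income_exclusion: float  # monthly income disregarded before the income test
    resource_limit: float    # countable-asset threshold


@dataclass(frozen=True)
class Unit:
    """One simulated unit built from survey data (hypothetical layout)."""
    age: int
    disabled: bool         # would come from disability and work-limitation TM data
    monthly_income: float  # would come from the monthly core interview
    assets: float          # would come from the assets TM


def countable_income(u: Unit, r: Rules) -> float:
    """Monthly income remaining after the exclusion, floored at zero."""
    return max(0.0, u.monthly_income - r.income_exclusion)


def eligible(u: Unit, r: Rules) -> bool:
    """Apply categorical, income, and resource tests to one unit."""
    categorical = u.age >= 65 or u.disabled
    income_ok = countable_income(u, r) < r.max_benefit
    resource_ok = u.assets <= r.resource_limit
    return categorical and income_ok and resource_ok


def count_eligible(units: list, r: Rules) -> int:
    """Unweighted count of eligible units under a given set of rules."""
    return sum(eligible(u, r) for u in units)


baseline = Rules(max_benefit=700.0, income_exclusion=20.0, resource_limit=2000.0)
reform = replace(baseline, resource_limit=3000.0)  # hypothetical reform: higher asset limit

units = [
    Unit(70, False, 500.0, 1500.0),  # passes all tests under both scenarios
    Unit(68, False, 400.0, 2500.0),  # fails the resource test at baseline only
    Unit(40, False, 300.0, 500.0),   # not categorically eligible
    Unit(45, True, 900.0, 1000.0),   # countable income too high
]

print(count_eligible(units, baseline), count_eligible(units, reform))
```

The static character of the approach shows in the fact that the same `units` are evaluated under both rule sets: only the rules change, and no behavioral or demographic response is modeled.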
Using an earlier version of FEM with a database drawn from the 1990 SIPP panel, Davies et al. (2001/2002) compared the characteristics of eligible participants and nonparticipants ages 65 and older and contrasted these with the characteristics of elderly ineligible individuals. They also applied the model to estimate the simulated impact of five policy changes that altered the income exclusions, the maximum benefit, and the asset eligibility thresholds. To assess the effects on participation, they used a probit model of participation to assign participation status to individuals simulated as newly eligible. Outcomes estimated at the micro level for each scenario included SSI eligibility and participation, average SSI benefits, the poverty rate, and the poverty gap. The model also estimated total SSI program costs for elderly participants.
The SSA also uses SIPP for research analyses that do not require microsimulation. For example, Iams and Purcell (2013) examined the effect of including in family income the distributions from individual retirement accounts, Keogh accounts, defined contribution pension plans, and lump-sum payments from all types of pension and retirement plans. These sources of income are excluded from SIPP’s summary measure of family income and are largely excluded from the CPS ASEC–based official measures of annual household income and poverty. The authors found that among families headed by persons ages 65 and older, 19 percent received income from these sources in 2009. Adding these sources raised the median income of these families by 18 percent and their mean family income by 15 percent. The size of the percentage increase in mean family income was inversely related to family income quartile, with a 36 percent increase in the bottom quartile and a 10 percent increase in the top quartile. Including such distributions in family income is consistent with employers’ shift from providing defined benefit pension plans (the distributions from which are counted in family income in the official poverty measure) to defined contribution pension plans, which are generally not counted.3
3 The recent revision of the CPS ASEC income questionnaire broadens the definition of retirement income by asking respondents to report all withdrawals from all types of retirement accounts.
Fisher (2007) highlighted an important difference between the CPS ASEC and SIPP in showing the degree of reliance on Social Security as a source of income among aged beneficiaries. Two SSA publications, Income of the Population 55 or Older and Income of the Aged Chartbook, report a statistic calculated from the CPS ASEC for Social Security beneficiary “units” (single individuals or couples) classified as aged, meaning that the unit included someone 65 and older. With the March 1997 CPS (based on income in 1996), the fraction of aged beneficiary units with only Social Security income was 17.9 percent, but it was less than half that level (8.5 percent) when estimated with SIPP for the same reference year. SIPP’s identification of more income sources for this population accounted for the difference, although SIPP also estimated larger amounts of income. With the CPS, the fraction of aged beneficiary units receiving at least 90 percent of their income from Social Security was 30.4 percent, compared to 19.9 percent with SIPP.
Iams and Purcell (2013) reported on a possible factor contributing to the latter finding. Both surveys request gross benefit amounts from their respondents—that is, amounts received prior to subtracting the Medicare premium that many beneficiaries pay. Comparing survey reports of Social Security benefits among beneficiaries 60 and older to the actual payments recorded in Social Security administrative records, the authors found that the values reported in the CPS (for calendar year 2008) approximated the gross Social Security benefit, whereas the values reported in SIPP (for calendar year 2009) approximated the net Social Security benefit (i.e., after subtraction of the Medicare premium). The difference between the mean gross benefit and the mean net benefit at the time was about $1,000.
National Center for Education Statistics
The National Center for Education Statistics (NCES) uses SIPP as an alternative source of estimates to compare with estimates from its own surveys on closely related topics. Recent areas of comparison have included nonparental care of preschool children, child care arrangements more generally, and participation in early childhood programs. About 10 years ago, there were comparisons of numerous questions measuring educational attainment. NCES staff has worked with Census Bureau staff to develop new item wording and placement for measuring educational attainment. These changes are used in the CPS as well as in SIPP and the NCES surveys. More recently, NCES staff has worked with SIPP staff to develop new questions on adult education certification.
U.S. Government Accountability Office
The Education, Workforce, and Income Security group within the Government Accountability Office (GAO) has used SIPP core and topical module data to prepare reports related to women’s retirement security. For GAO, the SIPP TMs have proven to be especially important. The most recent report on women’s retirement security (U.S. Government Accountability Office, 2012) used TM data on annual income and retirement accounts and on retirement expectations and pension coverage. The report, which was prepared in response to a request from the Senate Committee on Aging, covered defined benefits, pension behavior by gender, the composition of retirement income (broken down by traditional pensions, Social Security, and 401(k) plans), and how portfolios varied by demographic characteristics. Specifically, the report used SIPP data from the 1996 through 2008 panels to (1) compare women’s and men’s access to and participation in retirement plans sponsored by their employers and how these have changed over time, and (2) compare women’s retirement income to men’s and determine how the composition of this income responded to changing economic conditions and trends in pension design.
GAO found that from 1998 to 2009, women went from being “equally likely as men” to being “slightly more likely” to work for an employer that offered some type of retirement plan, and women went from “somewhat less likely” to “almost equally likely” to participate in a plan for which they were eligible. However, women over 65 had less retirement income on average and higher poverty rates than elderly men. The difference in poverty rates was most striking among those who were never married, and this difference grew with age.
A follow-up report examined trends in marriage and labor force participation and their projected impact on the future retirement income of both women and men (U.S. Government Accountability Office, 2014). The analysis used multiple data sources, including data from the 1996 and 2008 SIPP panels plus the restricted-use “Gold Standard” file, which is a multipanel SIPP file with matched Social Security earnings and benefits records.4 In addition to core data, the analysis used SIPP TM data on employment history and fertility history collected in wave 1 of each panel and marital history TM data collected in wave 2. The employment history and fertility history data were used to estimate how much time a woman spent out of the workforce providing child care. The marital history data were used to determine whether a woman would qualify for Social Security spousal benefits. The Gold Standard data were used to compare the characteristics of women who were recipients of Social Security spouse-only benefits versus retired-worker benefits or dual worker and spouse benefits. The Gold Standard file was used because its administrative data on Social Security benefit receipt and earnings were more reliable than the self-reports on the public use file. The GAO report suggested that the combination of a rising age at marriage, an increased proportion of single-parent households, and rising labor force participation among married women contributed to a decline in the fraction of women receiving spousal and survivor benefits and an increase in the fraction of household retirement savings contributed by married women. Other GAO conclusions were that (1) women with low levels of lifetime earnings and no spouse or spousal benefit may face the risk of an increased incidence of poverty in old age; and (2) the trends in marriage and single-parenting imply that the fraction of women who face that risk is likely to grow.
4 The Synthetic Beta file is a public use version of the Gold Standard file that was created by modeling the relationships in the matched file and using these models to generate synthetic records that do not replicate any actual persons.
Congressional Budget Office
The Congressional Budget Office (CBO) uses SIPP—along with other surveys—for numerous cost estimates and projections, for example:
- CBO’s main Health Insurance Simulation Model is built around SIPP and is used for estimates and projections of Affordable Care Act coverage. SIPP was selected because of its large sample and breadth of relevant information, including coverage at a specific point in time, presence of an offer of employment coverage, detailed income data, health status, and family structure.
- CBO’s long-term microsimulation model, CBOLT, uses SIPP data linked with administrative earnings records to answer budgetary and distributional questions about Social Security, Medicare, and other programs; simulate reforms; and estimate costs and savings of proposed changes over a horizon of many decades.
- CBO often turns to SIPP to fill data gaps in miscellaneous one-time budget estimates.
- CBO also uses SIPP for research related to SNAP, disability insurance, child nutrition, and other low-income programs.
Economic Research Service
The Economic Research Service (ERS) of the U.S. Department of Agriculture uses SIPP data to aid in understanding food assistance program participation and eligibility, including SNAP and the National School Lunch Program. ERS is particularly interested in the characteristics of participants as well as in spells and transitions. For these uses, SIPP offers several features that are not available from other surveys:
- Identification of who in the household is covered by SNAP benefits and the months in which SNAP benefits were received.
- A shorter recall period that contributes to higher data quality than is found in other longitudinal surveys.
- An ability to trace households and individuals over time.
- More complete income and assets for measuring program eligibility.
- More detailed household composition and intrahousehold relationships.
- Finer-grained temporal measurement of transitions compared to the annual transitions captured in other surveys.
Items collected in the SIPP TMs play an important role in ERS’s analysis of the food assistance programs, although staff has concerns about response rates and the overall quality of some of the TM data.
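The advantage of monthly data for measuring spells and transitions, noted in the list above, can be illustrated with a brief Python sketch. The 12-month receipt history is invented for the example, and the function names are hypothetical.

```python
from itertools import groupby


def spells(monthly: list) -> list:
    """Return (status, start_index, length) for each run of identical monthly statuses."""
    out = []
    start = 0
    for status, run in groupby(monthly):
        length = len(list(run))
        out.append((status, start, length))
        start += length
    return out


def transitions(monthly: list) -> tuple:
    """Count month-to-month entries (off to on) and exits (on to off)."""
    pairs = list(zip(monthly, monthly[1:]))
    entries = sum(1 for prev, cur in pairs if not prev and cur)
    exits = sum(1 for prev, cur in pairs if prev and not cur)
    return entries, exits


# Hypothetical SNAP receipt indicator for 12 consecutive months.
snap = [False, False, True, True, True, False, False, True, True, True, True, True]

print(spells(snap))
print(transitions(snap))
print(any(snap))  # an annual "any receipt" measure collapses the history to one value
```

In this invented history, the monthly series distinguishes two separate participation spells with an exit between them, whereas an annual indicator of any receipt records only that the household participated at some point in the year.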
Food and Nutrition Service
The Food and Nutrition Service (FNS) within the U.S. Department of Agriculture is another major SIPP user. The agency relies on SIPP as the principal data source for its policy simulations of SNAP, which are performed by its contractor Mathematica Policy Research, and for research on topics such as the short-term dynamics of participation in its nutrition programs. Based originally on the March CPS, the contractor’s SNAP microsimulation modeling has migrated largely to SIPP. The MATH SIPP+ model simulates program eligibility for a specific calendar month—currently August—using core data for the month as well as a wide range of additional variables drawn from multiple TMs attached to other waves. The TMs supply assets, dependent care expenses, shelter costs, and immigration status. Even the CPS ASEC–based model uses SIPP as the data source for imputing a variety of items not captured in the annual survey.
FNS frequently turns to SIPP to answer questions such as the following:
- How does breastfeeding cessation by WIC moms affect WIC eligibility status?
- How does the use of monthly versus annual income change understanding of who is eligible for WIC or SNAP?
- How does changing the length of certification for a category of WIC participants affect WIC eligibility status?
FNS staff stated that SIPP is the only data source suitable for addressing questions such as these. Without SIPP, such questions either could not be answered or could not be answered nearly as well.
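The second question in the list above, on monthly versus annual income, can be made concrete with a small Python sketch. The income stream and the income limit are hypothetical, the function names are invented for the example, and the single gross-income test shown here is a simplification rather than the actual WIC or SNAP rules.

```python
def annual_eligible(monthly_income: list, monthly_limit: float) -> bool:
    """Eligibility judged on annual income against 12 times the monthly limit."""
    return sum(monthly_income) <= 12 * monthly_limit


def eligible_months(monthly_income: list, monthly_limit: float) -> list:
    """Calendar months (1-12) in which income is at or below the monthly limit."""
    return [
        month
        for month, income in enumerate(monthly_income, start=1)
        if income <= monthly_limit
    ]


# Hypothetical household: steady earnings, then a job loss starting in month 9.
income = [3000.0] * 8 + [900.0] * 4
limit = 2000.0  # hypothetical monthly income limit

print(annual_eligible(income, limit))
print(eligible_months(income, limit))
```

In this example the household appears ineligible when judged on annual income but is income-eligible in each of the last four months, which is the kind of difference that only a survey with monthly income data can reveal.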
CONCLUSION 3-1: Owing to its detailed monthly data, representation of short-term dynamics, and topical modules, the Survey of Income and Program Participation (SIPP) has supported a wide range of research applications, including both cross-sectional and longitudinal analyses as well as studies using linked administrative records. A literature search identified 638 unique works based on SIPP data between 2000 and 2014, excluding Census Bureau publications and working papers, covering 26 general topics. The panel also documented uses of SIPP data by six U.S. government agencies. Unique to SIPP is its integral role in microsimulation models used in policy analysis by three of these agencies. The Social Security Administration, Food and Nutrition Service, and Congressional Budget Office rely heavily on SIPP data in their policy models, which would be severely limited without SIPP data.