





4
Survey Design

In this chapter we review and compare the current SIPP design and available alternatives in light of our recommended goals for the survey. In Chapter 5 we do the same for the SIPP data collection and processing system. From both reviews we conclude that changes in the design and operation of SIPP would enhance the utility of the data and increase the cost-effectiveness of the SIPP program.

MAJOR DESIGN ELEMENTS AND ALTERNATIVES

The design of a continuing panel survey such as SIPP includes several components, each of which affects the quality and utility of the data and the costs of data collection, processing, and use. In this section we consider the following major design elements: the number of interviews or waves in each panel; the length of the reference period covered by each interview; the length of each panel (a function of the number of interviews and the reference period length); the frequency with which new panels are introduced; and the total initial sample size for each panel. We also consider the advantages and disadvantages of spreading out the workload by interviewing portions of the sample (called rotation groups) each month rather than interviewing the entire sample at the same time for each wave.

In the last two sections we consider aspects of the SIPP sample design, namely, the use of oversampling to increase the sample size for the low-income population and the rules for following people that determine who is included in the sample for each panel over time.

The major design components listed above cannot be assessed in isolation. They interact in a number of ways. Given a fixed budget that puts a ceiling on the number of interviews that can be fielded each year, a change in one of the design elements will generally necessitate an offsetting change elsewhere. For example, an increase in panel length must be offset by one or more of the following changes: a reduction in the frequency with which new panels are introduced, a reduction in the sample size per panel, or an increase in the reference period length for each interview wave.

Current SIPP Design

SIPP is a true panel survey, in that it follows individual people, including those who change their address, in contrast to quasi-panel surveys, such as the Current Population Survey (CPS) and Consumer Expenditure Survey (CEX), which return to the same address and interview the people who currently reside there. To obtain the sample for each SIPP panel, a list of addresses is designated for interviewers to visit in the first wave. Typically, about 75-80 percent of the addresses represent occupied housing units whose occupants are eligible for the survey; the rest are vacant, demolished, or nonresidential units. Of the eligible households, 92-95 percent of the residents usually agree to participate in the survey (Bowie, 1991). The adult members of these households (people aged 15 and over) are deemed original sample members. Each of them is followed until the end of the panel or until the person leaves the universe (e.g., by dying, entering an institution, or moving abroad) or the sample (e.g., by refusing to continue to be interviewed, moving to an unknown address, or moving outside the area covered by the SIPP interviewing staff).1 Children of original sample members are followed as long as they reside with an original sample adult, and adults and children who join the household of an original sample adult are included in the panel as long as they remain in that household.

The basic SIPP design calls for members of each panel to be interviewed at 4-month intervals over a period of 32 months for a total of eight interview waves. (One-half of the 1984 SIPP panel was interviewed nine instead of eight times.)

1 People who move to an address more than 100 miles from a SIPP primary sampling unit (PSU) area are not followed, although interviewers are instructed to conduct telephone interviews with them if possible. Almost 97 percent of the U.S. population lived within 100 miles of the sample PSUs for the 1984 panel (Jabine, King, and Petroni, 1990:16). Attempts are made to keep track of people who enter institutions so that if they leave the institution at a later point during the life of the panel, they can be brought back into the panel.
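The approximate wave 1 yield implied by these figures can be illustrated with a minimal sketch in Python; the address count and the midpoint rates are illustrative assumptions, not actual SIPP figures.

    # Illustrative yield from a designated address list, using midpoints of the
    # ranges cited above (75-80 percent of addresses occupied and eligible;
    # 92-95 percent of eligible households responding at wave 1).
    designated_addresses = 26_000        # hypothetical count, not a SIPP figure
    occupied_eligible_rate = 0.775       # midpoint of 75-80 percent
    wave1_response_rate = 0.935          # midpoint of 92-95 percent

    eligible_households = designated_addresses * occupied_eligible_rate
    interviewed_households = eligible_households * wave1_response_rate
    print(round(eligible_households), round(interviewed_households))
    # roughly 20,150 eligible households, about 18,840 interviewed at wave 1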

A new panel is introduced each year. To even out the interviewing workload, the sample for each panel is divided into four rotation groups, one of which is interviewed every month. Interviewing for the first 1984 SIPP panel began for the first rotation group in October 1983; interviewing for all subsequent panels has begun in February (see Committee on National Statistics [1989:Table 2-1] for an illustration of the rotation group design). Each interview includes a set of core questions about income, program participation, and employment. In most cases, information is requested on these subjects for each of the 4 preceding months. Each interview also includes one or more modules on specific topics that are administered only once or twice in each panel. (See Tables 3-1, 3-2, and 3-13 in Chapter 3 for information on the questionnaire content.)

The sample design for SIPP is a multistage clustered probability sample of the population in the 50 states and the District of Columbia that only excludes inmates of institutions and those members of the armed forces living on post without their families. There is currently no oversampling of specific population groups in SIPP, with one exception: the 1990 panel includes about 3,800 extra households continued from the 1989 panel, selected because they were headed by blacks, Hispanics, or female single parents at the first wave of the 1989 panel.

The initial sample size for the first 1984 SIPP panel was about 21,000 eligible households, with the expectation that, by combining two panels of that size, users would be able to obtain a total sample size of about 37,000-38,000 households.2 However, budget cuts necessitated an 18 percent reduction in the sample size midway through the 1984 panel (beginning with wave 5). Initial sample sizes for the 1985 through 1989 panels and the 1991 panel were only 12,500 to 14,500 eligible households (and the 1985 panel sample was further reduced beginning with wave 4). The initial sample size for the 1990 panel was about 23,600 eligible households; however, to fund this larger size, the Census Bureau had to terminate the 1988 and 1989 panels at six and three interviews, respectively. Budget cuts also necessitated limiting the 1986 and 1987 panels to seven rather than eight waves.3

The Census Bureau received sufficient funding for fiscal 1992 to enable it to return to the original SIPP design. The 1992 panel began in February with an estimated initial sample of 21,600 eligible households whose original sample members will be interviewed for eight waves.

2 Attrition reduces the number of actual cases that can be obtained by combining early waves of one panel with later waves of another, although new household formation by original sample members somewhat offsets this effect.

3 Also, for other reasons, one rotation group in the 1985 and 1986 panels received one less wave than the other three groups (i.e., seven instead of eight waves in the 1985 panel and six instead of seven waves in the 1986 panel; see CNSTAT [1989:Table 2-1]).

It is expected that subsequent panels will be funded at about the same level.

User Views

In considering whether to recommend any changes to the SIPP design, we consulted with researchers and policy analysts working in a range of relevant subject areas. We asked them to assess the usefulness of the data produced by the current SIPP design and to suggest design modifications that they thought would improve data quality and utility (see Chapter 2). Virtually without exception, these SIPP data users indicated that the sample size per panel, particularly for panels with sample size reductions due to budget cutbacks, is too small to support analysis of many of the subgroups of most interest, such as participants in assistance programs. Users view the option of combining panels in order to increase sample size as cumbersome; moreover, combining panels is not an option for such uses as longitudinal analysis of a single panel or analysis of a variable topical module that was asked in only one panel.

Users differ in their opinions on other major design elements, depending on their interest in longitudinal or cross-sectional applications of the data. Users who value most the longitudinal information from SIPP support increasing the length of each panel to provide an improved capability to study transitions and spells of program participation and other behaviors. Longer panels would increase the sample size of events of interest, such as marital status or job changes or program exits and entrances, and would provide longer periods of observation before and after these events for analyzing their antecedents and consequences. Longer panels would also reduce the "right-censoring" problem, that is, the problem that the duration of some spells is not known because they are still in progress when the panel ends. Users most often suggest extending SIPP panels to 5 years, although some users would be satisfied with extending them to 4 years; at least one user has suggested lengthening SIPP panels to 10 years to permit the data to be used to study welfare dependency and persistent poverty (Manski, 1991).

In order to increase sample size and panel length, many users of the longitudinal data say they are willing to live with longer reference periods for each interview, thereby decreasing the number of interviews per year, typically from three 4-month to two 6-month waves. They are also quite willing to reduce the frequency with which new panels are introduced, perhaps introducing a new panel every 2 or 3 years instead of every year.

Users who are more concerned about cross-sectional applications, such as describing the characteristics of program participants at any given time
and estimating the likely effects of a program change using comparative static microsimulation modeling techniques,4 have a different viewpoint. These users are worried about proposals to reduce the frequency with which new panels are introduced because they assume that estimates based on a panel that has been in the field for longer than a year will exhibit higher levels of error than estimates based on a "fresh" panel. They are also loath to increase the reference period of the interviews, assuming that longer recall periods will reduce the quality of the monthly data that are needed for program analysis. (Users who are concerned with fine-grained longitudinal analysis of program dynamics, i.e., analysis of short spells and intrayear changes in participation and related characteristics within the context of a longer panel, also share this view.)

The views of Census Bureau staff have tended in the past to coincide with those of analysts who are most interested in cross-sectional applications of SIPP. The original plans called for the Census Bureau to publish improved annual and subannual income statistics using core SIPP data. From this perspective, yearly refreshment of the sample appeared highly desirable, as did short reference periods. However, for a variety of reasons (see further discussion below), the Bureau has yet to realize this goal. More recently, the Census Bureau staff have tended to emphasize the longitudinal uses of SIPP, arguing for continued use of the March CPS to provide basic annual income and poverty statistics (see Chapter 2).

Staff at the Census Bureau have also argued strongly for design features that they believe promote operational efficiency. Specifically, they have supported using monthly rotation groups in order to spread out the workload for the interviewers. Analysts, in contrast, find that the use of monthly rotation groups complicates data processing (see discussion in a later section). Similarly, Bureau staff made the original decision to have reference periods of 4 months, instead of 6 or 3 months, as a compromise between the need for accurate monthly data and reduced cost of field operations.

Selected Design Alternatives

We could not investigate every design alternative. More important, while we felt it essential to look at designs that could improve the usefulness of the survey for longitudinal applications, we did not want to consider alternatives that undercut the uniqueness of SIPP: namely, that it is the only household survey that provides monthly data for fine-grained analysis of changes in income and program dynamics on a short- to medium-term basis.

4 Microsimulation models of such programs as Aid to Families with Dependent Children (AFDC) and food stamps typically create an average monthly snapshot of the population, simulating program eligibility and participation under current program regulations and then simulating what the differences would be if program provisions were modified (e.g., if benefits were liberalized). Historically, these models have used the March CPS as their microlevel database, employing information from such sources as the Income Survey Development Program (ISDP) and SIPP to allocate the annual CPS employment and income data to months. Several models of the food stamp program have been built directly from SIPP cross-sectional monthly data; see Citro and Hanushek (1991a, 1991b).

Hence, we did not give serious attention to extending the panel length beyond 5 or 6 years nor the reference period length beyond 6 months at most.5 Other surveys, such as the Panel Study of Income Dynamics (PSID), will continue to serve users interested in analysis of longer term dynamics. Moreover, because of our conclusion that SIPP, not the March CPS, should serve as the primary source of the nation's income statistics, we did not believe it appropriate to consider alternatives that could seriously affect SIPP's ability to provide reliable cross-sectional estimates. Our concern that any design change not cause major problems for Census Bureau operations also influenced our deliberations.

Below we sketch the basic features of five alternatives: the current (fully funded) design and four designs intended to provide somewhat longer periods of observation with varying panel and reference period lengths and frequency with which new panels are introduced. For each design, we calculate the sample size per panel under the assumption of a fixed field budget that supports 160,000 interviews per year once a design is fully phased in.6 The total of 160,000 interviews per year is the number entailed by full implementation of the original SIPP design, that is, each year having a new panel that is interviewed three times, a panel in its second year that is interviewed three times, and a panel completing its term that is interviewed two times, with all panels having an original sample size of 20,000 eligible households. Note that none of the other designs has more than two panels in the field at the same time.

5 However, in the section on sample design considerations, we discuss extending the length of SIPP panels for a longer period than whatever is the standard length for the full sample, for subgroups of interest, as a means of adding sample size and longitudinal information for the subsampled groups.

6 Attrition will reduce the number of required interviews: eligible households that do not respond in the first wave are dropped from the sample; eligible households that subsequently fail to respond are pursued for one more interview before being dropped. Formation of new households by original sample members will somewhat offset the effects of attrition. Also, at the first wave, an additional 4,000-5,000 visits are required to addresses that turn out to be vacant, demolished, or nonresidential (i.e., not eligible). Because of budget cuts, the Census Bureau has actually fielded no more than about 100,000-120,000 interviews in most years. Note that, for simplicity, we assume that interviews carry the same average cost under each design, that is, that the cost of a 6-month recall interview is the same on average as the cost of a 4-month interview. We also do not take into account any extra data collection costs that could result for longer panels from greater dispersion of the sample due to geographic mobility.
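As a check on the fixed-budget arithmetic used in the comparisons that follow, here is a minimal sketch in Python; it simply restates the constraint described above and is not a Census Bureau costing model.

    # Sample size per panel implied by a fixed field budget of about 160,000
    # interviews per year, given the number of interviews each panel receives
    # per year and the number of panels in the field at the same time.
    ANNUAL_INTERVIEW_BUDGET = 160_000

    def households_per_panel(interviews_per_panel_per_year, panels_in_field):
        return ANNUAL_INTERVIEW_BUDGET // (interviews_per_panel_per_year * panels_in_field)

    # Original SIPP design: three overlapping panels interviewed 3, 3, and 2
    # times in a year, each with 20,000 eligible households.
    print(20_000 * (3 + 3 + 2))          # 160000 interviews per year
    # Two panels in the field with two 6-month waves per year (designs A, C, D):
    print(households_per_panel(2, 2))    # 40000 households per panel
    # Two panels in the field with three 4-month waves per year (design B):
    print(households_per_panel(3, 2))    # 26666, about the 26,650 cited below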

Current Design: Start a new panel every year; run each panel for 32 months and interview in 4-month waves, for a total of eight interviews. The sample size per panel is 20,000 originally eligible households.

Alternative Design A: Start a new panel every 2 years; run each panel for 4 years (48 months) and interview in 6-month waves, for a total of eight interviews (two per year). The sample size per panel is 40,000 originally eligible households. (Two interviews times two panels times 40,000 equals 160,000 interviews per year.)

Alternative Design B: Start a new panel every 2 years; run each panel for 4 years and interview in 4-month waves, for a total of 12 interviews (3 per year). The sample size per panel is 26,650 originally eligible households. (Three interviews times two panels times 26,650 equals 160,000 interviews per year.)

Alternative Design C: Start a new panel every 2-1/2 years; run each panel for 5 years and interview in 6-month waves, for a total of 10 interviews (2 per year). The sample size per panel is 40,000 originally eligible households. (Two interviews times two panels times 40,000 equals 160,000 interviews per year.)

Alternative Design D: Start a new panel every 3 years; run each panel for 6 years and interview in 6-month waves, for a total of 12 interviews (2 per year). The sample size per panel is 40,000 originally eligible households. (Two interviews times two panels times 40,000 equals 160,000 interviews per year.)

We initially considered another very different design that strives to reconcile the widely voiced desire for larger sample size with the view that cross-sectional uses require short reference periods and frequently refreshed samples (Doyle, 1992). In brief, this scheme would encompass two related kinds of surveys: (1) large, annual cross-section surveys, designed to obtain highly robust information for January of each year, and (2) small 2-year panels, introduced annually in midyear as subsets of the cross-sectional samples and designed to provide monthly information from six 4-month waves for limited analysis of program dynamics.

More precisely, this design would do the following: start a new panel every year; field a large initial cross-section and interview once with a 1-month reference period; then, 6 months later (to allow time to draw the subsample), continue a subsample for 2 years, interviewing in 4-month waves, for a total of six interviews (three per year). The cross-section sample size is 55,000 eligible households and each panel subsample includes 17,500 originally eligible households. (55,000 plus three interviews times two panels times 17,500 equals 160,000 interviews per year.) To make the relatively small panels more useful for certain kinds of analysis, Doyle (1992) proposes to oversample a particular target group in each panel: for
example, oversample low-income people in one panel and higher income people the next.

We early on determined that the costs of the Doyle design were likely to outweigh its possible benefits. As a practical concern, the Census Bureau would have to gear up each year for a very large cross-sectional survey and then scale down its operations to handle the much smaller panels. Moreover, the cross-sectional survey component would provide estimates only for the month of January, while the panel survey component would provide longitudinal data only for 2 years for small samples.7 This design also introduces new panels on an annual basis, a feature that we argue below is a major complication for SIPP data processing and use under the current design.

Our discussion in the next section considers the likely effects that designs A-D would have on the quality and utility of SIPP data in comparison with the current design. Each design makes tradeoffs within a fixed field budget. For example, design A increases the sample size and overall length of each panel in comparison with the current design, but lengthens the reference period and reduces the frequency with which new panels are introduced. Designs C and D have 6-month reference periods like design A, but further lengthen each panel and reduce the frequency with which new panels are introduced. Design B retains the 4-month reference period of the current design, but provides fewer additional sample cases than the other designs. Our challenge was to assess the implications of these design choices for the "bottom line": the ability of SIPP to provide high-quality, relevant data for research and policy analysis related to income and program participation.

In considering alternative choices of panel length and number of interviews, we focused on the implications for errors in panel survey estimates due to the following factors:

- attrition, or the cumulative loss from the sample over time of people who cannot be located or no longer want to participate, which can bias survey estimates and also reduce the sample size available for analysis;

- time-in-sample effects, or changes in respondents' behavior or reporting of their behavior due to their continued participation in the survey; and

- censoring of spells of program participation, poverty, and other behaviors, that is, the failure to observe the beginning and ending dates of all spells within the time span covered by the panel. (We also considered the implications of panel length for analysis of transitions and spells more generally.)

7 The proposed solution to the problem of small sample sizes, namely, to oversample different groups each year, would complicate the design and use of the survey.

In considering the choice of length of reference period, we focused on two kinds of errors:

- respondents' faulty recall, which is usually assumed to get worse as the period about which the respondent is queried is farther away; and

- a related phenomenon known as the "seam" problem, whereby more changes (e.g., transitions in program participation or employment or changes in benefit amounts) are reported between months that span two interviews (e.g., the last month covered by wave 1 and the first month covered by wave 2) than are reported between months that lie entirely within the reference period of one interview.

In considering the choice of how often to introduce new panels, we looked at the possible reductions in error for cross-sectional estimates (reductions both in sampling error and in bias from attrition and time-in-sample effects) afforded by the opportunity to use newer panels. We also looked at the negative effects of more frequent panels, one of which is a reduction in sample size available for longitudinal analysis of single panels. Negative effects can also stem from what we term the "complexity factor": specifically, having multiple panels in progress at the same time can increase the burden on interviewers and data processing operations, which, in turn, can introduce errors and reduce timeliness of data products. A complex design can also affect the costs to users of accessing and analyzing the data. Finally, given the importance of sample size to users, we considered the implications of alternative sample sizes for cross-sectional and longitudinal uses of the data.

We attempted, whenever possible, to quantify the relationships of the various design dimensions to the various sources of error.8 Such quantification is highly desirable for making informed choices among design alternatives. For example, in considering the optimum panel length and number of interviews, it is not enough to note that attrition bias and time-in-sample effects are assumed to worsen as a function of the number of interviews and also, perhaps, of the overall length of the panel, and that censoring is reduced with an increase in panel length. One needs to know the relative size of these effects and their implications for important uses of the data. Unfortunately, the literature does not always provide clear guidance, and, ultimately, we have relied on our professional judgments in recommending design changes to SIPP.

8 Other sources of nonsampling error appear related primarily to questionnaire design and data collection procedures and hence are not discussed here. They include undercoverage of population groups in the survey (see Chapter 7), nonresponse to specific questionnaire items that is not a function of length of recall, and reporting errors that are not a function of length of recall. Jabine, King, and Petroni (1990) provide an excellent review of the literature on sampling and nonsampling errors in SIPP. Other useful sources are Kalton, Kasprzyk, and McMillen (1989); Lepkowski, Kalton, and Kasprzyk (1990); and Marquis and Moore (1989, 1990a).
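The seam definition above can be made concrete with a minimal sketch in Python (an illustration, not part of the panel's analysis); it lists which month-to-month transitions are reported across two different interviews for a given reference period length.

    def seam_pairs(reference_period_months, months_observed):
        """Month pairs (m, m+1) whose two months are covered by different
        interviews, i.e., the transitions that fall on a seam."""
        return [(m, m + 1) for m in range(1, months_observed)
                if m % reference_period_months == 0]

    # With 4-month waves there are three seams in each 12 months of reporting;
    # with 6-month waves there are two.
    print(seam_pairs(4, 13))   # [(4, 5), (8, 9), (12, 13)]
    print(seam_pairs(6, 13))   # [(6, 7), (12, 13)]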

TABLE 4-1  Cumulative Household Noninterview and Sample Loss Rates, 1984-1988 and 1990 SIPP Panels (in percent)

              1984 Panel              1985 Panel              1986 Panel
Wave     Type A  Type D    Loss   Type A  Type D    Loss   Type A  Type D    Loss
  1         4.9             4.9      6.7             6.7      7.3             7.3
  2         8.3     1.0     9.4      8.5     2.1    10.8     10.8     1.5    13.4
  3        10.2     1.9    12.3     10.2     2.7    13.2     12.6     2.3    15.2
  4        12.1     2.9    15.4     12.4     3.4    16.3     13.8     3.0    17.1
  5        13.4     3.5    17.4     14.0     4.1    18.8     15.2     3.7    19.3
  6        14.9     4.1    19.4     14.2     4.8    19.7     15.2     4.3    20.0
  7        15.6     4.9    21.0     14.4     5.2    20.5     15.3     4.8    20.7
  8        15.8     5.7    22.0     14.4     5.5    20.8
  9        15.8     5.7    22.3

              1987 Panel              1988 Panel              1990 Panel (a)
Wave     Type A  Type D    Loss   Type A  Type D    Loss   Type A  Type D    Loss
  1         6.7             6.7      7.5             7.5      7.1             7.1
  2        11.1     1.5    12.6     11.4     1.5    13.1     10.9     1.5    12.6
  3        11.5     2.6    14.2     12.0     2.3    14.7     11.5     2.5    14.4
  4        12.3     3.3    15.9     13.0     3.0    16.5     12.6     3.3    16.5
  5        13.7     4.1    18.1     13.9     3.3    17.8     13.7     4.5    18.9
  6        13.6     4.9    18.9     13.6     4.0    18.3     14.1     5.2    20.1
  7        13.6     4.9    19.0                              14.3     5.8    21.0
  8                                                          N.A.    N.A.    N.A.

NOTES: Differences in rates for the 1984 panel in comparison with subsequent panels may be due in part to differences in the sample design. Rates are not shown for the 1989 panel because it lasted only 3 waves. Type A noninterviews consist of households occupied by persons eligible for interview and for whom a questionnaire would have been filled if an interview had been obtained. Reasons for Type A noninterview include: no one at home in spite of repeated visits, temporarily absent during the entire interview period, refusal, and unable to locate a sample unit. Type D noninterviews consist of households of original sample persons who are living at an unknown new address or at an address located more than 100 miles from a SIPP PSU and for whom a telephone interview is not conducted. The sample loss rate consists of cumulative noninterview rates adjusted for unobserved growth in the noninterview units (created by splits).

(a) Rates for 1990 are for the nationally representative portion of the sample; they exclude the households that were continued from the 1989 panel.

N.A., Not available.

SOURCE: Data from Jabine, King, and Petroni (1990:Table 5.1) and unpublished tables from the Census Bureau.

Attrition

All household surveys are subject to unit nonresponse, that is, the failure to locate or obtain the cooperation of some fraction of the eligible households (or of individual members of otherwise cooperating households). Panel surveys are also subject to wave nonresponse, or attrition, at each successive interview.9

9 More precisely, total sample loss at each interview, or total wave nonresponse, includes attrition per se, that is, nonresponse by households that are never brought back into the survey, plus nonresponse of households that miss a wave but are successfully interviewed at the next wave. (In SIPP, households that miss two interviews in a row are dropped from the survey.) In addition, in every SIPP interview, there are "Type Z" nonrespondents, that is, individual members of otherwise cooperating households for whom no information is obtained, either in person or by proxy.

SURVEY DESIGN 101 1987 Panel Noninterview 1988 Panel Noninterview 1990 Panel Noninterview Type A Type D Loss Type A Type D Loss Type A Type D Loss 6.7 6.7 7.S - 7.5 7.1 7.1 11.1 1.5 12.6 11.4 1.5 13.1 10.9 1.5 12.6 1 1.5 2.6 14.2 12.0 2.3 14.7 1 1.5 2.5 14 4 12.3 3.3 15.9 13.0 3.0 16.5 12.6 3.3 16.5 13.7 4.1 18.1 13.9 3.3 17.8 13.7 4.5 18.9 13.6 4.9 18.9 13.6 4.0 18.3 14.1 5.2 20.1 13.6 4.9 19.0 14.3 5.8 21.0 N.A N.A N.A The sample loss rate consists of cumulative noninterview rates adjusted for unobserved growth in the Noninterview units (created by splits). aRates for 1990 are for the nationally representative portion of the sample; they exclude the households that were continued from the 1989 panel. N.A., Not available. SOURCE: Data from Jabine, King, and Petroni (1990:Table 5.1) and unpublished tables from the Census Bureau. Attrition reduces the number of cases available for analysis including the number available for longitudinal analysis over all or part of the time span of a panel and the number available for cross-sechonal analysis from later interview waves and thereby increases the sampling error or variance of the estimates. More important, people who drop out may differ from those who remain in the survey. To the extent that adjustments to the weights for survey respondents do not compensate for these differences, estimates from the survey may be biased. Evidence on Attrition To date, the wave nonresponse rates from SIPP show a definite pattern (see Table 4-1~. Total sample loss in the 1984-1988 and 1990 panels is highest at the first and second interviews 5-8 percent of eligible households at wave 1 and an additional 4-6 percent of eligible households at wave 2. Thereafter, the additional loss is only 2-3 percent in each of waves 3-5 and less than 1 percent in each subsequent wave.~ By iOIndeed, looking closely at later panels in comparison with earlier ones, the numbers sug- gest that SIPP interviewers are experiencing somewhat less success in obtaining responses from households in waves 1 and 2 of later panels but better success in retaining cooperative households for subsequent waves of later panels.

The first stage in the sampling process for SIPP (as for the March CPS and other household surveys conducted by the Census Bureau) is to use decennial census data to divide the entire United States into primary sampling units (PSUs) of larger counties and independent cities and groups of smaller counties. The larger PSUs are then selected with certainty for the sample; smaller PSUs are grouped into strata and subsampled (174 PSUs were selected for the 1984 SIPP panel and 230 for subsequent panels). The final stages in the sampling process are to obtain addresses in each sampled PSU and select clusters of two to four households for interviewing. The addresses represent a combination of decennial census addresses and addresses that are obtained through field canvasses. The latter include addresses in areas of new housing construction and in areas for which the census address list was incomplete. The 1970 and 1980 censuses formed the basis of the sample design and selection of census addresses for the 1984 and 1985-1994 SIPP panels, respectively (see Jabine, King, and Petroni [1990:Ch. 3] for additional information).

The Census Bureau is currently developing a new sample design for SIPP, based on the 1990 census, that will be implemented beginning with the 1995 panel. The necessary research has been completed to identify and select the PSUs, and work is proceeding on other aspects of implementation.

A new feature of the design will be a provision to oversample low-income households (see Singh, 1991). This change is at the behest of SIPP users. In 1988-1989, the Census Bureau held several meetings with data users who were concerned about the effects of sample size reductions in SIPP due to budget cuts. Users expressed an interest in a larger sample size for a number of subgroups, including (in priority order) low-income people, the elderly, blacks, Hispanics, and the disabled. Several options for oversampling were discussed. Given budget constraints, it became apparent that it would be extremely difficult to implement an oversampling scheme in SIPP prior to the 1995 redesign. To help users in the meantime, the Census Bureau decided to curtail the 1988 and 1989 panels (to six and three waves, respectively), in order to have funds to field a larger sample for the 1990 panel, including a supplemental sample that was continued from the 1989 panel (see the section above on the current SIPP design).

We generally support the goal of oversampling low-income groups in SIPP, which accords with the survey's focus on people who are economically at risk. However, we believe that the Census Bureau's scheme for the 1995 redesign (see below) is not likely to be as effective as it is projected to be in achieving this goal. We present several alternative means of oversampling that we believe the Census Bureau should explore.
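As general background for the oversampling schemes discussed in the rest of this section, the following minimal sketch in Python (the stratum shares and sampling rates are illustrative assumptions, not SIPP design parameters) shows how sampling one stratum at a higher rate produces unequal weights, and uses the standard Kish approximation, design effect of roughly 1 + cv^2 of the weights, to gauge the variance penalty that unequal weights impose on estimates for a group spread across both strata, such as older people.

    # Two-stratum oversampling: stratum H (high concentration of the target
    # group) is sampled at a higher rate than stratum L, so its cases carry
    # smaller weights.  Kish's approximation, deff ~= 1 + cv^2 of the weights,
    # gauges the resulting variance increase for a subgroup whose members are
    # split across the strata in the given proportions.
    def unequal_weighting_deff(share_in_h, rate_h, rate_l):
        share_in_l = 1.0 - share_in_h
        w_h, w_l = 1.0 / rate_h, 1.0 / rate_l            # base weights
        # expected sample counts are proportional to population share x rate
        n_h, n_l = share_in_h * rate_h, share_in_l * rate_l
        mean_w = (n_h * w_h + n_l * w_l) / (n_h + n_l)
        mean_w_sq = (n_h * w_h ** 2 + n_l * w_l ** 2) / (n_h + n_l)
        return mean_w_sq / mean_w ** 2                    # equals 1 + cv^2

    # Doubling the sampling rate in a stratum that contains 20 percent of an
    # older (or other) subgroup raises the variance of estimates for that
    # subgroup by roughly 8 percent under this approximation.
    print(round(unequal_weighting_deff(0.20, 0.002, 0.001), 2))   # 1.08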

Using the 1990 Census for Oversampling in SIPP

In planning for the 1995 sample redesign, Census Bureau staff conducted research on methods for obtaining a larger sample for the low-income population, defined as households with annual income below 150 percent of the poverty threshold. The research also investigated ways to minimize the increase in the variance of estimates for people aged 55 and older that would be expected to result from oversampling the poor and near-poor (given the lower poverty rate for older than for younger people).

The Census Bureau decided to adopt a methodology from Waksberg (1973), which creates two strata within each PSU. The first stratum has a high concentration of the group of interest and is oversampled relative to the second stratum, which has a low concentration of the group of interest. For the SIPP redesign, the 1990 census address list within each PSU will be divided into strata of low-income and higher income households. For households in the 1990 census list that answered the long-form questionnaire (about one-sixth of the total), the determination of income above or below 150 percent of poverty will be made directly. For households that answered the short form, proxy characteristics will be used to make the classification: specifically, the low-income stratum will include female-headed households with children under 18; low-rent households in central cities of metropolitan statistical areas; black and Hispanic households in central cities; and black and Hispanic households in which the head is under age 18 or over age 64. For those blocks of the PSU for which there is no complete census address list (the area frame portion of the sample), the classification will be made using aggregate census information on the proportion of the population below 150 percent of poverty in each block. The low-income and higher income strata in each PSU will be sampled at higher and lower rates so that an oversample of households in the low-income strata is obtained. The extent of oversampling will be restricted by the requirement that the sampling error of estimates for persons aged 55 and older not increase by more than 5 percent.

When the new sample design is introduced in 1995, it is expected that the census address portion of the sample will constitute about 70 percent of the total and the area frame portion about 20 percent. The remaining 10 percent will represent addresses of new construction, for which no oversampling will be performed; obviously, over the course of a 10-year period, this category will grow as a proportion of the total. Moreover, one can confidently expect that the efficiency of the design will decline from what would have obtained in 1990 because of the mobility of the population: for example, by 1995, 1998, or 2003, a low-income household may occupy a sample address that was drawn from the higher income stratum and vice versa.
Also, even when the same household is present in 1995 or 1998 as in 1990, the household may have changed classification from low to higher income or vice versa. The question is how great a deterioration in the efficiency of the design will occur over time.

The Census Bureau conducted research with data from the 1980 census for 27 PSUs to determine the extent of the gains that could be expected from oversampling low-income households in the 1990 census address portion of the sample, assuming that the design was implemented immediately after the census. The results (Singh, 1991:Table 1) showed gains (i.e., decreases in sampling error) for many subgroups of interest to users, such as poor blacks and Hispanics. The Census Bureau also conducted research with data from the American Housing Survey (AHS) on the effects of time on the efficiency of the design and estimated very little increase in sampling error 5 to 15 years after the census date (Singh, 1991:Table 3). The Bureau estimated somewhat higher but still relatively low increases in sampling error due to uniform sampling in the new construction frame and the assumption that stratification will be less efficient at the block level in the area frame compared with the census address frame (Singh, 1991:Table 4).27

Although this research appears encouraging about the proposed oversampling scheme, we remain skeptical. There were many limitations to the research, such as the use of only 27 PSUs from just a few states in the 1980 census analysis and the inability of the analysts to replicate fully the proposed design with the AHS data (households were classified on the basis of proxy characteristics rather than on the basis of their income-to-poverty ratio). We believe that further research on the extent to which the household poverty classification assigned to an address in the census predicts the poverty classification of the household at that address 5 to 15 years later is needed to support the Census Bureau's proposed oversampling scheme. For example, it could be useful to conduct research on the extent to which the household poverty classification of addresses in the 1985 SIPP panel corresponds to the 1980 census classification.28

There is also no opportunity to change any aspect of the design because the Census Bureau plans to draw 10 years' worth of sample for SIPP (and other household surveys) at the same time. Hence, the samples for all of the panels from 1995 to 2005 will be drawn in the same way, using the same characteristics to determine the two strata within each PSU.

27 Chu et al. (1989:2.9-2.11) found that oversampling geographic areas with relatively high percentages of low-income households was not very successful in reducing the sampling errors for estimates of the poverty population in the National Health and Nutrition Examination Surveys. They attributed this outcome to the fact that many poor people live in nonpoverty areas and vice versa.

28 We understand that a match of 1985 SIPP and 1980 census address lists is not likely to be operationally feasible, and we strongly urge the Bureau to take steps to ensure that it will be possible to perform a match of 1995 SIPP and 1990 census lists.

The only exception is that provision has been made to jettison the oversampling and implement a uniform sampling rate for any SIPP panel in the 1995-2005 period if that is later viewed as desirable.

Even assuming the benefits of the proposed scheme, we believe that there are some technical ways in which it could be improved. For example, if the object is to oversample low-income households in SIPP, then the census address portion of the sample could be drawn exclusively from the long-form respondents to the 1990 census, which represent a very large fraction (1 in 6) of the total population. Selection of PSUs on the basis of poverty-related characteristics could also be beneficial. We are pleased that the Census Bureau decided to adopt the same oversampling rate across all PSUs, instead of determining PSU-specific rates as in the original plan that we reviewed. The latter procedure would have allowed the Census Bureau to better control the size of the workload across PSUs, but it would have resulted in variations in the weights for addresses sampled within each stratum (low-income or higher income) across PSUs. (Such weight variations are likely to reduce the sampling error gains.) Also, those PSUs with the highest percentages of low-income households would have had proportionately less oversampling of the low-income stratum compared with wealthier PSUs.

More broadly, we urge the Census Bureau (and the user community) to be clear about the target population in considering the use of oversampling in SIPP. For the redesign, the Census Bureau is essentially defining a cohort of low-income people on the basis of their previous year's household income-to-poverty ratio. However, many people with low incomes at wave 1 will move into a higher income category over the life of a SIPP panel and vice versa (see Short and Littman, 1990; Short and Shea, 1991). Instead of a larger sample for a low-income cohort at the start of a panel, it may be that users would prefer to have a larger sample for people who are at risk of experiencing a spell of low income at any time during a panel or at risk of experiencing a long spell of low income. Different oversampling criteria would be required, depending on the definition of the target population: for example, a combination of variables, such as family type, ethnicity, and previous year's low-income status, may be a better predictor of long-term economic disadvantage than the latter variable alone.

Screening as Another Method of Oversampling

An alternative method for oversampling the low-income population in SIPP is to use a screening interview close to the time when a new panel is to be introduced. This approach could be used to refine the proposed 1990 census-based approach
(if larger-than-needed samples were drawn from the census list) or serve as a substitute for it.

The advantage of screening is that it provides information on which to draw a sample that is close in time to the introduction of the survey and thereby is likely to permit more effective oversampling, since much less mobility or change in classification will have occurred in the interim. Also, screening offers flexibility: the criteria for sampling can be changed as needed (e.g., some panels could oversample minorities instead of low-income households). In addition, screening can be applied uniformly to the entire sample, instead of using different procedures for the census long-form respondents, census short-form respondents, area frame address, and new construction address segments of the sample. In the context of oversampling low-income households while not worsening the estimates for older people, screening should make it possible to develop a more efficient approach to this problem (e.g., also oversampling elderly higher income households).

On the negative side, screening imposes the costs of conducting an interview for a larger number of households than will be selected for the survey, which may necessitate a reduction in the overall sample size. It may also add costs by lessening the ability of the Census Bureau to equalize interviewers' workloads across PSUs (e.g., the screening might result in sample sizes that overtax some interviewers while underemploying others, with little time to make adjustments before the start of the survey).

However, these costs must be viewed in the context of the entire survey, which, in the case of SIPP, will amount to 12 interviews under the proposed redesign. There are also ways to reduce costs. It may be possible to conduct much of the screening using a centralized CATI system that eliminates interviewer travel costs.29 Another way to reduce costs is to treat the screening interview as wave 1 of a SIPP panel instead of as an added interview. In a CAPI environment, the sampling criteria could be built into the interview so that the full wave 1 interview could be administered on the spot to those households selected for the sample.

Another problem with screening, when the purpose is to identify households on the basis of income or poverty, concerns measurement error. Screening interviews are typically short in order to reduce costs. However, studies have shown that respondents tend to underestimate their income in response to brief, general questions (e.g., Chu et al., 1989; Moeller and Mathiowetz, 1990), so that a short screening questionnaire may erroneously classify higher income households as low income, thereby reducing the gains from the oversampling. In addition, some low-income households may be falsely classified as higher income based on their responses to the screening questionnaire. Such households will thus not be oversampled.

29 Telephone numbers would be obtained from directories that are organized by address. Some personal screening would also be required for addresses with no telephones or with an unlisted number.

There is also the issue, noted above, of defining the target population (for example, people at risk of a long-term spell of low income during a SIPP panel rather than a low-income cohort) and defining appropriate variables to use in the screening questionnaire. Despite the problems of a screening approach, we believe that the potential benefits in terms of a more efficient sample design and greater flexibility merit a careful examination of its cost-effectiveness for oversampling the low-income population in SIPP.

Recommendation 4-4: The Census Bureau should investigate alternative methods of oversampling the low-income population in SIPP, including the use of screening interviews as a possible complement to or substitute for an approach based on using information from the 1990 census.

Increasing Sample Size by Extending Panel Length

The Census Bureau's census-based plan and the use of screening do not exhaust the possible approaches for oversampling low-income households or other subgroups in SIPP. Another possibility is to extend the length of one or more SIPP panels for subgroups of interest. This strategy both provides additional longitudinal information for the subsampled cases and makes it possible to treat them as an addition to the sample for the next panel (see David, 1985a). This approach was followed in the 1990 panel, for which the sample includes households from the 1989 panel that were headed by blacks, Hispanics, and female single parents as of wave 1 of that panel.

Users have often expressed interest in periodically extending the length of SIPP panels for people who may be at economic risk because of experiencing a divorce or job loss or for people who benefit from programs or have certain demographic characteristics (e.g., single parents). We are not now recommending that such an approach be built into SIPP because we believe that the Census Bureau confronts a very large agenda in implementing the proposed redesign of 4-year panels introduced every 2 years together with computer-assisted interviewing and an improved database management system (see Chapter 5). However, we do believe that the concept has merit and should be an option for the future. Hence, we urge the Census Bureau to take the steps that are necessary to permit this and related options to be considered for SIPP at some future date. (A related option, which would add longitudinal information although not necessarily increase the sample size for the next SIPP panel, would be to return at annual or longer intervals to selected cases.)

One such step involves informed consent. Respondents need to be informed at the outset that there is a possibility that they may be asked to answer further questions after the closeout of their SIPP panel. If such consent is not sought, then, under current views about the obligations of statistical agencies to their respondents, it would probably not be possible to later impose this additional burden on them.

Another step involves setting in place procedures for tracking respondents after the end of a panel. Particularly if it appears desirable to revisit the subsampled groups at less frequent intervals (e.g., yearly), the Census Bureau will need to have good procedures developed to keep in touch with them so as to minimize sample loss.

Recommendation 4-5: The Census Bureau should take steps to ensure that it will be possible to extend the length of SIPP panels for selected subgroups of interest or to follow them up at a later date, should such options be desired to obtain increased sample size and longitudinal information.

Multiple-Frame Samples

Yet another way to obtain an additional sample for subgroups of interest in SIPP is to develop multiple-frame samples, that is, samples of households together with cases that are drawn from one or more types of administrative records (for example, program records, tax records, or employer records).30 Augmenting a household sample with cases from administrative records can offer considerable benefits. First of all, such a strategy may be a very efficient means of oversampling such subgroups as program recipients. Also, providing that confidentiality and data access issues are resolved, additional records data could be obtained for the administrative cases, not only concurrent with but also preceding and following the time span of the survey interviews. Analysis of the relationships of the records data and survey responses for the administrative cases could serve a number of useful purposes. For example, in a multiple-frame sample of program recipients, the records information could provide the basis for imputing program characteristics to recipients in the household sample for use in improved policy models for program analysis and simulation of program changes.

The drawback to the multiple-frame approach for increasing sample size and information for subgroups of interest is that a number of problems impede its ready implementation. Many of the problems are operational in nature.31

30 A sample of households together with cases drawn from one source of records is termed a dual-frame sample.

Permission for the records must be obtained, which can be time-consuming and difficult to achieve. In the case of programs that are state administered (e.g., AFDC and food stamps), there are differences across states in access rules and in the extent to which the records are appropriately computerized so that access is operationally feasible. At this time, only a small number of states have good computerized records for such programs as AFDC; hence, it would not be possible to develop a national multiple-frame sample for these programs. Also, the Census Bureau itself may not be able to have access to an entire administrative file for purposes of sample selection, in which case it would have to rely on the responsible agency's ability to properly implement a specified sampling procedure. Finally, the addresses in the sampled records may not be current, in which case a tracing operation, with likely problems of its own, would be necessary (see Logan, Kasprzyk, and Cavanaugh, 1988).

A multiple-frame approach poses technical difficulties as well, including the determination of appropriate weights for the combined sample. Taking the simple case of a dual-frame sample, some fraction of the household sample will have a probability of selection in the sample drawn from records, and all members of the records sample will have a probability of selection for the household sample. Consequently, it is necessary to develop weighting adjustments to compensate for dual selection probabilities, and this requires identifying those members of the household sample who are included in the administrative frame. One way to identify these members is to rely on respondents' reports of their status at the time of drawing the administrative sample. (For example, in the case of a dual-frame sample including SSI cases drawn the August before the start of a SIPP panel, the questionnaire would ask about receipt of SSI in the preceding August.) A possibly more reliable approach is to match the household sample members with the administrative frame. However, this procedure adds a step to the data processing that could cause delays in the release of data files and reports. (The 1979 ISDP multiple-frame sample initiative came to grief on this very point: usable, fully weighted data files were not completed before the program ended; see Kasprzyk [1983].)

In addition, there are technical issues to resolve with regard to the type of sample to draw from administrative records (assuming that the sample can be properly selected). In the case of a program such as AFDC, one needs to decide whether a cross-section sample or a sample of new entrants is appropriate. A cross-sectional sample will overrepresent longer term recipients.

31 Record-check studies, including forward record checks and full record checks, face many of the same operational problems; see, for example, the discussion in Marquis and Moore (1990b) of the difficulties in carrying out the SIPP record-check study, which obtained records for eight federal and state programs.

If new entrants are sampled, the decision must be made as to the time period or window during which cases are eligible for selection (a month, year, or other period). Ideally, for program analysis, one would like to sample people who are eligible for programs, not just participants, but there is no record system available to do this.

The usefulness of a multiple-frame or dual-frame sample for SIPP depends very much on user interest in particular population groups, such as program recipients. We suggest that a decision to adopt this means of oversampling, particularly in light of the operational and technical difficulties it would pose, should be contingent on the support and cooperation of an interested agency. We encourage the Census Bureau staff to keep up to date on the methodology of multiple-frame samples, so that SIPP can be responsive to requests from agencies that want to obtain a larger sample size and information for a particular population by adding a component to the SIPP sample that is drawn from their records.32

FOLLOWING RULES

At present, SIPP follows original sample adults, that is, all people aged 15 and older who resided in an interviewed household in wave 1, for the life of a panel or until they leave the universe or drop out of the survey. SIPP also keeps track of original sample adults who enter institutions and interviews them if they return to a household setting.33 Adults who join the household of an original sample adult after wave 1 are followed only so long as they continue to reside with an original sample member. Similarly, children, regardless of whether they were present in wave 1, are followed only so long as they reside with an original sample adult.

We believe that the utility of SIPP for important research and policy concerns would be enhanced by changing the following rules for two groups: all children, and children and adults who enter institutions.

Many users expressed great interest in having more information with which to analyze children's changing circumstances (see Chapter 2). Fewer and fewer children have a stable family or economic situation throughout their childhood, and more and more children are experiencing economic and social distress. Extending the length of SIPP panels will make the survey more useful for analyzing children's circumstances over time.

32 The report of the ISDP special frames study (Logan, Kasprzyk, and Cavanaugh, 1988), which was designed to test the feasibility of sampling and locating respondents from administrative records, provides pertinent information for consideration of multiple-frame samples in SIPP; see also Kasprzyk (1983).

33 Tracking the institutionalized was not originally intended for SIPP but was initiated in May 1985 for the 1984 panel and in October 1985 for the 1985 panel (Jean and McArthur, 1987).

However, the current following rules preclude using the survey to obtain a complete picture of children's family dynamics. For example, in wave 1 of the 1984 panel, 24 percent of children under 15 lived with only one parent or with another relative (McArthur, Short, and Bianchi, 1986:Table 6). Any of these children who subsequently moved in with an adult not part of the original sample (e.g., the other parent or another relative) would not be tracked under the current following rules. Similarly lost to follow-up would be children who went to live with, say, a grandparent following a divorce, or children born to a marriage between a sample and nonsample adult who stayed with the nonsample parent following a marital separation.34 The numbers of such events will increase with longer panels, but the current following rules will preclude analysis of them. We urge the Census Bureau to treat all children present in interviewed households at wave 1, together with children born subsequently to original sample mothers, as original sample members who are followed throughout the life of a panel. When original sample children move into a household of nonsample members, information would be obtained about them and about other members of their new household.

There is also user interest in learning about both children and adults who become institutionalized. SIPP is not the appropriate vehicle to provide information about the institutionalized population as such, but because the survey follows people over time, it naturally provides a sample of entrants to all types of institutions (e.g., mental health treatment facilities, nursing homes, prisons). Extending the current practice of following original sample members who enter institutions to include children and, in addition, obtaining at least some limited information for them would enhance the usefulness of SIPP for analysis of socioeconomic well-being in the United States. For example, for fuller analysis of some government programs, it is important to include institutionalized people who can still receive benefits under such programs as social security and SSI.35 It would also be useful to know about other sources of income for institutionalized people, such as private pensions, asset income, and transfers from relatives.

34 Of children present in all 8 waves of the 1984 SIPP panel who lived with both parents in wave 1, 7 percent had experienced a change in the marital status of their parents by the end of the panel (Bianchi and McArthur, 1991:Table A). This is likely a lower bound estimate of children at risk of a marital separation to the extent that the weights do not adequately adjust for the higher attrition rates of children in nonintact families (McArthur, Short, and Bianchi, 1986:Table 6).

35 Coder (1988:Table 9) found that 4 percent of SSI recipients in the first month of the 1984 SIPP panel had entered an institution by the end of the panel, as was also true for 3 percent of recipients of social security and veterans' payments and 1 percent of private pension recipients.

These data would be useful for analysis in their own right and also in conjunction with the data for the people in the household they left.

We do not offer detailed suggestions about the kinds of data to collect for original sample members who enter institutions during the course of a SIPP panel, nor even about the preferred data collection mode (e.g., some items might best be obtained from the institution and others from the family). However, we do urge the Census Bureau to investigate the needs of users for information about institutionalized persons that fall within SIPP's goal of providing improved data on economic resources and assistance programs, particularly for people and families who may be economically at risk.

Recommendation 4-6: SIPP panels should treat all children who reside in interviewed households at the first wave and also children born during the course of a panel to original sample mothers as original sample members, who are followed if they move into households without an original sample adult. SIPP panels should also continue to follow and collect data for both original sample adults and children if they move into institutions.