4
Survey Design

In this chapter we review and compare the current SIPP design and available alternatives in light of our recommended goals for the survey. In Chapter 5 we do the same for the SIPP data collection and processing system. From both reviews we conclude that changes in the design and operation of SIPP would enhance the utility of the data and increase the cost-effectiveness of the SIPP program.

MAJOR DESIGN ELEMENTS AND ALTERNATIVES

The design of a continuing panel survey such as SIPP includes several components, each of which affects the quality and utility of the data and the costs of data collection, processing, and use. In this section we consider the following major design elements:

· the number of interviews or waves in each panel;
· the length of the reference period covered by each interview;
· the length of each panel (a function of the number of interviews and the reference period length);
· the frequency with which new panels are introduced; and
· the total initial sample size for each panel.

We also consider the advantages and disadvantages of spreading out the workload by interviewing portions of the sample (called rotation groups) each month rather than interviewing the entire sample at the same time for
each wave. In the last two sections we consider aspects of the SIPP sample design, namely, the use of oversampling to increase the sample size for the low-income population and the rules for following people that determine who is included in the sample for each panel over time.

The major design components listed above cannot be assessed in isolation. They interact in a number of ways. Given a fixed budget that puts a ceiling on the number of interviews that can be fielded each year, a change in one of the design elements will generally necessitate an offsetting change elsewhere. For example, an increase in panel length must be offset by one or more of the following changes: a reduction in the frequency with which new panels are introduced, a reduction in the sample size per panel, or an increase in the reference period length for each interview wave.

Current SIPP Design

SIPP is a true panel survey, in that it follows individual people, including those who change their address, in contrast to quasi-panel surveys, such as the Current Population Survey (CPS) and Consumer Expenditure Survey (CEX), which return to the same address and interview the people who currently reside there. To obtain the sample for each SIPP panel, a list of addresses is designated for interviewers to visit in the first wave. Typically, about 75-80 percent of the addresses represent occupied housing units whose occupants are eligible for the survey; the rest are vacant, demolished, or nonresidential units. Of the eligible households, 92-95 percent of the residents usually agree to participate in the survey (Bowie, 1991). The adult members of these households (people aged 15 and over) are deemed original sample members.
Each of them is followed until the end of the panel or until the person leaves the universe (e.g., by dying, entering an institution, or moving abroad) or the sample (e.g., by refusing to continue to be interviewed, moving to an unknown address, or moving outside the area covered by the SIPP interviewing staffs1). Children of original sample members are followed as long as they reside with an original sample adult, and adults and children who join the household of an original sample adult are included in the panel as long as they remain in that household.

The basic SIPP design calls for members of each panel to be interviewed at 4-month intervals over a period of 32 months for a total of eight

1People who move to an address more than 100 miles from a SIPP primary sampling unit (PSU) area are not followed, although interviewers are instructed to conduct telephone interviews with them if possible. Almost 97 percent of the U.S. population lived within 100 miles of the sample PSUs for the 1984 panel (Jabine, King, and Petroni, 1990:16). Attempts are made to keep track of people who enter institutions so that if they leave the institution at a later point during the life of the panel, they can be brought back into the panel.
interview waves. (One-half of the 1984 SIPP panel was interviewed nine instead of eight times.) A new panel is introduced each year. To even out the interviewing workload, the sample for each panel is divided into four rotation groups, one of which is interviewed every month. Interviewing for the first 1984 SIPP panel began for the first rotation group in October 1983; interviewing for all subsequent panels has begun in February (see Committee on National Statistics [1989:Table 2-1] for an illustration of the rotation group design). Each interview includes a set of core questions about income, program participation, and employment. In most cases, information is requested on these subjects for each of the 4 preceding months. Each interview also includes one or more modules on specific topics that are administered only once or twice in each panel. (See Tables 3-1, 3-2, and 3-13 in Chapter 3 for information on the questionnaire content.)

The sample design for SIPP is a multistage clustered probability sample of the population in the 50 states and the District of Columbia that only excludes inmates of institutions and those members of the armed forces living on post without their families. There is currently no oversampling of specific population groups in SIPP, with one exception: the 1990 panel includes about 3,800 extra households continued from the 1989 panel, selected because they were headed by blacks, Hispanics, or female single parents at the first wave of the 1989 panel.

The initial sample size for the first 1984 SIPP panel was about 21,000 eligible households, with the expectation that, by combining two panels of that size, users would be able to obtain a total sample size of about 37,000-38,000 households.2 However, budget cuts necessitated an 18 percent reduction in the sample size midway through the 1984 panel (beginning with wave 5).
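The staggered rotation-group calendar described above can be sketched in a few lines. The fragment below is an illustrative indexing scheme of our own, not the Census Bureau's actual scheduling system; month 0 stands for a panel's first interview month (October 1983 for the first rotation group of the 1984 panel):

```python
# Four rotation groups, each interviewed at 4-month intervals, staggered
# so that exactly one group is interviewed in any given month.

def interview_month(rotation_group, wave):
    """Month index (0-based, from the panel's start) in which the given
    rotation group (1-4) receives the given wave's interview (1, 2, ...)."""
    return (wave - 1) * 4 + (rotation_group - 1)

def reference_months(rotation_group, wave):
    """The 4 preceding months covered by the core questions at that interview."""
    m = interview_month(rotation_group, wave)
    return list(range(m - 4, m))

# Over eight waves, the 32 interview months are covered exactly once each,
# which is what evens out the interviewing workload.
months = sorted(interview_month(g, w) for g in range(1, 5) for w in range(1, 9))
```

Under this indexing, the "seam" between two consecutive waves for a given rotation group falls between `reference_months(g, w)[-1]` and `reference_months(g, w + 1)[0]`.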
Initial sample sizes for the 1985 through 1989 panels and the 1991 panel were only 12,500 to 14,500 eligible households (and the 1985 panel sample was further reduced beginning with wave 4). The initial sample size for the 1990 panel was about 23,600 eligible households; however, to fund this larger size, the Census Bureau had to terminate the 1988 and 1989 panels at six and three interviews, respectively. Budget cuts also necessitated limiting the 1986 and 1987 panels to seven rather than eight waves.3

The Census Bureau received sufficient funding for fiscal 1992 to enable it to return to the original SIPP design. The 1992 panel began in February

2Attrition reduces the number of actual cases that can be obtained by combining early waves of one panel with later waves of another, although new household formation by original sample members somewhat offsets this effect.

3Also, for other reasons, one rotation group in the 1985 and 1986 panels received one less wave than the other three groups (i.e., seven instead of eight waves in the 1985 panel and six instead of seven waves in the 1986 panel; see CNSTAT [1989:Table 2-1]).
with an estimated initial sample of 21,600 eligible households whose original sample members will be interviewed for eight waves. It is expected that subsequent panels will be funded at about the same level.

User Views

In considering whether to recommend any changes to the SIPP design, we consulted with researchers and policy analysts working in a range of relevant subject areas. We asked them to assess the usefulness of the data produced by the current SIPP design and to suggest design modifications that they thought would improve data quality and utility (see Chapter 2). Virtually without exception, these SIPP data users indicated that the sample size per panel, particularly for panels with sample size reductions due to budget cutbacks, is too small to support analysis of many of the subgroups of most interest, such as participants in assistance programs. Users view the option of combining panels in order to increase sample size as cumbersome; moreover, combining panels is not an option for such uses as longitudinal analysis of a single panel or analysis of a variable topical module that was asked in only one panel.

Users differ in their opinions on other major design elements, depending on their interest in longitudinal or cross-sectional applications of the data. Users who value most the longitudinal information from SIPP support increasing the length of each panel to provide an improved capability to study transitions and spells of program participation and other behaviors. Longer panels would increase the sample size of events of interest, such as marital status or job changes or program exits and entrances, and would provide longer periods of observation before and after these events for analyzing their antecedents and consequences.
Longer panels would also reduce the "right-censoring" problem, that is, the problem that the duration of some spells is not known because they are still in progress when the panel ends. Users most often suggest extending SIPP panels to 5 years, although some users would be satisfied with extending them to 4 years; at least one user has suggested lengthening SIPP panels to 10 years to permit the data to be used to study welfare dependency and persistent poverty (Manski, 1991).

In order to increase sample size and panel length, many users of the longitudinal data say they are willing to live with longer reference periods for each interview, thereby decreasing the number of interviews per year, typically from three 4-month to two 6-month waves. They are also quite willing to reduce the frequency with which new panels are introduced, perhaps introducing a new panel every 2 or 3 years instead of every year.

Users who are more concerned about cross-sectional applications, such as describing the characteristics of program participants at any given time
and estimating the likely effects of a program change using comparative static microsimulation modeling techniques,4 have a different viewpoint. These users are worried about proposals to reduce the frequency with which new panels are introduced because they assume that estimates based on a panel that has been in the field for longer than a year will exhibit higher levels of error than estimates based on a "fresh" panel. They are also loath to increase the reference period of the interviews, assuming that longer recall periods will reduce the quality of the monthly data that are needed for program analysis. (Users who are concerned with fine-grained longitudinal analysis of program dynamics, i.e., analysis of short spells and intrayear changes in participation and related characteristics within the context of a longer panel, also share this view.)

The views of Census Bureau staff have tended in the past to coincide with those of analysts who are most interested in cross-sectional applications of SIPP. The original plans called for the Census Bureau to publish improved annual and subannual income statistics using core SIPP data. From this perspective, yearly refreshment of the sample appeared highly desirable, as did short reference periods. However, for a variety of reasons (see further discussion below), the Bureau has yet to realize this goal. More recently, the Census Bureau staff have tended to emphasize the longitudinal uses of SIPP, arguing for continued use of the March CPS to provide basic annual income and poverty statistics (see Chapter 2).

Staff at the Census Bureau have also argued strongly for design features that they believe promote operational efficiency. Specifically, they have supported using monthly rotation groups in order to spread out the workload for the interviewers. Analysts, in contrast, find that the use of monthly rotation groups complicates data processing (see discussion in later section).
Similarly, Bureau staff made the original decision to have reference periods of 4 months, instead of 6 or 3 months, as a compromise between the need for accurate monthly data and reduced cost of field operations.

Selected Design Alternatives

We could not investigate every design alternative. More important, while we felt it essential to look at designs that could improve the usefulness of

4Microsimulation models of such programs as Aid to Families with Dependent Children (AFDC) and food stamps typically create an average monthly snapshot of the population, simulating program eligibility and participation under current program regulations and then simulating what the differences would be if program provisions were modified (e.g., if benefits were liberalized). Historically, these models have used the March CPS as their microlevel database, employing information from such sources as the Income Survey Development Program (ISDP) and SIPP to allocate the annual CPS employment and income data to months. Several models of the food stamp program have been built directly from SIPP cross-sectional monthly data; see Citro and Hanushek (1991a, 1991b).
the survey for longitudinal applications, we did not want to consider alternatives that undercut the uniqueness of SIPP: namely, that it is the only household survey that provides monthly data for fine-grained analysis of changes in income and program dynamics on a short- to medium-term basis. Hence, we did not give serious attention to extending the panel length beyond 5 or 6 years nor the reference period length beyond 6 months at most.5 Other surveys, such as the Panel Study of Income Dynamics (PSID), will continue to serve users interested in analysis of longer term dynamics. Moreover, because of our conclusion that SIPP, not the March CPS, should serve as the primary source of the nation's income statistics, we did not believe it appropriate to consider alternatives that could seriously affect SIPP's ability to provide reliable cross-sectional estimates. Our concern that any design change not cause major problems for Census Bureau operations also influenced our deliberations.

Below we sketch in the basic features of five alternatives: the current (fully funded) design and four designs intended to provide somewhat longer periods of observation with varying panel and reference period lengths and frequency with which new panels are introduced. For each design, we calculate the sample size per panel under the assumption of a fixed field budget that supports 160,000 interviews per year once a design is fully phased in.6 The total of 160,000 interviews per year is the number entailed by full implementation of the original SIPP design, that is, each year having a new panel that is interviewed three times, a panel in its second year that is interviewed three times, and a panel completing its term that is interviewed two times, with all panels having an original sample size of 20,000 eligible households. Note that none of the other designs has more than two panels in the field at the same time.
5However, in the section on sample design considerations, we discuss extending the length of SIPP panels for a longer period than whatever is the standard length for the full sample, for subgroups of interest, as a means of adding sample size and longitudinal information for the subsampled groups.

6Attrition will reduce the number of required interviews: eligible households that do not respond in the first wave are dropped from the sample; eligible households that subsequently fail to respond are pursued for one more interview before being dropped. Formation of new households by original sample members will somewhat offset the effects of attrition. Also, at the first wave, an additional 4,000-5,000 visits are required to addresses that turn out to be vacant, demolished, or nonresidential (i.e., not eligible). Because of budget cuts, the Census Bureau has actually fielded no more than about 100,000-120,000 interviews in most years. Note that, for simplicity, we assume that interviews carry the same average cost under each design, that is, that the cost of a 6-month recall interview is the same on average as the cost of a 4-month interview. We also do not take into account any extra data collection costs that could result for longer panels from greater dispersion of the sample due to geographic mobility.
Current Design  Start a new panel every year; run each panel for 32 months and interview in 4-month waves, for a total of eight interviews. The sample size per panel is 20,000 originally eligible households.

Alternative Design A  Start a new panel every 2 years; run each panel for 4 years (48 months) and interview in 6-month waves, for a total of eight interviews (two per year). The sample size per panel is 40,000 originally eligible households. (Two interviews times two panels times 40,000 equals 160,000 interviews per year.)

Alternative Design B  Start a new panel every 2 years; run each panel for 4 years and interview in 4-month waves, for a total of 12 interviews (3 per year). The sample size per panel is 26,650 originally eligible households. (Three interviews times two panels times 26,650 equals about 160,000 interviews per year.)

Alternative Design C  Start a new panel every 2-1/2 years; run each panel for 5 years and interview in 6-month waves, for a total of 10 interviews (2 per year). The sample size per panel is 40,000 originally eligible households. (Two interviews times two panels times 40,000 equals 160,000 interviews per year.)

Alternative Design D  Start a new panel every 3 years; run each panel for 6 years and interview in 6-month waves, for a total of 12 interviews (2 per year). The sample size per panel is 40,000 originally eligible households. (Two interviews times two panels times 40,000 equals 160,000 interviews per year.)

We initially considered another very different design that strives to reconcile the widely voiced desire for larger sample size with the view that cross-sectional uses require short reference periods and frequently refreshed samples (Doyle, 1992).
In brief, this scheme would encompass two related kinds of surveys: (1) large, annual cross-section surveys, designed to obtain highly robust information for January of each year, and (2) small 2-year panels, introduced annually in midyear as subsets of the cross-sectional samples and designed to provide monthly information from six 4-month waves for limited analysis of program dynamics.

More precisely, this design would do the following: start a new panel every year; field a large initial cross-section and interview once with a 1-month reference period; then, 6 months later (to allow time to draw the subsample), continue a subsample for 2 years, interviewing in 4-month waves, for a total of six interviews (three per year). The cross-section sample size is 55,000 eligible households and each panel subsample includes 17,500 originally eligible households. (55,000 plus three interviews times two panels times 17,500 equals 160,000 interviews per year.) To make the relatively small panels more useful for certain kinds of analysis, Doyle (1992) proposes to oversample a particular target group in each panel: for
example, oversample low-income people in one panel and higher income people the next.

We early on determined that the costs of the Doyle design were likely to outweigh its possible benefits. As a practical concern, the Census Bureau would have to gear up each year for a very large cross-sectional survey and then scale down its operations to handle the much smaller panels. Moreover, the cross-sectional survey component would provide estimates only for the month of January, while the panel survey component would provide longitudinal data only for 2 years for small samples.7 This design also introduces new panels on an annual basis, a feature that we argue below is a major complication for SIPP data processing and use under the current design.

Our discussion in the next section considers the likely effects that designs A-D would have on the quality and utility of SIPP data in comparison with the current design. Each design makes tradeoffs within a fixed field budget. For example, design A increases the sample size and overall length of each panel in comparison with the current design, but lengthens the reference period and reduces the frequency with which new panels are introduced. Designs C and D have 6-month reference periods like design A, but further lengthen each panel and reduce the frequency with which new panels are introduced. Design B retains the 4-month reference period of the current design, but provides fewer additional sample cases than the other designs. Our challenge was to assess the implications of these design choices for the "bottom line": the ability of SIPP to provide high-quality, relevant data for research and policy analysis related to income and program participation.
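The fixed-budget accounting that underlies these comparisons is simple enough to verify directly. The sketch below is our own tallying, using the panel counts and wave frequencies stated in the design descriptions; design B's total rounds to 160,000:

```python
# Steady-state interviews per year =
#   (panels in the field) x (interviews per panel per year) x (households per panel)
alternatives = {
    "A (4-year panel, 6-month waves)": (2, 2, 40_000),
    "B (4-year panel, 4-month waves)": (2, 3, 26_650),
    "C (5-year panel, 6-month waves)": (2, 2, 40_000),
    "D (6-year panel, 6-month waves)": (2, 2, 40_000),
}
for name, (panels, waves_per_year, households) in alternatives.items():
    total = panels * waves_per_year * households
    print(f"Design {name}: {total:,} interviews/year")

# The fully funded current design has three panels in the field each year:
# a new panel interviewed 3 times, a second-year panel interviewed 3 times,
# and a closing panel interviewed twice, each of 20,000 eligible households.
current = 20_000 * (3 + 3 + 2)
print(f"Current design: {current:,} interviews/year")
```

As footnote 6 notes, this bookkeeping ignores attrition, household formation, first-wave visits to ineligible addresses, and any cost difference between 4-month and 6-month interviews.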
In considering alternative choices of panel length and number of interviews, we focused on the implications for errors in panel survey estimates due to the following factors:

· attrition, or the cumulative loss from the sample over time of people who cannot be located or no longer want to participate, which can bias survey estimates and also reduce the sample size available for analysis;

· time-in-sample effects, or changes in respondents' behavior or reporting of their behavior due to their continued participation in the survey; and

· censoring of spells of program participation, poverty, and other behaviors, that is, the failure to observe the beginning and ending dates of all spells within the time span covered by the panel. (We also considered the implications of panel length for analysis of transitions and spells more generally.)

7The proposed solution to the problem of small sample sizes, namely, to oversample different groups each year, would complicate the design and use of the survey.
In considering the choice of length of reference period, we focused on two kinds of errors:

· respondents' faulty recall, which is usually assumed to get worse as the period about which the respondent is queried is farther away;

· a related phenomenon known as the "seam" problem, whereby more changes (e.g., transitions in program participation or employment or changes in benefit amounts) are reported between months that span two interviews (e.g., the last month covered by wave 1 and the first month covered by wave 2) than are reported between months that lie entirely within the reference period of one interview.

In considering the choice of how often to introduce new panels, we looked at the possible reductions in error for cross-sectional estimates, reductions both in sampling error and in bias from attrition and time-in-sample effects, afforded by the opportunity to use newer panels. We also looked at the negative effects of more frequent panels, one of which is a reduction in sample size available for longitudinal analysis of single panels. Negative effects can also stem from what we term the "complexity factor": specifically, having multiple panels in progress at the same time can increase the burden on interviewers and data processing operations, which, in turn, can introduce errors and reduce timeliness of data products. A complex design can also affect the costs to users of accessing and analyzing the data. Finally, given the importance of sample size to users, we considered the implications of alternative sample sizes for cross-sectional and longitudinal uses of the data.

We attempted, whenever possible, to quantify the relationships of the various design dimensions to the various sources of error.8 Such quantification is highly desirable for making informed choices among design alternatives.
For example, in considering the optimum panel length and number of interviews, it is not enough to note that attrition bias and time-in-sample effects are assumed to worsen as a function of the number of interviews and also, perhaps, of the overall length of the panel, and that censoring is reduced with an increase in panel length. One needs to know the relative size of these effects and their implications for important uses of the data. Unfortunately, the literature does not always provide clear guidance, and, ultimately, we have relied on our professional judgments in recommending design changes to SIPP.

8Other sources of nonsampling error appear related primarily to questionnaire design and data collection procedures and hence are not discussed here. They include undercoverage of population groups in the survey (see Chapter 7), nonresponse to specific questionnaire items that is not a function of length of recall, and reporting errors that are not a function of length of recall. Jabine, King, and Petroni (1990) provide an excellent review of the literature on sampling and nonsampling errors in SIPP. Other useful sources are Kalton, Kasprzyk, and McMillen (1989); Lepkowski, Kalton, and Kasprzyk (1990); and Marquis and Moore (1989, 1990a).

TABLE 4-1 Cumulative Household Noninterview and Sample Loss Rates, 1984-1988 and 1990 SIPP Panels (in percent)

            1984 Panel              1985 Panel              1986 Panel
Wave   Type A  Type D   Loss   Type A  Type D   Loss   Type A  Type D   Loss
1        4.9     -       4.9     6.7     -       6.7     7.3     -       7.3
2        8.3    1.0      9.4     8.5    2.1     10.8    10.8    1.5     13.4
3       10.2    1.9     12.3    10.2    2.7     13.2    12.6    2.3     15.2
4       12.1    2.9     15.4    12.4    3.4     16.3    13.8    3.0     17.1
5       13.4    3.5     17.4    14.0    4.1     18.8    15.2    3.7     19.3
6       14.9    4.1     19.4    14.2    4.8     19.7    15.2    4.3     20.0
7       15.6    4.9     21.0    14.4    5.2     20.5    15.3    4.8     20.7
8       15.8    5.7     22.0    14.4    5.5     20.8
9       15.8    5.7     22.3

NOTES: Differences in rates for the 1984 panel in comparison with subsequent panels may be due in part to differences in the sample design. Rates are not shown for the 1989 panel because it lasted only 3 waves. Type A noninterviews consist of households occupied by persons eligible for interview and for whom a questionnaire would have been filled if an interview had been obtained. Reasons for Type A noninterview include: no one at home in spite of repeated visits, temporarily absent during the entire interview period, refusal, and unable to locate a sample unit. Type D noninterviews consist of households of original sample persons who are living at an unknown new address or at an address located more than 100 miles from a SIPP PSU and for whom a telephone interview is not conducted.

Attrition

All household surveys are subject to unit nonresponse, that is, the failure to locate or obtain the cooperation of some fraction of the eligible households (or of individual members of otherwise cooperating households).
Panel surveys are also subject to wave nonresponse, or attrition, at each successive interview.9

9More precisely, total sample loss at each interview, or total wave nonresponse, includes attrition per se, that is, nonresponse by households that are never brought back into the survey, plus nonresponse of households that miss a wave but are successfully interviewed at the next wave. (In SIPP, households that miss two interviews in a row are dropped from the survey.) In addition, in every SIPP interview, there are "Type Z" nonrespondents, that is, individual members of otherwise cooperating households for whom no information is obtained, either in person or by proxy.
TABLE 4-1 Continued

            1987 Panel              1988 Panel              1990 Panela
Wave   Type A  Type D   Loss   Type A  Type D   Loss   Type A  Type D   Loss
1        6.7     -       6.7     7.5     -       7.5     7.1     -       7.1
2       11.1    1.5     12.6    11.4    1.5     13.1    10.9    1.5     12.6
3       11.5    2.6     14.2    12.0    2.3     14.7    11.5    2.5     14.4
4       12.3    3.3     15.9    13.0    3.0     16.5    12.6    3.3     16.5
5       13.7    4.1     18.1    13.9    3.3     17.8    13.7    4.5     18.9
6       13.6    4.9     18.9    13.6    4.0     18.3    14.1    5.2     20.1
7       13.6    4.9     19.0                            14.3    5.8     21.0
8                                                       N.A.    N.A.    N.A.

NOTES: The sample loss rate consists of cumulative noninterview rates adjusted for unobserved growth in the noninterview units (created by splits).

aRates for 1990 are for the nationally representative portion of the sample; they exclude the households that were continued from the 1989 panel.

N.A., Not available.

SOURCE: Data from Jabine, King, and Petroni (1990:Table 5.1) and unpublished tables from the Census Bureau.

Attrition reduces the number of cases available for analysis, including the number available for longitudinal analysis over all or part of the time span of a panel and the number available for cross-sectional analysis from later interview waves, and thereby increases the sampling error or variance of the estimates. More important, people who drop out may differ from those who remain in the survey. To the extent that adjustments to the weights for survey respondents do not compensate for these differences, estimates from the survey may be biased.

Evidence on Attrition

To date, the wave nonresponse rates from SIPP show a definite pattern (see Table 4-1). Total sample loss in the 1984-1988 and 1990 panels is highest at the first and second interviews: 5-8 percent of eligible households at wave 1 and an additional 4-6 percent of eligible households at wave 2.
Thereafter, the additional loss is only 2-3 percent in each of waves 3-5 and less than 1 percent in each subsequent wave.10 By

10Indeed, looking closely at later panels in comparison with earlier ones, the numbers suggest that SIPP interviewers are experiencing somewhat less success in obtaining responses from households in waves 1 and 2 of later panels but better success in retaining cooperative households for subsequent waves of later panels.
wave 6 (after 2 years of interviewing), cumulative sample loss from SIPP is 18-20 percent of eligible households; by wave 8, it is 21-22 percent.

At later waves, increased attrition is almost entirely a function of "Type D" loss, that is, the loss of sample members who move and cannot be located or who move more than 100 miles from a SIPP primary sampling unit and cannot be interviewed by telephone. The increase in "Type A" loss at later waves is virtually nil: Type A cases are those households for which no interview is obtained because there was no one at home, the occupants were temporarily absent, or the occupants refused to give out information. Refusals account for most Type A attrition: they accounted for 70-76 percent of wave 1 nonresponse in the first four SIPP panels (Bowie, 1988:8).

The attrition patterns in SIPP are similar to those of other panel surveys. The 1979 ISDP research panel experienced a total sample loss of 18.1 percent after 6 waves (with 3-month reference periods), compared with sample loss rates for the SIPP 1984-1988 panels of 18-20 percent after 6 waves (with 4-month reference periods).11 As in SIPP, the largest attrition rates occurred in ISDP in the early waves; for example, the wave 1 sample loss in the ISDP was 8.5 percent (Nelson, Bowie, and Walker, 1987:7-8). The PSID, which has conducted annual interviews of sample members since 1968, experienced a large sample loss, 24 percent, at the first interview,12 but additional sample loss dropped to 14 percent of the eligible members at the second interview and was only 2-3 percent at each interview thereafter (Survey Research Center, 1986).

The evidence suggests that attrition is primarily a function of the number of interview waves rather than the total length of the panel.
Hence, design A, with eight interviews over a 48-month (4-year) span, might have a level of attrition comparable to that of the current design, with eight interviews over a 32-month span. Design B, with 12 interviews over a 4-year span, and design D, with 12 interviews over a 6-year span, would likely have more attrition than the other designs. However, the differences would not be great, given the finding that attrition drops off dramatically after the first three or four waves. Our best estimate is that designs B and D would experience no more than 25 percent attrition by the end of a panel, compared with 21-22 percent for the current design. Tempering our confidence in this estimate is the lack

11The sample loss in the ISDP might well have been higher except that a special effort was made in the last interview (wave 6) to convert Type A nonrespondents from previous waves.

12This loss was partly due to the PSID sample design, which included a national probability sample of about 3,000 families and a sample of about 2,000 low-income families drawn from the sample used for the 1967 Survey of Economic Opportunity (SEO). Several factors increased the nonresponse for the SEO sample, including the requirement by the Census Bureau that SEO families sign a release allowing their names to be passed on to the PSID (Hill, 1992).
of direct evidence about attrition for panels of moderate length with frequent interviews. The PSID has experienced very little attrition after the initial waves despite its very long length, but respondents know that they will be interviewed only once a year, and the PSID interviewing staff have a long period each year in which to track down respondents. Conversely, under the current SIPP design, respondents are interviewed frequently but know the survey will be completed in a couple of years. It is possible that design B, which lengthens the panel and retains frequent interviewing, could result in higher initial refusal rates at wave 1 and hence a higher overall attrition rate than the other designs.

Evidence on Attrition Bias

Attrition does not necessarily introduce bias into survey estimates. Several studies of the PSID in the 1980s found that, although cumulative sample loss by that time was over 55 percent, there was no evidence that attrition correlated with individual characteristics in a way that would produce biased estimates.13 For example, Becketti et al. (1988:490) found no evidence that attrition "has any effect on estimates of the parameters of the earnings equations that we studied" (see also Curtin, Juster, and Morgan, 1989, and other studies cited in Hill, 1992).

The evidence from SIPP is less encouraging. Studies of nonresponse from the 1984 panel show that household noninterview rates after the first wave tended to be higher for renters, for households located in large metropolitan areas, and for households headed by young adults.
Individuals who did not complete all of the interview waves, compared with those who did, tended to include more residents of large metropolitan areas, renters, members of racial minorities, children and other relatives of the reference person, people aged 15-24, movers, never-married people, and people with no savings accounts or other assets (Jabine, King, and Petroni, 1990:35-37, Table 5.4).

Furthermore, there is evidence that the current noninterview weighting adjustments for SIPP do not fully compensate for differential attrition across subgroups. One evaluation of the procedures to adjust for household nonresponse at each wave developed two sets of weights for wave 2 households in the 1984 panel: one set based on all wave 2 households and one set based just on those wave 2 households that provided interviews at wave 6. Comparing wave 2 estimates from these two samples showed that the latter set produced higher estimates of median income and fewer households with low monthly income compared with the former set, evidence that the weights do not adequately adjust for higher attrition rates among low-income households (Petroni and King, 1988). A subsequent study that compared samples from the 1985 panel of all wave 2 households and those that provided interviews at wave 6 obtained similar findings (King et al., 1990).

13Duncan, Juster, and Morgan (1984) report an interesting study that simulated the effects of less intensive efforts to interview respondents in the PSID (e.g., fewer follow-up calls) in the period from 1973 to 1980. The resulting sample included only two-thirds of the actual PSID respondents in 1980, and nonresponse after the first wave for the simulated sample was significantly related to several first-wave characteristics, particularly race, income, and age. However, reweighting the simulated sample for differential nonresponse minimized the differences between the estimates from that sample and the actual sample.

It is important to note that current cross-sectional nonresponse adjustments in SIPP make only minimal use of the information that is available from previous waves for many current nonrespondents. Also, in constructing longitudinal files from SIPP panels, the Census Bureau assigns zero weights to original sample members who missed only one or a few waves in addition to those who missed all or most waves. In Chapter 7, we urge research on ways to improve nonresponse adjustments in SIPP (some of which the Census Bureau is already investigating). Here, we note that the alternative designs under consideration would likely add little to the bias or sample loss due to attrition, given the evidence of markedly reduced attrition rates at later interview waves.

Time-in-Sample Effects

People who participate in panel surveys may, with successive interviews, change their behavior or their reporting of their behavior in ways that bias the survey estimates. They may acquire new knowledge that affects their behavior: for example, they may apply for benefits from public assistance programs as a direct consequence of learning about such programs from the survey. They may also gain experience with the questionnaire that leads them to change their responses: for example, they may learn to give a "no" answer to an early question in order to shorten the interview.
Not all such changes necessarily introduce bias: for example, respondents may gain a better understanding of the meaning of a question over time and hence provide more valid responses at later than at earlier interviews. In practice, it is often difficult to distinguish time-in-sample or panel conditioning effects from other changes across waves, notably attrition. Moreover, the finding that respondents' answers differ across waves due to conditioning does not establish whether the reports for the later or the earlier waves are more accurate; the reports would have to be compared with other independent studies.

Evidence from Other Surveys

The literature on panel conditioning is not extensive (see Kalton, Kasprzyk, and McMillen, 1989; Lepkowski, Kalton, and Kasprzyk, 1990). Most of the studies examine conditioning effects in continuing surveys, such as the CPS, that are designed to produce regular cross-sectional estimates and use a rotation group scheme, whereby each
month's (or quarter's or 6 months') sample includes some respondents who are new to the survey and others who have been interviewed before. Studies of such surveys have regularly documented "rotation group bias," although it is not known to what extent such bias is due to conditioning effects per se. For example, the unemployment rate estimated for households in the incoming CPS rotation group each month is 7 percent higher than the average for all eight rotation groups (Bailar, 1989:Table 6). (The CPS interviews households at sample addresses for 4 months in a row, drops them from the sample for 8 months, and then interviews them again for another 4 months.) Rotation group bias has also been found in the Canadian Labour Force Survey, and studies of the National Crime Survey (NCS) have found that victimization rates decline for rotation groups the longer they have been in the sample (Woltman and Bushery, 1975). (The NCS interviews households at 6-month intervals, keeping each household in the survey for 3-1/2 years.)

A few validation studies have compared panel respondents' reports with outside sources. Traugott and Katosh (1979) found that longer term members of a panel survey of election behavior gave more accurate responses on voting behavior and, moreover, actually voted in larger numbers than did newer members. However, it is not clear whether these results are due to panel conditioning, to attrition, or to both factors. Ferber (1964) found that longer term respondents gave better reports of asset holdings in comparison with newer respondents. This improvement was due in part to attrition of the poorer reporters and in part to an improvement in the accuracy of reporting for the respondents who remained. In contrast, Mooney (1962) found that older persons' reports of illness were higher and, compared with their physicians' reports, more accurate in the first interview than in later interviews.
(The respondents were more likely to overreport illnesses in the first interview but much more likely to underreport illnesses in later interviews, so that estimates of illnesses from later interviews showed substantial downward biases.)

Lepkowski, Kalton, and Kasprzyk (1990:10) conclude from the literature that "where panel bias [conditioning] is observed, there is no consensus about the inevitability of the effect, or its size. In the same panel surveys where panel conditioning has been found for some items, it is small or absent from others."

Evidence from SIPP

Several recent studies have examined conditioning effects in SIPP. None of the available studies completely separates out the effects of attrition, nor do most of them assess the validity of reports from later waves in comparison with earlier waves.

Lepkowski, Kalton, and Kasprzyk (1990) compared responses from wave 4 of the 1984 panel with wave 1 of the 1985 panel for original sample
persons. They found insignificant differences between respondents in the two panels in reports of receipt of social security, AFDC, and food stamp benefits and in reports of receipt of social security income and personal earnings for January 1985. However, they found significant differences in reported levels of AFDC income and in reports of unemployment: in both cases, respondents who had been interviewed four times had lower levels of income and less unemployment than those interviewed once. They speculate that respondents may gain a better understanding of selected questions over time and hence improve their reporting.

In a continuation of this work, Pennell and Lepkowski (1992) found only scattered instances of differences between respondents in the 1985-1987 SIPP panels, and the differences were not always in the same direction. For example, respondents in the 1985 panel reported significantly lower receipt of assets for calendar year 1986 than did respondents in the 1986 panel. Conversely, respondents in the 1985 panel reported significantly higher amounts of income from general assistance for calendar year 1986 than did respondents in the 1986 panel. There were also few differences between respondents in the 1985-1987 panels in estimates for specific calendar months.

McNeil (1991) compared estimates across 20 quarters using data from the 1984-1988 SIPP panels. He determined for each variable whether there were very few or very many quarters in which earlier and later waves differed. He found little effect for estimates of median income and program participation, somewhat larger effects for poverty and unemployment rates for women, and a large effect for health insurance coverage (people were less likely to report lacking coverage the longer they were in sample).
McCormick, Butler, and Singh (1992) compared quarterly estimates of earnings, labor force activity, poverty, and program participation for 1985-1987, using data from the 1984-1987 SIPP panels.14 In general, they found little evidence of time-in-sample effects. They did find significant differences occurring across panels when comparing estimates for the first quarter of each year, indicating that there may be systematic differences between wave 1 and subsequent interviews. They suggest as possible reasons for these differences that wave 1 is an unbounded interview (i.e., the respondents have an open-ended time frame from which to recall their answers) and that the respondents are just getting to know the interviewers. However, the differences were not always in the same direction. They also found significant differences between the 1984 and 1985 panels for the quarterly estimates for 1985, but not for the other years and panels they examined, for which they could offer no explanation.

14Their analysis also involved comparing SIPP estimates of participation in selected programs with estimates from administrative records.
We cannot draw firm conclusions about the extent or level of biases due to time-in-sample effects that would be introduced or ameliorated by additional interviews in SIPP. However, the available evidence suggests that the effects are limited and, hence, that designs that specify a longer panel length should not be rejected on grounds of panel conditioning.

Censoring

A problem in longitudinal analysis of the dynamics of program participation, employment, family composition, and other behaviors is that it is rarely, if ever, possible to observe the start and end dates of all spells that are experienced by respondents during the time span covered by the panel survey. Some spells will have started before the survey began and other spells will not end until after the survey is completed.

There are ways to address the biases that stem from having incomplete information on spell lengths (see Chapter 6). However, it is clearly advantageous for analysis of spells and transitions to have a longer period of observation. The question is the optimum panel length for a survey, which, in turn, depends on the survey's goals. The PSID and National Longitudinal Surveys of Labor Market Experience (NLS) are designed to answer questions about the long-term social and economic outcomes for samples of families and cohorts of individuals as they move through major life stages. SIPP has a shorter focus, which includes providing subannual snapshots of income, employment, and program participation, as well as information on the dynamics of income and program participation over the short and medium term. To be useful for analyses of program dynamics, even in the fairly short term, SIPP needs to follow sample members for longer than a year or two and, we believe, for somewhat longer than the 32 months of the current design.
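The distinction between complete and censored spells can be made concrete with a small sketch; the classification function and the example spells below are ours, for illustration only:

```python
# Classify spells against a panel observation window, in months.
# The function and example spells are invented for illustration.

def classify_spell(start, end, window_start, window_end):
    """Classify a spell relative to an observation window.

    start/end: first and last month of the spell (end=None if ongoing).
    Returns 'left-censored', 'right-censored', both, or 'complete'.
    """
    flags = []
    if start < window_start:
        flags.append("left-censored")   # began before observation started
    if end is None or end > window_end:
        flags.append("right-censored")  # still in progress when the panel ended
    return " and ".join(flags) if flags else "complete"

window = (1, 32)  # a 32-month panel, as in the current SIPP design

print(classify_spell(5, 14, *window))    # spell fully observed within the panel
print(classify_spell(28, None, *window)) # spell ongoing at the end of the panel
print(classify_spell(-3, 10, *window))   # spell in progress when the panel began
```

Extending the window from 32 to 48 months converts some right-censored spells into complete ones, which is the gain from a longer panel discussed in the text.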
With the current panel length, a significant proportion of poverty and program participation spells are right-censored (i.e., they still exist at the end of the panel). For example, 38 percent of AFDC spells, defined on a monthly basis, that began after the start of the 1984 panel were censored (Flory, Martini, and Robbin, 1988:Table 1), as were 27 percent of spells without health insurance (McBride and Swartz, 1990:Table 1).15 Extending the panel length from 32 to 48 months would not only enlarge the sample size of spells of program participation, poverty, and other states, but decrease somewhat the proportion that are right-censored. For example, we estimate that the number of right-censored AFDC spells would decrease by 10-15 percent.16 Also, a somewhat longer length would increase the number of times when multiple spells (or recidivism) are observed and provide longer periods of observation before and after events to determine their short-term antecedents and consequences.

15Note that these figures include spells that were right-censored because of sample reduction in the 1984 panel, as well as those right-censored because the panel ended. The duration of another 15 percent of AFDC spells and 12 percent of spells without health insurance that began in the 1984 SIPP panel was not observed because the respondents dropped out of the survey.

Recall Error and Seam Effects

One of the concerns of the ISDP that laid the groundwork for SIPP was the appropriate recall or reference period for each interview. The goal for SIPP was to obtain improved estimates of annual income, compared with the March CPS, and to obtain estimates of subannual income and program participation that could be related to administrative data. To serve both purposes, the decision was made to ascertain monthly information. It seemed obvious that interviews must be conducted more frequently than once a year in order to minimize recall errors on the part of respondents. Indeed, there is ample evidence that SIPP obtains more complete reporting than the March CPS annual survey for most sources of income and also for part-year work, unemployment, and health insurance coverage (see Jabine, King, and Petroni, 1990:127-129). The question was how often the interviews could be conducted without breaking the field budget or overburdening the interviewers or the respondents and, conversely, how seldom the interviews could be conducted without seriously affecting the quality of the monthly data.

The ISDP Site Research Test in October 1977-February 1978 included a 2x2 factorial experiment, in which the two treatments were recall length (one 6-month or two 3-month interviews) and length of the questionnaire (short or long form). The sample for this test included 2,400 respondents in five sites (Dallas, Houston, San Antonio, Milwaukee, and Peoria).
Unfortunately, there were several statistical problems with the design of the experiment and the subsequent analyses, such that virtually no conclusions can be drawn from the results about the efficacy of 3-month or 6-month reference periods in SIPP (Biemer, 1991; Singh, 1987).

The 1978 and 1979 ISDP research panels used 3-month interviews and followed respondents for 15 and 18 months, respectively.17 Other short-term panel studies, such as the CEX and the 1977 and 1980 National Medical Care Expenditure Surveys, also have 3-month interviews, while the NCS uses 6-month interviews. Given that SIPP panels were to last longer than most of these other surveys, the decision was made to use 4-month interviews to balance the concern for accuracy of monthly reports with the concern for minimizing survey costs and burden on interviewers and respondents.

16This estimate was constructed using data on AFDC spell duration from Ruggles (1989:Table 1) for the 1984 SIPP panel and from Kalton, Miller, and Lepkowski (1992) for the 1987 SIPP panel.

17The 1979 ISDP panel included an experiment with recall length for a subset of items: one-half of the households were asked about asset income on a 6-month rather than a 3-month basis. The results suggested that accuracy of reporting was reduced in the longer recall period (Ycas and Lininger, 1983:28).

We could find only a few studies from other surveys that bear on the question of whether moving from 4-month to 6-month interviews, as proposed in designs A, C, and D, would impair data quality from SIPP (see Chu et al., 1992, for an overview of the literature on recall effects). Bushery (1981), in research conducted for the NCS, found that reported victimization rates were higher for 3-month than for 6-month reference periods and that the 6-month rates, in turn, were higher than rates for a 12-month reference period. Neter and Waksberg (1964, 1965), in research on the CEX, found that reporting of house repairs and alterations of small dollar value (less than $10) was not affected by lengthening the reference period from 1 to 3 months, but the reporting declined by 20 percent when the reference period was lengthened from 1 to 6 months and by 11 percent when the period was lengthened from 3 to 6 months. They suggested that this effect would be less pronounced for house repairs and alterations with larger dollar values, which respondents could more easily recall. The limited evidence from these studies suggests that lengthening the recall period for SIPP might reduce reports for small sources of income or short spells of program participation, unemployment, etc.

Another cause for concern with lengthening the reference period is the seam problem, which was first documented in the 1979 ISDP (Moore and Kasprzyk, 1984), but has since been found in other panel surveys, including SIPP (see Jabine, King, and Petroni, 1990:58-61; Kalton and Miller, 1991) and the PSID (Hill, 1987).
As noted above, the seam problem refers to higher levels of reported changes (e.g., going off or on a welfare program) between pairs of months that span two interviews (e.g., for SIPP, months 4-5, 8-9, 12-13, etc.) than between pairs of months for which data are collected from the same interview. The seam phenomenon affects most variables for which monthly data are collected in SIPP, often strongly. For example, in the first year of the 1984 SIPP panel, four times as many social security participants reported exiting the program between months that spanned interviews as between months within the reference period of a single interview. Similarly, over twice as many nonparticipants reported entering the social security program between seam months as between nonseam months (Jabine, King, and Petroni, 1990:Table 6.2).

The reasons for the occurrence and extent of the seam phenomenon are not well understood. Research to date has found few links to characteristics of respondents, edits and imputations, proxy versus self-response, or changes
in interviewer assignments (Lepkowski, Kalton, and Kasprzyk, 1990:7-8; Marquis and Moore, 1990a), although Kalton and Miller (1991) found some effects of proxy reporting for social security payments. Analysis of the cognitive processes that SIPP respondents use to answer questions suggests that they often adopt simple rules in place of making the effort to recall the information (Cantor et al., 1991; Marquis, Moore, and Bogen, 1991). Hence, one explanation for the seam problem may be that some respondents who experience a transition in the middle of a reference period simply report their current status for all 4 months of the reference period. Thus, a respondent being interviewed in May who entered the AFDC program in March would report receipt of AFDC for January and February as well as for March and April. Comparison of these reports with reports of no AFDC participation in the previous January's interview would (erroneously) date the transition at the seam between the two interviews.

Whatever the mechanism, the seam problem clearly results in errors in the timing of transitions in SIPP and the duration of spells of participation. It may or may not result in errors in the number of transitions that occur within a given period. For example, in the case of food stamps, total exits and entrances from SIPP are close to the rates derived from food stamp administrative records. In contrast, whether due to the seam effect or other factors, entrance rates from SIPP for SSI are significantly higher than those shown by program records (Jabine, King, and Petroni, 1990:59-60).

The Census Bureau is currently pursuing research and testing of alternative questionnaire designs that could reduce the seam problem (Marquis, Moore, and Bogen, 1991; see discussion in Chapter 7). However, it is not likely, in our view, that this research will produce definitive answers by the time that SIPP is redesigned.
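The constant-status explanation sketched above can be simulated directly. The function below is a hypothetical illustration of that reporting heuristic, not the Census Bureau's model: a respondent reports his or her status as of the end of the reference period for all 4 months, so the recorded transition migrates to the seam between waves.

```python
# Sketch of the constant-status reporting heuristic described in the text:
# a respondent who changes status mid-reference-period reports the status
# held at the end of the wave for all 4 months, so the recorded transition
# lands at the seam between waves rather than in the month it occurred.

def reported_history(true_monthly_status, wave_length=4):
    """Apply the constant-status heuristic to a true monthly history."""
    reported = []
    for w in range(0, len(true_monthly_status), wave_length):
        wave = true_monthly_status[w:w + wave_length]
        # Report the end-of-wave status for every month of the wave.
        reported.extend([wave[-1]] * len(wave))
    return reported

# True history: enters a program in month 7 (mid-wave for months 5-8).
truth = [0, 0, 0, 0, 0, 0, 1, 1]
print(reported_history(truth))  # transition now appears between months 4 and 5
```

The true transition between months 6 and 7 is recorded between months 4 and 5, exactly at the seam, which is the error pattern the text describes.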
We are concerned that extending the length of the SIPP interview reference period from 4 to 6 months as part of the redesign could exacerbate the seam problem. In general, given the need for accurate monthly data to serve the goals of SIPP and the possibility that introducing too many changes to the design could have adverse effects, we are hesitant about extending the reference period in the absence of more definite research knowledge about the likely consequences. Yet longer reference periods would have the undoubted advantage of permitting an increase in sample size and perhaps an increase in length of panels. Clearly, research on recall effects should be a priority area for the Census Bureau.

Overlapping Panels

One of the major features of the current SIPP design, the yearly overlapping of panels, was adopted with the goal of maximizing sample size and minimizing the effects of attrition and time-in-sample biases for cross-sectional estimates. Introducing new SIPP panels annually afforded users the
opportunity to combine panels for months, quarters, or years. This strategy was expected to increase sample size and reduce bias by mixing cases that had been in the field for more than a year with fresh cases. Unfortunately, actual experience with the option of combining panels has not been encouraging. Indeed, what happened is an instance in which a design feature that was expected to have positive effects on data quality so complicated the survey operations as to have the opposite result.

At the start of SIPP, having to cope with a new panel every year (frequently with changed content) put great strains on the data processing system at the Census Bureau, which led to serious delays (over 3 years in some cases) in releasing data products (Committee on National Statistics, 1989:Table 2-4). These delays discouraged users from combining panels and hence left them with smaller sample sizes, and higher sampling errors, for cross-sectional estimates than originally planned. The difficulty of combining panels, coupled with the forced reductions in sample size for the 1985-1989 panels, have had the result that most users to date have confined their analyses to the larger 1984 panel. These problems in turn have motivated the preference of many users for larger panels introduced on a less frequent basis.

The Census Bureau originally expected to be a heavy user of combined panels for input to annual and subannual cross-sectional estimates of income and program participation. However, after producing six quarterly income reports based solely on the 1984 panel, the Census Bureau did not for several years produce any statistics from the core data. Instead, most reports were based on the topical modules. More recently, the Bureau has issued reports on income and program participation, using complete panel files, that have focused on such longitudinal issues as duration of participation and year-to-year change in economic status.
The Bureau's future plans for a regular report series from the core data include cross-sectional as well as longitudinal statistics (see Chapter 6).

The question for the future is whether it will be possible for overlapping panels to serve the original goal of reducing sampling error and bias in cross-sectional estimates. On a positive note, the Census Bureau has made great strides in the past few years in regularly meeting a schedule of releasing data products within a year after data collection. However, this success has come at the price of greatly reducing SIPP's flexibility. Essentially, except for the variable topical modules, the Census Bureau has permitted very few changes in the questionnaire content. There is also little capacity in the data processing operations to keep up with new technology (see Chapter 5) and, in particular, to get ready for such major proposed changes as the use of computer-assisted interviewing.

We conclude that overlapping panels on an annual basis will inevitably impose substantial costs on interviewers, data processors, and users. Assuming that each panel is longer than 2 years, introducing panels annually implies having at least three distinct panels in operation for most of each year, with one of them in a start-up mode. As a result, interviewers must cope with different questionnaires: for example, under the current design, the wave 1 and 2 questionnaires for the new panel differ from the wave 4 and 5 and wave 7 and 8 questionnaires for the other two panels that are still in operation.18 Data processors must go through the same operations of editing, imputing, reformatting, and weighting separately for each of three panels, which inevitably adds to costs and the opportunities for making mistakes. Users must go through an extra set of steps in order to combine panels for analysis.

Other designs that we considered also overlap panels, but at a less frequent rate: designs A and B introduce new panels every 2 years, design C every 2-1/2 years, and design D every 3 years. Such designs permit larger sample sizes per panel, with the important benefit that many users may never need to combine panels. Such designs also never have more than two panels in the field at the same time and give users and producers a breather between the introduction of new panels. On the down side, such designs mean that, for some years, cross-sectional estimates will involve older panels. For example, under designs A and B, estimates for every other year will include cases from panels in their second and fourth year, but none from a panel in its first year. We believe that improved weighting adjustments can compensate for attrition and time-in-sample effects, so that the benefits of less frequent introduction of new panels will more than outweigh the costs.

Sample Size

A pervasive complaint about SIPP panels is that they provide too few sample cases for observation of subgroups of policy interest, for example, recipients of food stamps and AFDC.
We made rough estimates of the number of cases for these two populations that would be available for analysis by the end of waves 1, 4, 8, and 12 from SIPP panels of different size initial samples (without any oversampling of low-income groups or combining of panels). The results (see Table 4-2) show clearly the inadequacy of the smaller SIPP panels that were fielded in 1985-1989. Only 470 cases of AFDC units would be available even at wave 1 from a panel with an initial sample size of 12,500 households. This number would not be sufficient to

18Wave 1, as the start-up interview, has a different format from that used in subsequent waves; wave 2 includes personal history modules that are not repeated in any other interview. For subsequent interviews, the Census Bureau strives to field the same topical modules, for example, asking about wealth in wave 7 of one panel and wave 4 of the next panel, both of which are fielded at the same time (see Table 3-13 in Chapter 3).
TABLE 4-2 Estimated Minimum Sample Sizes for Subgroups of Policy Interest from SIPP Panels of Different Sizes

                          Initial Sample Size (Households)
Subgroup and Wave      12,500   17,500   20,000   26,700   40,000

Food stamp recipients
  Wave 1                1,160    1,630    1,860    2,480    3,720
  Wave 4                1,060    1,490    1,700    2,270    3,400
  Wave 8                  990    1,380    1,580    2,110    3,160
  Wave 12                 940    1,310    1,500    2,000    3,000

AFDC recipients
  Wave 1                  470      650      740      990    1,490
  Wave 4                  430      600      680      910    1,360
  Wave 8                  400      550      630      840    1,260
  Wave 12                 380      530      600      800    1,200

NOTES: Calculations are for single panels (not for combined panels) and assume that: food stamp recipient units are 10 percent of total households; AFDC recipient units are 4 percent of total households; and attrition is a function of the number of waves, with cumulative attrition of 7 percent of the initial sample size at wave 1, 15 percent at wave 4, 21 percent at wave 8, and 25 percent at wave 12. Results are rounded to the nearest 10 and are labeled as "minimum sample sizes" because no account is taken of the increase in sample cases that is likely to occur due to household formation by original sample members or that could be obtained by combining panels or by oversampling low-income people. However, no account is taken of the decrease in sample cases that could occur because of higher attrition rates for low-income people.

study subgroups of the AFDC population: for example, there would be fewer than 50 cases of AFDC units with earnings, a group of considerable policy interest that comprises less than 10 percent of the caseload (Citro and Hanushek, 1991a:129). More sample size would be available for food stamp recipients, but the numbers are still small for detailed analysis.

The initial sample size of 20,000 households that was the original goal for SIPP yields more cases of food stamp and AFDC recipients, but the numbers are still relatively small.
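The entries in Table 4-2 follow directly from the assumptions stated in its notes. A minimal sketch of the calculation (the rounding convention, half up to the nearest 10, is our inference from the published figures):

```python
# Reproduce the Table 4-2 calculation from its stated assumptions:
# recipiency rates of 10 percent (food stamps) and 4 percent (AFDC) of
# households, cumulative attrition of 7/15/21/25 percent at waves
# 1/4/8/12, and results rounded to the nearest 10 (half up, our inference).

ATTRITION = {1: 0.07, 4: 0.15, 8: 0.21, 12: 0.25}

def minimum_sample(initial_households, recipiency_rate, wave):
    """Estimated recipient cases remaining at a given wave."""
    remaining = initial_households * (1.0 - ATTRITION[wave])
    cases = remaining * recipiency_rate
    return int(cases / 10 + 0.5) * 10  # round half up to the nearest 10

# AFDC recipients (4 percent) from a 12,500-household panel, as in the text:
for wave in (1, 4, 8, 12):
    print(wave, minimum_sample(12_500, 0.04, wave))  # wave 1 gives the 470 cited above
```

The same function reproduces the food stamp rows with a recipiency rate of 0.10, for example 2,480 wave-1 cases from the 26,700-household design B panel.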
The initial sample size of 26,700 households (proposed in design B) is significantly better: even by wave 12, it provides more cases than the 20,000-household sample does at wave 1. There are, of course, ways to improve the efficiency of the SIPP sample by oversampling subgroups of interest, and we consider such designs below. However, we believe that there is an argument for appreciably increasing the SIPP panel sample size, even if oversampling strategies are used.19

19We note in Chapter 5 that changes in the data collection strategy for SIPP, specifically implementation of computer-assisted personal interviewing (CAPI), could result in savings from eliminating the need for large regional office editing operations. These savings might well support an added increase in the sample size of SIPP panels.
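The Table 4-2 arithmetic can be reproduced directly from the assumptions stated in its notes (recipiency rates of 10 and 4 percent, the cumulative attrition schedule, and rounding to the nearest 10). The sketch below is our own illustration of that calculation, not Census Bureau code:

```python
# Sketch reproducing the Table 4-2 "minimum sample size" calculations,
# assuming cases = initial households x recipiency rate x retention,
# rounded (half up) to the nearest 10.
SIZES = [12_500, 17_500, 20_000, 26_700, 40_000]    # initial households
RATES = {"Food stamp recipients": 0.10, "AFDC recipients": 0.04}
RETENTION = {1: 0.93, 4: 0.85, 8: 0.79, 12: 0.75}   # 1 - cumulative attrition

def min_cases(initial: int, rate: float, wave: int) -> int:
    """Minimum expected sample cases remaining at a given wave."""
    return int(initial * rate * RETENTION[wave] / 10 + 0.5) * 10

for group, rate in RATES.items():
    print(group)
    for wave in (1, 4, 8, 12):
        print(f"  Wave {wave:>2}:", [min_cases(n, rate, wave) for n in SIZES])
```

For example, `min_cases(12_500, 0.04, 1)` gives 470, the wave 1 AFDC count cited in the text.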
114 THE SURVEY OF INCOME AND PROGRAM PARTICIPATION

Rotation Groups

For each of the designs considered above, we made the assumption that the Census Bureau would continue to use a monthly rotation group scheme and not alter this aspect of the survey. The practical need to have a fairly even workload for the Census Bureau interviewers is a compelling argument in favor of monthly rotation groups. In addition, a monthly rotation scheme will smooth out the effects of recall errors on cross-sectional estimates for calendar periods and ensure that the reference period is as close as possible to the time of interview for all respondents.20

However, there are drawbacks to the use of monthly rotation groups. For one thing, it complicates analysis of SIPP data. To produce calendar-period estimates for specific months, quarters, or years, users need data for different reference months for each rotation group, and often the appropriate reference months are in different waves.

In addition, a monthly rotation group scheme may increase attrition. If interviewers must try to close out each month's workload by the end of that month, or very soon after, they will have less time to follow up hard-to-track cases than if the interviewing could extend over a longer period.

Most important, when coupled with any design in which the panel length is an even multiple of 12 months, the use of monthly rotation groups leaves a gap in the completeness of the data for the last calendar year of each panel. Thus, under design B of 4-year panels, with interviewing starting in February of the first year, only one group will have complete data for the fourth calendar year, and the other three groups will be missing 1 to 3 months of data; see the column labeled "Begin in February" in Table 4-3. For example, the first rotation group, with its first interview in February of year 1, will have its twelfth and last interview in October of year 4, and hence be missing data for October-December of year 4.
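The year-4 gap can be verified with a short sketch (our own illustration, not Census Bureau code) of the February-start scheme under design B: rotation group g has its first interview in the gth month of year 1, and each interview covers the preceding 4 months.

```python
# Months of the fourth calendar year missing for each rotation group under
# design B (12 four-month waves, first interviews in February-May of year 1).
# Months are counted from January of year 0, so January of year 1 is 12.

def covered_months(group: int, waves: int = 12) -> set:
    """All calendar months covered by one rotation group's interviews."""
    first_interview = 13 + (group - 1)    # group 1 -> month 13 (Feb, year 1)
    covered = set()
    for wave in range(waves):
        interview = first_interview + 4 * wave
        covered.update(range(interview - 4, interview))  # the 4 prior months
    return covered

YEAR4 = set(range(48, 60))                # January-December of year 4
for g in (1, 2, 3, 4):
    missing = sorted(YEAR4 - covered_months(g))
    print(f"rotation group {g}: {len(missing)} missing months of year 4")
```

Running this shows groups 1, 2, and 3 missing 3, 2, and 1 months of year 4, respectively, and group 4 missing none, matching the pattern described above.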
Designs A, C, and D also have this problem for the last year of each panel.21 These missing data will impair the ability to construct reliable estimates for the calendar year involved or for year-to-year comparisons.22

20Given monthly rotation groups and 4-month interview waves, the recall length for calendar-period cross-sectional estimates will vary across the sample: for example, for a calendar-month estimate, the recall period will vary from 1 to 4 months, with an average recall length of 2-1/2 months. In contrast, if all interviews were conducted at the end of each 4-month reference period, different calendar months would have different recall lengths (e.g., 4 months for January, May, and September; 3 months for February, June, and October; etc.). In addition, it is likely that with a bunched-up workload, interviewing would have to extend for longer periods (e.g., interviews for the January-April reference period might have to be conducted in June or even July as well as in May), which would further lengthen the recall period.

21In the case of design C, which introduces panels at intervals of 2-1/2 years, there is the further complication that every other panel begins in the middle of a year.

22The missing data are not necessarily a problem for average monthly estimates (e.g., of poverty or program participation) for a calendar year, which we propose as a basic component of the cross-sectional reports from SIPP (see Chapter 6). Such estimates could pull in data for the missing months from the first interview of the next panel that is starting up.
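The recall arithmetic in note 20 can be made concrete. This is a simple enumeration under the assumptions stated there (4-month reference periods, interviews in the month after the period ends), not survey code:

```python
# Monthly rotation: for any given reference month, the four rotation groups
# are interviewed 1, 2, 3, and 4 months after that month, so every
# calendar-month estimate has the same average recall length.
rotation_recall = [1, 2, 3, 4]
print(sum(rotation_recall) / len(rotation_recall))   # prints 2.5

# Bunched interviewing (entire sample interviewed in May about January-April):
# the recall length depends on which calendar month is being estimated.
bunched_recall = {"January": 4, "February": 3, "March": 2, "April": 1}
print(bunched_recall["January"])                     # prints 4
```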
TABLE 4-3 Reference Periods for Rotation Groups for SIPP Redesign

                    Begin in February       Begin in March
Reference           Rotation Group          Rotation Group
Month and Year       1    2    3    4        1    2    3    4
Year 0
  October            1
  November           1    1                  1
  December           1    1    1             1    1
Year 1
  January            1    1    1    1        1    1    1    1
  February           2    1    1    1        1    1    1    1
  March              2    2    1    1        2    1    1    1
  April              2    2    2    1        2    2    1    1
  May                2    2    2    2        2    2    2    1
  June               3    2    2    2        2    2    2    2
  July               3    3    2    2        3    2    2    2
  August             3    3    3    2        3    3    2    2
  September          3    3    3    3        3    3    3    2
  October            4    3    3    3        3    3    3    3
  November           4    4    3    3        4    3    3    3
  December           4    4    4    3        4    4    3    3
Year 2
  January            4    4    4    4        4    4    4    3
  February           5    4    4    4        4    4    4    4
  March              5    5    4    4        5    4    4    4
  April              5    5    5    4        5    5    4    4
  May                5    5    5    5        5    5    5    4
  June               6    5    5    5        5    5    5    5
  July               6    6    5    5        6    5    5    5
  August             6    6    6    5        6    6    5    5
  September          6    6    6    6        6    6    6    5
  October            7    6    6    6        6    6    6    6
  November           7    7    6    6        7    6    6    6
  December           7    7    7    6        7    7    6    6
Year 3
  January            7    7    7    7        7    7    7    6
  February           8    7    7    7        7    7    7    7
  March              8    8    7    7        8    7    7    7
  April              8    8    8    7        8    8    7    7
  May                8    8    8    8        8    8    8    7
  June               9    8    8    8        8    8    8    8
  July               9    9    8    8        9    8    8    8
  August             9    9    9    8        9    9    8    8
  September          9    9    9    9        9    9    9    8
  October           10    9    9    9        9    9    9    9
  November          10   10    9    9       10    9    9    9
  December          10   10   10    9       10   10    9    9
Year 4
  January           10   10   10   10       10   10   10    9
  February          11   10   10   10       10   10   10   10
  March             11   11   10   10       11   10   10   10
  April             11   11   11   10       11   11   10   10
  May               11   11   11   11       11   11   11   10

continued on next page
TABLE 4-3 Continued

                    Begin in February       Begin in March
Reference           Rotation Group          Rotation Group
Month and Year       1    2    3    4        1    2    3    4
Year 4 continued
  June              12   11   11   11       11   11   11   11
  July              12   12   11   11       12   11   11   11
  August            12   12   12   11       12   12   11   11
  September         12   12   12   12       12   12   12   11
  October                12   12   12       12   12   12   12
  November                    12   12            12   12   12
  December                         12                 12   12
Year 5
  January                                                  12

NOTE: The numbers in the table are the interview wave numbers for each rotation group and show the reference period covered. Thus, the first rotation group under a design of 4-year panels in which interviewing begins in February of the first year will be asked for information about the period from October of the preceding year (year 0) through January of year 1 in its first interview wave. The fourth rotation group under this scheme will be asked for information about the period from January through April of year 1 in its first interview in May. Under a design in which interviewing begins in March of year 1, the fourth rotation group will be asked for 5 months of information, covering the period from January through May of year 1, in its first interview in June, in order to obtain complete calendar information for the first year.

There are several ways of dealing with this problem.23 One alternative is to impute the missing data, but this approach will introduce nonsampling error into the estimates. Another alternative is to base annual estimates for that year on the other panel that is in the field and not combine the two panels. However, this approach means that annual estimates for years in which a panel ends will have increased sampling error because they are based on just one panel. Still another alternative is to conduct an extra interview to pick up the missing data, but this approach will be costly.

23A related problem is that there will be no annual income or tax information collected for the last calendar year of a panel by means of a subsequent topical module.
(In the current design, the seventh and eighth interviews, which result in three panels in the field for most of each year, serve, respectively, to complete the monthly data for the second calendar year of a panel and to obtain the annual income and tax information for that year.) The annual income roundup provides useful information with which to validate the monthly amounts but is not used directly in estimates, so that its omission for the last year is not serious. Also, we urge in Chapter 3 that ways be found to obtain most tax information from administrative records rather than from the survey. Even if this is not possible, it should be possible to readily impute tax information for the last year, given that the respondents will have provided such information for several prior years.
A fourth possibility that we think may have merit is to conduct a truncated extra interview that collects just the core information for the months that are missing. A variant of this approach that could be even more cost-effective would be to tinker slightly with the interviewing scheme. For instance, under design B, the interviewing scheme could be altered as follows (see the column labeled "Begin in March" in Table 4-3 for illustration):

· start interviewing in March instead of February of year 1, which reduces the amount of unneeded data that is collected for the prior year and has the result that two of the four rotation groups will have complete data for all 4 years (the last interviews for the third and fourth groups will occur in January and February, respectively, following year 4);

· for the fourth rotation group, which will not have its first interview until June of year 1, obtain 5 rather than 4 months of data, so that this group has complete data for the first calendar year;24

· use a centralized computer-assisted telephone interviewing (CATI) procedure (or have the regular interviewers phone from home) in January-February following year 4 to collect just the core data that are missing for the first and second rotation groups (i.e., November-December income for the first group and December income for the second).

Assuming that monthly rotation groups continue to be used, the collection of complete data for the last calendar year of each panel is a complicating factor in each of the alternative designs. However, we believe that a cost-effective solution can be found.

Recommendations

After consideration of the pros and cons of alternative designs for SIPP from the standpoint of data quality and utility within the assumed budget constraint of 160,000 interviews per year, we recommend that the Census Bureau discontinue the current SIPP design in favor of implementing design B.
(We discuss issues of making the transition from the current design to our recommended design in Chapter 5.)

Recommendation 4-1: SIPP should be redesigned as an ongoing panel survey in which each panel lasts for 4 years and has 12 4-month interviews, with a new panel introduced every 2 years. The sample size for each panel should be increased over that for the current design.

24The complication arising from having different reference period lengths for different rotation groups should be manageable with a CAPI mode of data collection, as is planned for SIPP. Also, the extension of the first-wave reference period for the fourth rotation group to 5 months may not introduce much additional reporting error, given that the period starts at the beginning of the year, a well-identified reference point for many people.
We believe design B represents the best tradeoff among the design elements of number of interviews, reference period length, overall panel length, frequency of introduction of new panels, and sample size. In brief, this design:

· retains the 4-month reference period length, which may be critical to the SIPP goal of providing high-quality intrayear data on income and program participation;

· extends the panel length to 4 years, which provides additional time for observation of income and program dynamics (having 12 instead of 8 interviews also affords the opportunity for additional topical modules; alternatively, some interviews could forgo topical modules in order to reduce respondent burden);25

· reduces the frequency of introduction of new panels to every 2 years, which relieves the pressures on the Census Bureau of dealing with three panels each year, one of them new, without, we believe, adding materially to the biases in cross-sectional estimates; and

· increases the sample size for cross-sectional and longitudinal analysis: the initial sample of 26,700 households per panel is double the size of the 1985-1989 SIPP panels and one-third more than the 20,000-household size originally planned for SIPP.

We heard strong arguments for extending the panel length and increasing the sample size even more than in design B. Certainly, increased panel length would make SIPP even more useful for policy-relevant analyses of income and program dynamics. However, designs C and D achieve an expansion in panel length (and sample size) partly by extending the reference period from 4 to 6 months, a change that we are not willing to endorse without further research. Also, these designs reduce the frequency of introduction of new panels more than may be desirable, given the need to maintain the quality of cross-sectional estimates. Again, we argue that design B,
on the evidence available to date, represents the best tradeoff among competing design elements.

However, because extending the reference period length would provide the opportunity for longer and also larger panels, we believe that research on recall effects should be a high priority for the Census Bureau.26 We urge the Bureau to conduct research on recall period length so that information becomes available on a timely basis to consider further design changes to SIPP even before the next major scheduled 10-year redesign.

25Chapter 3 provides suggestions of how the current topical modules might fit into the new design.

26Particularly with longer panels, efforts to improve weight adjustments for longitudinal and cross-sectional estimates should also be a priority; see Chapter 7.
Recommendation 4-2: The Census Bureau should conduct research on the data quality effects of 6-month versus 4-month reference periods in SIPP so that information is available to consider other possible design changes at a later date, including the possibility of further extending the length of SIPP panels beyond 4 years.

We believe that the desirability of evening out the workload of the Census Bureau field staff and of keeping each interview as close as possible to the reference period argues for retaining a monthly rotation group scheme. However, the Census Bureau will need to consider the pros and cons of alternative ways to obtain complete data for the fourth calendar year of each panel for the affected rotation groups. It will be important to determine a cost-effective solution in order to enable SIPP to provide a reliable time series of cross-sectional estimates of income and program participation. We note that the Census Bureau will have 4 years to work out a solution after the new design is implemented and thereafter will only have to address the problem every 2 years.

In addition, we are concerned that a rigid procedure for closing off efforts to follow up hard-to-track cases at the end of each month may contribute to attrition. In general, we believe that the Census Bureau should investigate ways to reduce attrition on the part of mover households, which might include allowing interviewers additional days beyond the end of each month for follow-up, assigning all hard-to-track cases to a small group of specially trained field staff, or other means.

Recommendation 4-3: The use of a monthly rotation group structure should be retained for SIPP. The Census Bureau should consider cost-effective means to obtain the core data for the last calendar year of each panel that will otherwise be missing for some months for some groups.
The Census Bureau should also investigate ways to minimize the loss of mover households that may result in part from the closeout of follow-up at the end of each month.

OVERSAMPLING IN SIPP

As noted above, the sample for each SIPP panel is designed to cover the population in the 50 states and the District of Columbia, excluding only inmates of institutions and those members of the armed forces living on post without their families. The design is a multistage, clustered probability sample that, with the exception of the 1990 panel, does not oversample specific population groups.
The first stage in the sampling process for SIPP (as for the March CPS and other household surveys conducted by the Census Bureau) is to use decennial census data to divide the entire United States into primary sampling units (PSUs) of larger counties and independent cities and groups of smaller counties. The larger PSUs are then selected with certainty for the sample; smaller PSUs are grouped into strata and subsampled (174 PSUs were selected for the 1984 SIPP panel and 230 for subsequent panels). The final stages in the sampling process are to obtain addresses in each sampled PSU and select clusters of two to four households for interviewing. The addresses represent a combination of decennial census addresses and addresses that are obtained through field canvasses. The latter include addresses in areas of new housing construction and in areas for which the census address list was incomplete. The 1970 and 1980 censuses formed the basis of the sample design and selection of census addresses for the 1984 and 1985-1994 SIPP panels, respectively (see Jabine, King, and Petroni [1990:Ch. 3] for additional information).

The Census Bureau is currently developing a new sample design for SIPP, based on the 1990 census, that will be implemented beginning with the 1995 panel. The necessary research has been completed to identify and select the PSUs, and work is proceeding on other aspects of implementation. A new feature of the design will be a provision to oversample low-income households (see Singh, 1991). This change is at the behest of SIPP users. In 1988-1989, the Census Bureau held several meetings with data users who were concerned about the effects of sample size reductions in SIPP due to budget cuts. Users expressed an interest in a larger sample size for a number of subgroups, including (in priority order) low-income people, the elderly, blacks, Hispanics, and the disabled.
Several options for oversampling were discussed. Given budget constraints, it became apparent that it would be extremely difficult to implement an oversampling scheme in SIPP prior to the 1995 redesign. To help users in the meantime, the Census Bureau decided to curtail the 1988 and 1989 panels (to six and three waves, respectively), in order to have funds to field a larger sample for the 1990 panel, including a supplemental sample that was continued from the 1989 panel (see the section above on the current SIPP design).

We generally support the goal of oversampling low-income groups in SIPP, which accords with the survey's focus on people who are economically at risk. However, we believe that the Census Bureau's scheme for the 1995 redesign (see below) is not likely to be as effective as it is projected to be in achieving this goal. We present several alternative means of oversampling that we believe the Census Bureau should explore.
Using the 1990 Census for Oversampling in SIPP

In planning for the 1995 sample redesign, Census Bureau staff conducted research on methods for obtaining a larger sample for the low-income population, defined as households with annual income below 150 percent of the poverty threshold. The research also investigated ways to minimize the increase in the variance of estimates for people aged 55 and older that would be expected to result from oversampling the poor and near-poor (given the lower poverty rate for older than for younger people).

The Census Bureau decided to adopt a methodology from Waksberg (1973), which creates two strata within each PSU. The first stratum has a high concentration of the group of interest and is oversampled relative to the second stratum, which has a low concentration of the group of interest. For the SIPP redesign, the 1990 census address list within each PSU will be divided into strata of low-income and higher income households. For households in the 1990 census list that answered the long-form questionnaire (about one-sixth of the total), the determination of income above or below 150 percent of poverty will be made directly. For households that answered the short form, proxy characteristics will be used to make the classification: specifically, the low-income stratum will include female-headed households with children under 18; low-rent households in central cities of metropolitan statistical areas; black and Hispanic households in central cities; and black and Hispanic households in which the head is under age 18 or over age 64. For those blocks of the PSU for which there is no complete census address list (the area frame portion of the sample), the classification will be made using aggregate census information on the proportion of the population below 150 percent of poverty in each block.
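The classification rules just described can be summarized in code form. The sketch below is a hypothetical rendering for illustration only; the field names are invented, and the actual census processing is more involved.

```python
# Hypothetical sketch of the two-stratum assignment described above.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Household:
    long_form: bool                      # answered the census long form?
    income_to_poverty: Optional[float]   # known only for long-form households
    female_headed_with_children: bool = False
    low_rent: bool = False
    central_city: bool = False           # in the central city of an MSA
    black_or_hispanic: bool = False
    head_age: int = 40

def in_low_income_stratum(h: Household) -> bool:
    if h.long_form:                      # direct determination from income
        return h.income_to_poverty < 1.5
    # Short-form households: the proxy characteristics listed in the text.
    return (h.female_headed_with_children
            or (h.low_rent and h.central_city)
            or (h.black_or_hispanic and h.central_city)
            or (h.black_or_hispanic and (h.head_age < 18 or h.head_age > 64)))
```

Area frame blocks, which lack address-level data, would instead be classified by the block's aggregate share of the population below 150 percent of poverty.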
The low-income and higher income strata in each PSU will be sampled at higher and lower rates, respectively, so that an oversample of households in the low-income stratum is obtained. The extent of oversampling will be restricted by the requirement that the sampling error of estimates for persons aged 55 and older not increase by more than 5 percent.

When the new sample design is introduced in 1995, it is expected that the census address portion of the sample will constitute about 70 percent of the total and the area frame portion about 20 percent. The remaining 10 percent will represent addresses of new construction, for which no oversampling will be performed; obviously, over the course of a 10-year period, this category will grow as a proportion of the total. Moreover, one can confidently expect that the efficiency of the design will decline from what would have obtained in 1990 because of the mobility of the population: for example, by 1995, 1998, or 2003, a low-income household may occupy a sample address that was drawn from the higher income stratum and vice
versa. Also, even when the same household is present in 1995 or 1998 as in 1990, the household may have changed classification from low to higher income or vice versa. The question is how great a deterioration in the efficiency of the design will occur over time.

The Census Bureau conducted research with data from the 1980 census for 27 PSUs to determine the extent of the gains that could be expected from oversampling low-income households in the 1990 census address portion of the sample, assuming that the design was implemented immediately after the census. The results (Singh, 1991:Table 1) showed gains (i.e., decreases in sampling error) for many subgroups of interest to users, such as poor blacks and Hispanics. The Census Bureau also conducted research with data from the American Housing Survey (AHS) on the effects of time on the efficiency of the design and estimated very little increase in sampling error 5 to 15 years after the census date (Singh, 1991:Table 3). The Bureau estimated somewhat higher but still relatively low increases in sampling error due to uniform sampling in the new construction frame and the assumption that stratification will be less efficient at the block level in the area frame compared with the census address frame (Singh, 1991:Table 4).27

Although this research appears encouraging about the proposed oversampling scheme, we remain skeptical. There were many limitations to the research, such as the use of only 27 PSUs from just a few states in the 1980 census analysis and the inability of the analysts to replicate fully the proposed design with the AHS data (households were classified on the basis of proxy characteristics rather than on the basis of their income-to-poverty ratio).
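The reason stratum "leakage" matters can be illustrated with a stylized calculation. The numbers below are invented for illustration, not the Bureau's figures: the yield from two-stratum oversampling depends heavily on how concentrated the low-income population actually remains in the low-income stratum.

```python
# Expected low-income cases in a fixed total sample under two-stratum
# oversampling. `share` is the fraction of addresses in the low-income
# stratum; `concentration` is the fraction of all low-income households
# that actually live there; `boost` is the relative sampling rate of the
# low-income stratum. All parameter values are illustrative assumptions.

def low_income_yield(n=1000, prevalence=0.14, share=0.3,
                     concentration=0.6, boost=2.0):
    n_low = n * boost * share / (boost * share + (1 - share))
    n_high = n - n_low
    return (n_low * prevalence * concentration / share
            + n_high * prevalence * (1 - concentration) / (1 - share))

print(round(low_income_yield(boost=1.0)))          # no oversampling: 140
print(round(low_income_yield(concentration=1.0)))  # perfect stratification: 215
print(round(low_income_yield()))                   # with leakage: 172
```

With 40 percent of low-income households living outside the low-income stratum, doubling that stratum's sampling rate raises the yield only from 140 to 172 cases per 1,000 households, roughly half the gain that perfect stratification would deliver, which is the concern raised above about mobility and misclassification.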
We believe that further research on the extent to which the household poverty classification assigned to an address in the census predicts the poverty classification of the household at that address 5 to 15 years later is needed to support the Census Bureau's proposed oversampling scheme. For example, it could be useful to conduct research on the extent to which the household poverty classification of addresses in the 1985 SIPP panel corresponds to the 1980 census classification.28 There is also no opportunity to change any aspect of the design because the Census Bureau plans to draw 10 years' worth of sample for SIPP (and other household surveys) at the same time. Hence, the samples for all of

27Chu et al. (1989:2.9-2.11) found that oversampling geographic areas with relatively high percentages of low-income households was not very successful in reducing the sampling errors for estimates of the poverty population in the National Health and Nutrition Examination Surveys. They attributed this outcome to the fact that many poor people live in nonpoverty areas and vice versa.

28We understand that a match of 1985 SIPP and 1980 census address lists is not likely to be operationally feasible, and we strongly urge the Bureau to take steps to ensure that it will be possible to perform a match of 1995 SIPP and 1990 census lists.
the panels from 1995 to 2005 will be drawn in the same way, using the same characteristics to determine the two strata within each PSU. The only exception is that provision has been made to jettison the oversampling and implement a uniform sampling rate for any SIPP panel in the 1995-2005 period if that is later viewed as desirable.

Even assuming the benefits of the proposed scheme, we believe that there are some technical ways in which it could be improved. For example, if the object is to oversample low-income households in SIPP, then the census address portion of the sample could be drawn exclusively from the long-form respondents to the 1990 census, who represent a very large fraction (1 in 6) of the total population. Selection of PSUs on the basis of poverty-related characteristics could also be beneficial. We are pleased that the Census Bureau decided to adopt the same oversampling rate across all PSUs, instead of determining PSU-specific rates as in the original plan that we reviewed. The latter procedure would have allowed the Census Bureau to better control the size of the workload across PSUs, but it would have resulted in variations in the weights for addresses sampled within each stratum, low-income or higher income, across PSUs. (Such weight variations are likely to reduce the sampling error gains.) Also, those PSUs with the highest percentages of low-income households would have had proportionately less oversampling of the low-income stratum compared with wealthier PSUs.

More broadly, we urge the Census Bureau (and the user community) to be clear about the target population in considering the use of oversampling in SIPP. For the redesign, the Census Bureau is essentially defining a cohort of low-income people on the basis of their previous year's household income-to-poverty ratio.
However, many people with low incomes at wave 1 will move into a higher income category over the life of a SIPP panel and vice versa (see Short and Littman, 1990; Short and Shea, 1991). Instead of a larger sample for a low-income cohort at the start of a panel, it may be that users would prefer to have a larger sample for people who are at risk of experiencing a spell of low income at any time during a panel or at risk of experiencing a long spell of low income. Different oversampling criteria would be required, depending on the definition of the target population: for example, a combination of variables, such as family type, ethnicity, and previous year's low-income status, may be a better predictor of long-term economic disadvantage than the latter variable alone.

Screening as Another Method of Oversampling

An alternative method for oversampling the low-income population in SIPP is to use a screening interview close to the time when a new panel is to be introduced. This approach could be used to refine the proposed 1990 census-based approach (if larger-than-needed samples were drawn from the census list) or serve as a substitute for it. The advantage of screening is that it provides information on which to draw a sample that is close in time to the introduction of the survey and thereby is likely to permit more effective oversampling, since much less mobility or change in classification will have occurred in the interim. Also, screening offers flexibility: the criteria for sampling can be changed as needed (e.g., some panels could oversample minorities instead of low-income households). In addition, screening can be applied uniformly to the entire sample, instead of using different procedures for the census long-form respondent, census short-form respondent, area frame address, and new construction address segments of the sample. In the context of oversampling low-income households while not worsening the estimates for older people, screening should make it possible to develop a more efficient approach to this problem (e.g., by also oversampling elderly higher income households).

On the negative side, screening imposes the costs of conducting an interview for a larger number of households than will be selected for the survey, which may necessitate a reduction in the overall sample size. It may also add costs by lessening the ability of the Census Bureau to equalize interviewers' workloads across PSUs (e.g., the screening might result in sample sizes that overtax some interviewers while underemploying others, with little time to make adjustments before the start of the survey). However, these costs must be viewed in the context of the entire survey, which, in the case of SIPP, will amount to 12 interviews under the proposed redesign. There are also ways to reduce costs.
It may be possible to conduct much of the screening using a centralized CATI system that eliminates interviewer travel costs.29 Another way to reduce costs is to treat the screening interview as wave 1 of a SIPP panel instead of as an added interview. In a CAPI environment, the sampling criteria could be built into the interview so that the full wave 1 interview could be administered on the spot to those households selected for the sample.

Another problem with screening, when the purpose is to identify households on the basis of income or poverty, concerns measurement error. Screening interviews are typically short in order to reduce costs. However, studies have shown that respondents tend to underestimate their income in response to brief, general questions (e.g., Chu et al., 1989; Moeller and Mathiowetz, 1990), so that a short screening questionnaire may erroneously classify higher income households as low income, thereby reducing the gains from the oversampling. In addition, some low-income households may be falsely

29Telephone numbers would be obtained from directories that are organized by address. Some personal screening would also be required for addresses with no telephones or with an unlisted number.
classified as higher income based on their responses to the screening questionnaire. Such households will thus not be oversampled. There is also the issue, noted above, of defining the target population (for example, people at risk of a long-term spell of low income during a SIPP panel rather than a low-income cohort) and defining appropriate variables to use in the screening questionnaire. Despite the problems of a screening approach, we believe that the potential benefits in terms of a more efficient sample design and greater flexibility merit a careful examination of its cost-effectiveness for oversampling the low-income population in SIPP.

Recommendation 4-4: The Census Bureau should investigate alternative methods of oversampling the low-income population in SIPP, including the use of screening interviews as a possible complement to or substitute for an approach based on using information from the 1990 census.

Increasing Sample Size by Extending Panel Length

The Census Bureau's census-based plan and the use of screening do not exhaust the possible approaches for oversampling low-income households or other subgroups in SIPP. Another possibility is to extend the length of one or more SIPP panels for subgroups of interest. This strategy both provides additional longitudinal information for the subsampled cases and makes it possible to treat them as an addition to the sample for the next panel (see David, 1985a). This approach was followed in the 1990 panel, for which the sample includes households from the 1989 panel that were headed by blacks, Hispanics, and female single parents as of wave 1 of that panel. Users have often expressed interest in periodically extending the length of SIPP panels for people who may be at economic risk because of experiencing a divorce or job loss or for people who benefit from programs or have certain demographic characteristics (e.g., single parents).
We are not now recommending that such an approach be built into SIPP because we believe that the Census Bureau confronts a very large agenda in implementing the proposed redesign of 4-year panels introduced every 2 years together with computer-assisted interviewing and an improved database management system (see Chapter 5). However, we do believe that the concept has merit and should be an option for the future. Hence, we urge the Census Bureau to take the steps that are necessary to permit this and related options to be considered for SIPP at some future date. (A related option, which would add longitudinal information although not necessarily increase the sample size for the next SIPP panel, would be to return at annual or longer intervals to selected cases.)
One such step involves informed consent. Respondents need to be informed at the outset that there is a possibility that they may be asked to answer further questions after the closeout of their SIPP panel. If such consent is not sought, then, under current views about the obligations of statistical agencies to their respondents, it would probably not be possible to later impose this additional burden on them. Another step involves setting in place procedures for tracking respondents after the end of a panel. Particularly if it appears desirable to revisit the subsampled groups at less frequent intervals (e.g., yearly), the Census Bureau will need to develop good procedures to keep in touch with them so as to minimize sample loss.

Recommendation 4-5: The Census Bureau should take steps to ensure that it will be possible to extend the length of SIPP panels for selected subgroups of interest or to follow them up at a later date, should such options be desired to obtain increased sample size and longitudinal information.

Multiple-Frame Samples

Yet another way to obtain an additional sample for subgroups of interest in SIPP is to develop multiple-frame samples, that is, samples of households together with cases that are drawn from one or more types of administrative records (for example, program records, tax records, or employer records).30 Augmenting a household sample with cases from administrative records can offer considerable benefits. First of all, such a strategy may be a very efficient means of oversampling such subgroups as program recipients. Also, providing that confidentiality and data access issues are resolved, additional records data could be obtained for the administrative cases, not only concurrent with but also preceding and following the time span of the survey interviews.
Analysis of the relationships of the records data and survey responses for the administrative cases could serve a number of useful purposes. For example, in a multiple-frame sample of program recipients, the records information could provide the basis for imputing program characteristics to recipients in the household sample for use in improved policy models for program analysis and simulation of program changes.

30A sample of households together with cases drawn from one source of records is termed a dual-frame sample.

The drawback to the multiple-frame approach for increasing sample size and information for subgroups of interest is that a number of problems impede its ready implementation. Many of the problems are operational in
nature.31 Permission for the records must be obtained, which can be time-consuming and difficult to achieve. In the case of programs that are state administered (e.g., AFDC and food stamps), there are differences across states in access rules and in the extent to which the records are appropriately computerized so that access is operationally feasible. At this time, only a small number of states have good computerized records for such programs as AFDC; hence, it would not be possible to develop a national multiple-frame sample for these programs. Also, the Census Bureau itself may not be able to have access to an entire administrative file for purposes of sample selection, in which case it would have to rely on the responsible agency's ability to properly implement a specified sampling procedure. Finally, the addresses in the sampled records may not be current, in which case a tracing operation, with likely problems of its own, would be necessary (see Logan, Kasprzyk, and Cavanaugh, 1988).

A multiple-frame approach poses technical difficulties as well, including the determination of appropriate weights for the combined sample. Taking the simple case of a dual-frame sample, some fraction of the household sample will have a probability of selection in the sample drawn from records, and all members of the records sample will have a probability of selection for the household sample. Consequently, it is necessary to develop weighting adjustments to compensate for dual selection probabilities, and this requires identifying those members of the household sample who are included in the administrative frame. One way to identify these members is to rely on respondents' reports of their status at the time of drawing the administrative sample. (For example, in the case of a dual-frame sample including SSI cases drawn the August before the start of a SIPP panel, the questionnaire would ask about receipt of SSI in the preceding August.)
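The weighting adjustment described above can be illustrated with a minimal sketch. The function below is a hypothetical, simplified composite adjustment in the spirit of Hartley-style dual-frame estimation, not SIPP's actual procedure; the function and field names are illustrative. It assumes each case carries its base design weight, a frame label, and a flag (from respondent reports or record matching) indicating membership in both frames; overlap cases have their weight split between the two frames by a mixing parameter so that the combined file does not count them twice.

```python
# Hypothetical sketch of a composite weighting adjustment for a dual-frame
# sample (household frame "A" plus administrative-records frame "B").
# Field names and the function are illustrative, not SIPP's actual method.
def composite_weights(cases, theta=0.5):
    """Return adjusted weights for a combined dual-frame file.

    cases  -- list of dicts with keys:
              'weight'  base design weight (1 / selection probability in own frame)
              'frame'   'A' for the household sample, 'B' for the records sample
              'in_both' True if the case belongs to both frames
    theta  -- mixing parameter in [0, 1] splitting the overlap domain
              between the two frames
    """
    adjusted = []
    for case in cases:
        w = case['weight']
        if case['in_both']:
            # overlap cases are represented in both samples, so each copy
            # keeps only part of its weight
            w *= theta if case['frame'] == 'A' else (1.0 - theta)
        adjusted.append(w)
    return adjusted
```

With theta = 0.5, a household-sample case that also appears on the records list keeps half its weight and a records-sample case keeps the other half, so estimates summed over the combined file count the overlap domain exactly once.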
A possibly more reliable approach is to match the household sample members with the administrative frame. However, this procedure adds a step to the data processing that could cause delays in the release of data files and reports. (The 1979 ISDP multiple-frame sample initiative came to grief on this very point: usable, fully weighted data files were not completed before the program ended; see Kasprzyk [1983].)

31Record-check studies, including forward record checks and full record checks, face many of the same operational problems; see, for example, the discussion in Marquis and Moore (1990b) of the difficulties in carrying out the SIPP record-check study, which obtained records for eight federal and state programs.

In addition, there are technical issues to resolve with regard to the type of sample to draw from administrative records (assuming that the sample can be properly selected). In the case of a program such as AFDC, one needs to decide whether a cross-section sample or a sample of new entrants is appropriate. A cross-sectional sample will overrepresent longer term
recipients. If new entrants are sampled, the decision must be made as to the time period or window during which cases are eligible for selection (a month, year, or other period). Ideally, for program analysis, one would like to sample people who are eligible for programs, not just participants, but there is no record system available to do this.

The usefulness of a multiple-frame or dual-frame sample for SIPP depends very much on user interest in particular population groups, such as program recipients. We suggest that a decision to adopt this means of oversampling, particularly in light of the operational and technical difficulties it would pose, should be contingent on the support and cooperation of an interested agency. We encourage the Census Bureau staff to keep up to date on the methodology of multiple-frame samples, so that SIPP can be responsive to requests from agencies that want to obtain a larger sample size and information for a particular population by adding a component to the SIPP sample that is drawn from their records.32

FOLLOWING RULES

At present, SIPP follows original sample adults (that is, all people aged 15 and older who resided in an interviewed household in wave 1) for the life of a panel or until they leave the universe or drop out of the survey. SIPP also keeps track of original sample adults who enter institutions and interviews them if they return to a household setting.33 Adults who join the household of an original sample adult after wave 1 are followed only so long as they continue to reside with an original sample member. Similarly, children, regardless of whether they were present in wave 1, are followed only so long as they reside with an original sample adult.
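The current following rules just described amount to a simple decision procedure. The sketch below is a hypothetical encoding of them; the function and flag names are illustrative, not Census Bureau code. Original sample adults are followed unconditionally for the life of the panel, while later-joining adults and all children are followed only while they co-reside with an original sample adult.

```python
# Hypothetical sketch of SIPP's current following rules as described above.
# Person records and flag names are illustrative.
def followed_under_current_rules(person, coresidents):
    """Decide whether a person is interviewed at a given wave.

    person      -- dict with boolean flags 'original_sample_adult'
                   (aged 15+ in an interviewed wave 1 household) and
                   'left_universe' (e.g., died or moved abroad)
    coresidents -- list of person dicts for the person's current household
    """
    if person['left_universe']:
        return False
    if person['original_sample_adult']:
        # followed for the life of the panel, including after a return
        # from an institution to a household setting
        return True
    # later-joining adults and all children are followed only while they
    # reside with an original sample adult
    return any(p['original_sample_adult'] for p in coresidents)
```

For example, a child who moves into the household of a nonsample adult fails the final test and, under current rules, is lost to follow-up.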
We believe that the utility of SIPP for important research and policy concerns would be enhanced by changing the following rules for two groups: all children, and children and adults who enter institutions. Many users expressed great interest in having more information with which to analyze children's changing circumstances (see Chapter 2). Fewer and fewer children have a stable family or economic situation throughout their childhood, and more and more children are experiencing economic and social distress.

32The report of the ISDP special frames study (Logan, Kasprzyk, and Cavanaugh, 1988), which was designed to test the feasibility of sampling and locating respondents from administrative records, provides pertinent information for consideration of multiple-frame samples in SIPP; see also Kasprzyk (1983).

33Tracking the institutionalized was not originally intended for SIPP but was initiated in May 1985 for the 1984 panel and in October 1985 for the 1985 panel (Jean and McArthur, 1987).

Extending the length of SIPP panels will make the survey
more useful for analyzing children's circumstances over time. However, the current following rules preclude using the survey to obtain a complete picture of children's family dynamics. For example, in wave 1 of the 1984 panel, 24 percent of children under 15 lived with only one parent or with another relative (McArthur, Short, and Bianchi, 1986:Table 6). Any of these children who subsequently moved in with an adult not part of the original sample (e.g., the other parent or another relative) would not be tracked under the current following rules. Similarly lost to follow-up would be children who went to live with, say, a grandparent following a divorce, or children born to a marriage between a sample adult and a nonsample adult who stayed with the nonsample parent following a marital separation.34 The numbers of such events will increase with longer panels, but the current following rules will preclude analysis of them. We urge the Census Bureau to treat all children present in interviewed households at wave 1, together with children born subsequently to original sample mothers, as original sample members who are followed throughout the life of a panel. When original sample children move into a household of nonsample members, information would be obtained about them and about other members of their new household.

There is also user interest in learning about both children and adults who become institutionalized. SIPP is not the appropriate vehicle to provide information about the institutionalized population as such, but because the survey follows people over time, it naturally provides a sample of entrants to all types of institutions (e.g., mental health treatment facilities, nursing homes, prisons).
Extending the current practice of following original sample members who enter institutions to include children and, in addition, obtaining at least some limited information for them would enhance the usefulness of SIPP for analysis of socioeconomic well-being in the United States. For example, for fuller analysis of some government programs, it is important to include institutionalized people who can still receive benefits under such programs as social security and SSI.35 It would also be useful to know about other sources of income for institutionalized people, such as private pensions, asset income, and transfers from relatives.

34Of children present in all 8 waves of the 1984 SIPP panel who lived with both parents in wave 1, 7 percent had experienced a change in the marital status of their parents by the end of the panel (Bianchi and McArthur, 1991:Table A). This is likely a lower bound estimate of children at risk of a marital separation to the extent that the weights do not adequately adjust for the higher attrition rates of children in nonintact families (McArthur, Short, and Bianchi, 1986:Table 6).

35Coder (1988:Table 9) found that 4 percent of SSI recipients in the first month of the 1984 SIPP panel had entered an institution by the end of the panel, as was also true for 3 percent of recipients of social security and veterans' payments and 1 percent of private pension recipients.
These data would be useful for analysis in their own right and also in conjunction with the data for the people in the household they left. We do not offer detailed suggestions about the kinds of data to collect for original sample members who enter institutions during the course of a SIPP panel, nor even about the preferred data collection mode (e.g., some items might best be obtained from the institution and others from the family). However, we do urge the Census Bureau to investigate the needs of users for information about institutionalized persons that falls within SIPP's goal of providing improved data on economic resources and assistance programs, particularly for people and families who may be economically at risk.

Recommendation 4-6: SIPP panels should treat all children who reside in interviewed households at the first wave and also children born during the course of a panel to original sample mothers as original sample members, who are followed if they move into households without an original sample adult. SIPP panels should also continue to follow and collect data for both original sample adults and children if they move into institutions.
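Recommendation 4-6 changes the decision procedure in two ways: wave 1 children (and children later born to original sample mothers) become original sample members in their own right, and a move into an institution no longer ends follow-up. A hypothetical sketch of the amended rule, with illustrative flag names (again, not Census Bureau code):

```python
# Hypothetical sketch of the following rules proposed in Recommendation 4-6.
# Flag names are illustrative.
def followed_under_recommended_rules(person, coresidents):
    """Decide whether a person is followed at a given wave.

    person      -- dict with boolean flags:
                   'original_sample_member'  adult OR child present at wave 1,
                                             or a child later born to an
                                             original sample mother
                   'left_universe'           e.g., died or moved abroad
    coresidents -- list of person dicts for the person's current household
    """
    if person['left_universe']:
        return False
    if person['original_sample_member']:
        # adults and children alike are followed for the life of the panel,
        # even into households of nonsample members or into institutions
        return True
    # people who join later are interviewed only while they reside with
    # an original sample member
    return any(p['original_sample_member'] for p in coresidents)
```

The contrast with current practice is that a wave 1 child who moves in with, say, a nonsample grandparent now satisfies the `original_sample_member` test and remains in the panel.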