The study panel is charged with evaluating, to the extent possible, the impact of the Survey of Income and Program Participation (SIPP) redesign on the burden to respondents participating in this longitudinal survey. Respondent burden is a concept that is difficult to define and even more difficult to measure. In this chapter, the report looks first at the literature on the components of burden and how those components might be measured. It then compares the redesigned SIPP with its predecessor design with respect to the measurable components of burden and provides recommendations to improve measurement of burden in future SIPP implementations.
In his seminal paper, Bradburn (1978) noted, “The topic of respondent burden is not a neat, clearly defined topic about which there is an abundance of literature.” While much work on respondent burden has been conducted in the following decades, respondent burden remains an underdeveloped area of research. As Jones (2012, p. 1) observed, this is still “not a straightforward area to discuss, measure, and manage.” More recently, Yan and colleagues (2016, p. 3) stated that “conceptualizations and measures of burden are still underdeveloped and, as a result, findings from empirical research in this area remain equivocal.”
Kenneth Darga created one useful typology of the components of respondent burden in his discussion at a workshop, reported in Benefits, Burdens, and Prospects of the American Community Survey (National Research Council, 2013). He suggested that one component of respondent
burden is the demands that the survey puts on the respondent. The variables associated with this component can generally be measured. They include such things as length of interview, number of different times a respondent is interviewed, and difficulty the respondent has in retrieving and reporting the requested information. He suggested that the second major component of respondent burden is the perceived cost to the respondent of revealing personal information. In an era of heightened ethnic/minority sensitivities, identity theft, and concerns about government intrusiveness, it may be “impossible for the Census Bureau to fully anticipate which data items might be deemed sensitive by which particular individual” (National Research Council, 2013, p. 157). Thus, this major component of respondent burden is particularly difficult to measure.
In another schema, Yan and colleagues (2016) divide respondent burden into three broad categories. Their first category comprises “properties of surveys or tasks that are believed to impose respondent burden.” They list as examples of this category the length of interview and the difficulty of the response task. For longitudinal panel surveys the number of interviews or rounds would be an additional component in this category. In the second category are respondents’ attitudes and beliefs about surveys. These include respondents’ judgments about the importance of the survey and its topic, their interest in the survey/topic, and their perception of time and effort expended. The authors noted that these topics may act as “mediators” that affect the “perception of burden,” but they are not direct measures of burden. Their third category covers “burden as perceived by respondents through respondent behaviors.” Examples include a respondent’s willingness to do a re-interview and a feeling of being exhausted by an interview process.
A main takeaway from this work by Bradburn (1978), Darga (National Research Council, 2013), and Yan and colleagues (2016), as well as the work of others (Ampt, 2003; Apodaca et al., 1998; Jones, 2012; Kalton and Citro, 1995; Schoeni et al., 2013), is that respondent burden combines objective and subjective elements. There is general agreement that the key objective indicators are (1) length of interviews, (2) number of interviews (if a panel design is employed), and (3) difficulty for the respondent in providing accurate answers to the questions. This chapter discusses each of these indicators in turn, below.
Length of Interview
Length of interview is an objective measure that can be documented by such metrics as mean time to complete an interview and number of items asked. Longer surveys are generally assumed to be more burdensome than shorter ones. However, surveys of the same length vary greatly in how burdensome they are judged to be (Burchell and Marsh, 1992). Engaged respondents and those not pressed for time rate a given survey as less burdensome than those who find the same survey uninteresting (or worse) and those who are busy or whom the survey keeps from preferred activities. There is clear evidence of a Boys Town effect (“He ain’t heavy, he’s my brother”), in which perceived value mitigates the objective burden. In addition, characteristics of the survey itself affect the perception of burden. In an experimental comparison in which the same instrument was administered face-to-face or by telephone, 35.5 percent of those interviewed by phone said it was too long, versus only 10.1 percent of those responding in person (Groves and Kahn, 1979). Finally, characteristics of the interviewer (when there is one) affect respondent burden. It is the study panel’s view that an interviewer who establishes good rapport can notably reduce subjective respondent burden.
Number of Interviews
As a component of objective burden, the number of survey interviews has been studied less extensively than length of a particular survey. In general, this component only applies to panel surveys, and it is difficult to judge the cumulative response burden associated with additional interviews. Still, the available research (Kalton and Citro, 1995; Schoeni et al., 2013) indicates that respondent burden does both objectively and subjectively increase with the number of waves in which a respondent is asked to participate.
Difficulty of the Interview Process
Difficulty can be broken down into cognitive requirements to complete the task and the sensitivity or social desirability factors associated with the interview. Cognitive aspects include both the difficulty of understanding what information the questions are seeking and how difficult it is to process the request and supply the desired information.
Cognitive difficulty is determined in part by the substance of the desired information. For example, information requests that require precise details such as exact dates and counts are inherently more cognitively difficult than items asking for only general and approximate information. For a given set of requested information, cognitive burden may be alleviated (or aggravated) by the wording and format of the questions. For example, questions using confusing terms or technical jargon will be more difficult than those that are clear and easy to follow. Cognitive difficulty also varies
by the characteristics of respondents. For example, better-educated respondents may find it easier to understand and process certain technical items. Other factors related to cognitive difficulty are the respondent’s language proficiency (e.g., whether the items are in a person’s main or strongest language); literacy (if a self-completion questionnaire or other written material is utilized); physical/sensory traits such as deficits in eyesight, color vision, or hearing; and memory/dementia issues (sometimes related to aging and/or health status). Cognitive difficulty is also related to how the substance of items relates to respondents. For example, some people are more numerically adept and thus better at both reporting amounts and making mathematical calculations (e.g., amounts per unit of time or average amounts).
Sensitivity and Social Desirability of Questions
Sensitivity and social desirability factors influence whether respondents are willing to report, and report truthfully, about information (especially behaviors) that may be seen as reflecting negatively on themselves. This typically involves topics that are deemed to be personal or private (e.g., income) or topics that reflect statuses and behaviors that are illegal, unethical, or otherwise generally disapproved of by society (e.g., illegal drug use, committing other crimes, excessive alcohol consumption, deviant or uncommon sexual activity, cheating on tests). Such items increase respondent burden in several ways. First, they may make many people think about negative experiences that made them feel threatened or uncomfortable and thus increase the burden of the interview. Second, because many people are reluctant to share personal or negative information, they need to decide how to handle such items and may decide to refuse to answer them or to give an untruthful response (e.g., denying drug use or inebriation). Such response decisions may take more time and effort than a straightforward, truthful response would. Third, people may have lingering concerns about either having divulged unfavorable information or about lying to conceal that information. Such concerns may be perceived by the respondent as increasing the burden of responding.
Respondent burden related to sensitivity or social desirability has both objective and subjective elements. On the objective side, certain information is generally considered to be private and/or negative by society (Bradburn, 1978). That is, society has made a value judgment that the requested information is private, negative, or socially undesirable. Such items often become sensitive only if a respondent has engaged in the negative or suspect activity. For example, if a respondent has not used illegal drugs or committed an aggravated assault, then she can simply answer questions about such activities without any particular concern about the sensitivity of the questions. A subjective component also exists because not all judgments as to what
is private or negative are universally held in any given society. Likewise, a subjective component is present in how a particular respondent decides to handle a request for such information.
Guidance from Literature
Following from the preceding discussion, various steps can be taken to reduce both the objective and subjective components of respondent burden. These would generally include shortening questionnaires, reducing the number of waves in panel studies, making items simpler to both understand and answer, training interviewers in gaining and maintaining rapport, and customizing the interview to handle differences across respondents in cognitive ability. It has also been suggested that offering an incentive to respondents may reduce respondent burden. There is clear evidence that incentives increase response rates, but there is a difference of opinion as to whether they reduce respondent burden. (See discussion of incentives in Chapter 5, section “Incentive Experiments in the 2014 Redesigned SIPP.”)
As the above literature review shows, respondent burden is a concept with a number of dimensions, both objective and subjective. Focusing on objective measures only, the study panel attempted to assess whether there was an overall change in respondent burden resulting from the SIPP redesign. The panel examined length of the interview, number of interviews, the difficulty of the interview process, nonresponse rates, and sample loss rates between waves.
Length of Interview

An objective component of respondent burden is the amount of time an interview takes to complete. The average interview times for waves 1 and 2 of the 2014 SIPP panel are given in Table 8-1. Interview times for wave 3 were not available for this report. In wave 1, the average length of the interview for a household (survey unit) was about 104 minutes. The table also provides mean interview times for single-person (adult 15+ years) households, households with two adults, and households with more than two adults. The single-person households’ average length of interview is almost 68 minutes. An additional adult in the household increased the average length of interview for a household to about 102 minutes. Households with more than two members experienced interviews of about 134 minutes, on average.
TABLE 8-1 Mean Interview Times, Completed In-Person Interviews, 2014 Survey of Income and Program Participation Panel for Calendar Year 2013 (wave 1) and Calendar Year 2014 (wave 2), by Household Size
| Household Size | Wave 1 N | Wave 1 Mean Minutes | Wave 2 N | Wave 2 Mean Minutes |
| --- | --- | --- | --- | --- |
| Households with >2 Members | 9,055 | 133.53 | 5,470 | 118.37 |
- Interview duration was calculated using audit trail files associated with each household.
- The clock starts as soon as the interviewer enters the instrument and stops when the interviewer exits the instrument each day. If a household is interviewed across several days, each day’s interview duration is summed. If there are long pauses (> 15 minutes) during the interview, the clock is stopped at the last active time/date and restarted when the interview resumes.
- This table excludes households with interview duration less than 15 minutes. Four households were excluded because there was no audit trail associated with these cases.
SOURCE: Panel generated with data from the U.S. Census Bureau.
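The duration rule described in the table notes can be sketched in code. Below is a minimal sketch in Python, assuming the audit trail reduces to a sorted list of activity timestamps per household; the data layout and function name are hypothetical illustrations, not the Census Bureau’s actual processing.

```python
from datetime import datetime, timedelta

# Pauses longer than this stop the clock, per the table notes.
PAUSE_THRESHOLD = timedelta(minutes=15)

def interview_duration(timestamps):
    """Sum the active interview time over a sorted list of audit-trail
    activity timestamps.  Any gap longer than PAUSE_THRESHOLD (including
    overnight breaks between interview days) is excluded, so sessions
    spread across several days are summed automatically."""
    total = timedelta()
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap = later - earlier
        if gap <= PAUSE_THRESHOLD:  # clock runs only while activity continues
            total += gap
    return total

# Hypothetical audit trail: 35 active minutes, a 45-minute pause
# (excluded), then 15 more active minutes.
base = datetime(2014, 2, 3, 9, 0)
trail = [base + timedelta(minutes=m) for m in (0, 5, 12, 20, 28, 35, 80, 88, 95)]
print(interview_duration(trail))  # 0:50:00
```

Note that under this rule a gap of exactly 15 minutes still counts as active time; the clock stops only for pauses longer than 15 minutes.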
The average length of interview in wave 2 was shorter than in wave 1. Average length of interview for all households was about 92 minutes, approximately 12 minutes shorter than in wave 1. The reduction was across the board: single-person households averaged 61 minutes (7 minutes shorter than wave 1); two-person households averaged 90 minutes (12 minutes shorter); and households with more than two persons averaged 118 minutes (16 minutes shorter). There are several possible reasons for the decrease from wave 1 to wave 2. The most likely contributors are (1) household respondents were familiar with the survey by wave 2 and therefore needed less background information and had fewer questions, and (2) certain data (such as roster information) were preloaded into the wave 2 instrument, so confirming that information took less time than gathering it initially in wave 1. Another possible contributor is a change in the interviewed population between waves: were the wave 1 respondents with longer interview times less likely to agree to a wave 2 interview? This is an open question, and the panel did not have the data needed to investigate it.
The 2008 SIPP panel required three interviews to obtain 12 months of income data for each person and household. Were these 4-month recall interviews shorter than the single annual interview used in the redesigned
SIPP to get data for the same number of months? Unfortunately, the study panel was unable to make direct comparisons. Census Bureau staff determined that the length-of-interview data for both the 2008 SIPP panel and the 2004 SIPP panel were unusable due to problems with the software. Because there is no information on length of interview for these earlier SIPP panels, there is no way to directly compare interview times across the two designs.
As a proxy, the Federal Register Notices for SIPP panels can provide a sense of the expected interview times in each panel. The October 2, 2013, Federal Register Notice for the 2014 SIPP panel projected that “interviews [will] take approximately 60 minutes per adult on average.” This estimate is roughly consistent with the average interview lengths achieved for single-person households in the 2014 SIPP (68 minutes in wave 1 and 61 minutes in wave 2). A Federal Register Notice from June 27, 2007, gives the “estimated time for response” as 30 minutes for interviews in the 2008 panel, suggesting a much shorter interview per adult. However, because three interviews were needed in the 2008 panel to obtain 12 months of income and labor force data about an adult respondent, a very rough approximation of the time needed to collect 12 months of data is 90 minutes per adult household member.
Number of Interviews
The second component of “objective” respondent burden, which is important for panel surveys, is the number of interviews required to create critical estimates from the panel. In wave 1 of the 2014 SIPP panel, interviewers personally visited each address selected into the SIPP sample. During this one personal visit, the interviewer captured (or sought to capture) 12 months of information.1 In the 2008 SIPP panel, three interviews were used to reduce the recall period to 4 months when obtaining 12 months of data. Thus, the redesigned SIPP requires a single interview (likely longer than each of the three interviews in the previous design) to obtain a year of data, while the older SIPP design required three interviews (each likely shorter than the annual interview) to collect a full year of data. If the panel lengths (number of years for which a panel is in the survey) and other factors remain the same, this indicator (number of interviews) could imply less burden in the redesigned SIPP. However, confirming that implication requires also determining whether respondents perceive three shorter interviews to be more or less burdensome than a single longer questionnaire covering the same reference period.
1 The interviewer also collected data on the months between the end of the 12-month reference period and the interview, which can occur 1 to 5 months after the end of the reference period.
Another complexity in using the number of interviews as a measure of burden is the addition of a supplemental survey for the Social Security Administration to collect data that had been dropped from the 2008 SIPP questionnaire and its topical modules as part of the redesign. This additional survey, fielded between the first and second waves of the 2014 SIPP panel to satisfy this important SIPP client, added another interview to the overall program.
Difficulty of the Interview Process
As addressed in the literature review, the difficulty of the interview process can be affected by the cognitive requirements of the process and by the sensitivity of the questions. Because the basic subject matter of SIPP did not change in the redesign, the report focuses on the potential change in cognitive requirements.
The redesign changed the recall period for most questions from 4 to 12 months. Accurate recall poses a cognitive challenge in a complex survey like SIPP. Tripling the length of the recall period may require respondents to think harder or leave them unsure of the appropriate responses, especially for a change in status (a job change, for example) within the longer reference period. This may result in additional perceived burden. The event history calendar (EHC) was incorporated into the redesigned instrument to assist the interviewer in obtaining accurate spells of activity. This approach offers a potentially significant advantage over the standard questionnaire: it allows respondents and interviewers to move back and forth across programs and activities, reporting events that are linked in time rather than being forced to follow the sequence built into the instrument’s structure. Used in this manner, the EHC can mitigate the cognitive challenges of an annual recall of certain events. However, the study panel learned that this advantage was not fully realized: interviewers only infrequently used the EHC in this way, and the respondent, of course, does not see the calendar. (See discussion in Chapter 5 about recorded interviews.)
Has the redesign of SIPP changed the cognitive difficulty of responding to this complex survey? There were changes that had the potential to make the process more cognitively difficult, namely the tripling of the reference period. There were changes that had the potential to make the process less cognitively difficult, namely the modification to the wording of some questions and the introduction of the EHC with the instrument. In addition, many questions were dropped from the redesigned survey questionnaire, with potential reduction in overall burden, but a supplement was added to replace some of these questions. Unfortunately, no evidence is available that would enable the panel to understand the respondents’ point of view on these matters and what their perceptions of burden might be.
Nonresponse and Sample Loss Rates
Response rates and interview breakoffs can be used as proxies for respondents’ perception of burden and/or their willingness to be burdened. Comparisons should be treated with caution because many factors enter into a person’s decision whether to participate in a survey. In the case of a longitudinal survey, comparing one design to another based on response rates is problematic at best. With that disclaimer, the report assesses sample loss rates for any added insight into respondent burden.
The SIPP redesign allows one to take several approaches to comparing nonresponse rates across the two SIPP designs. Higher nonresponse rates for the first interview of each panel could be loosely construed as reflecting respondents’ perception of the burden as conveyed by advance survey materials and initial discussions with the interviewer. Under this interpretation, the higher the nonresponse observed in wave 1, the higher the upfront perception of burden by respondents. The 2014 SIPP experienced higher nonresponse in the first wave than the earlier design (see Table 7-20 and related discussions in Chapter 7). Wave 1 of the 2008 panel experienced a 19.2 percent nonresponse rate, whereas wave 1 of the 2014 panel had a 29.9 percent nonresponse rate. Six years separate these two panels, and any number of other factors may have contributed to the larger nonresponse rate in 2014, including factors behind a general downward trend of response on most government surveys.
Another approach to using nonresponse rates for insight regarding respondent burden in a longitudinal survey is to compare the sample loss rates for wave 2 across survey designs. Wave 2 nonresponse occurs after the respondent has responded to and experienced one 2014 SIPP interview. The respondent has good awareness of the nature and depth of SIPP questions on socioeconomic topics and the length of the time commitment to complete a questionnaire. The additional sample loss at wave 2 of the 2008 panel was 6.7 percent, while the additional sample loss for the 2014 panel was 18.1 percent. Again, there is about a 6-year difference in the implementation of wave 2 of the 2014 panel, as well as a longer period of time between the wave 1 and wave 2 interviews (12 months versus 4 months for the 2008 panel). Both of these factors may have contributed to more difficulty tracking the previous wave’s respondents who had moved in the interim.2 The wave 2 sample loss rate for the redesigned SIPP is substantially higher than for the 2008 panel.
2Chapter 7 discusses Type D nonresponse, which includes inability to locate. Type D nonresponse increased for the 2014 SIPP wave 2, compared to wave 2 of the 2008 panel. As a share of total nonresponse, however, Type D nonresponse remains much smaller than nonresponse from refusals and noncontacts.
A third way one can attempt to use nonresponse for insight into respondent burden is to compare the sample loss after one wave of the 2014 SIPP panel, a wave that provides 12 months of income and socioeconomic information, with the sample loss after three waves (12 months of data) of the 2008 SIPP panel. The sample loss after wave 1 of the 2014 SIPP panel was 29.9 percent. The sample loss at the completion of three waves of the 2008 SIPP panel, 29.0 percent, was essentially identical. A similar comparison can be made between the cumulative loss after 2 waves of the 2014 SIPP and the cumulative loss after 6 waves of the 2008 SIPP panel, because waves 4, 5, and 6 together provide an additional 12 months of information. The cumulative sample loss for the 2014 SIPP panel at wave 2 was 48.0 percent. After wave 6 of the 2008 panel (after the collection of 2 years of data), the cumulative sample loss was 35.5 percent. Similarly, wave 3 of the 2014 SIPP panel experienced a cumulative sample loss of 58.3 percent, whereas the 2008 panel, after nine interviews, experienced a cumulative sample loss of 39.7 percent.
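The pairings in this comparison can be laid out as simple arithmetic on the loss rates quoted above, aligning waves that cover the same number of reference months (one 2014 wave against three 2008 waves). The following sketch uses only the figures cited in the text:

```python
# Cumulative sample loss (percent of the original sample), as cited above.
loss_2014 = {1: 29.9, 2: 48.0, 3: 58.3}   # one wave = 12 months of data
loss_2008 = {3: 29.0, 6: 35.5, 9: 39.7}   # three waves = 12 months of data

# Align waves covering the same reference months and take the gap in loss.
for w in (1, 2, 3):
    gap = loss_2014[w] - loss_2008[3 * w]
    print(f"{12 * w} months of data: 2014 loss exceeds 2008 loss by {gap:.1f} points")
```

The gap grows from under 1 point at 12 months of data to more than 18 points at 36 months, which is the pattern behind the overall assessment of higher sample loss under the redesign.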
Based on the scenarios above, nonresponse and sample loss suggest that the 2014 panel, with generally higher sample loss rates, could be perceived as more burdensome than the 2008 panel.
According to the literature, respondent burden combines objective and subjective elements. There are no data to quantify the subjective burden perceived by respondents of the redesigned SIPP. The study panel was therefore limited to looking only at the main objective measures: length of interviews, number of interviews, and difficulty of interview. In addition, the panel examined nonresponse rates and sample loss rates at stages in the longitudinal process and summarized the following findings:
Finding 8-1: Interview times could not be adequately compared because of the absence of reliable information for the 2008 panel.
Finding 8-2: The redesigned 2014 panel reduced the number of interviews compared to the old design, which could reduce respondent burden.
Finding 8-3: SIPP is a complex, cognitively difficult survey. Increasing the recall period likely contributes to an increase in perceived burden. The EHC provides the potential to mitigate some, but not all, of this added difficulty. The sample of recorded interviews to which study panel members listened indicated that interviewers did not use the EHC in a way that would maximize its effectiveness.
Finding 8-4: Nonresponse rates and sample loss rates for comparable panel durations are higher for the redesigned SIPP than for the 2008 SIPP panel. Many factors affect response. However, these increases could indicate some increase in perceived burden by respondents, or at least a decrease in their willingness to be burdened.
The panel’s analysis of these objective measures suffered from insufficient data, and no reliable determination is possible.
CONCLUSION 8-1: The panel could not make a determination as to whether the redesign of the Survey of Income and Program Participation affected respondent burden, either positively or negatively.