Robert A. Moffitt, Constance F. Citro, and Michele Ver Ploeg
Academic and policy interest in the U.S. welfare system has increased dramatically over the past 15 years, an interest that has accelerated and is currently at an all-time high. Beginning in the late 1980s with welfare reform initiatives in a few states around the country and continuing in the first half of the 1990s as more states made changes in their income support programs, welfare reform culminated at the federal level with the passage of the Personal Responsibility and Work Opportunity Reconciliation Act (PRWORA) in 1996. PRWORA replaced the long-standing federal entitlement program for low-income families and children (Aid to Families with Dependent Children, AFDC) with a program financed by state-administered block grants, the Temporary Assistance for Needy Families (TANF) program. The legislation imposed several new requirements on state TANF programs, including lifetime limits on receipt of benefits, minimum work requirements, and requirements for unmarried teenage parents to reside with an adult and continue their education in order to receive benefits. Otherwise, it allowed states to configure their programs as they saw fit, continuing a trend of devolving the design and control of familial assistance programs from the federal government to state governments that began earlier in the 1990s.
The enactment of PRWORA provided the impetus for a large volume of research on its impact and that of changes in other federal income support programs, such as the Food Stamp Program. These studies are now yielding results and reporting new findings on an almost-daily basis. PRWORA is slated to come up for reauthorization in 2002, and it is already clear that research findings will play a significant role in the debate over the directions that welfare reform should take from here.
The Panel on Data and Methods for Measuring the Effects of Changes in Social Welfare Programs of the National Research Council was formed in 1998 to review the evaluation methods and data that are needed to study the effects of welfare reform. Sponsored by the Office of the Assistant Secretary for Planning and Evaluation (ASPE) of the U.S. Department of Health and Human Services through a congressional appropriation, the panel has issued interim and final reports (National Research Council, 1999, 2001).
Early in its deliberations, particularly after reviewing the large number of so-called “welfare leaver” studies—studies of how families who left the TANF rolls were faring off welfare—the panel realized that the database for conducting studies of welfare reform had many deficiencies and required attention by policy makers and research analysts. In its final report, the panel concluded that welfare reform evaluation imposes significant demands on the data infrastructure for welfare and low-income populations and that “…inadequacies in the nation’s data infrastructure for social welfare program study constitutes the major barrier to good monitoring and evaluation of the effects of reform” (NRC 2001:146). The panel concluded that national-level surveys were being put under great strain for PRWORA research given their small sample sizes, limited welfare policy-related content, and, often, high rates of nonresponse (see also National Research Council, 1998). State-level administrative data sets, the panel concluded, are of much more importance with the devolution of welfare policy but are difficult to use for research because they were designed for management purposes. In addition, although they have large sample sizes, their content is limited. Surveys for specific states with more detailed content have been only recently attempted—usually telephone surveys of leavers—and the panel expressed concern about the capacity and technical expertise of state governments to conduct such surveys of adequate quality. To date, for example, many surveys of welfare leavers have unacceptably high rates of nonresponse. Overall, the panel concluded that major new investments are needed in the data infrastructure for analysis of welfare and low-income populations.
This concern led the panel to plan a workshop on data collection on welfare and low-income populations for which experts would be asked to write papers addressing in detail not only what the data collection issues are for this population, but also how the quality and quantity of data can be improved. A workshop was held on December 16–17, 1999, in Washington, DC. The agenda for the workshop is listed as an Appendix to this volume. Approximately half the papers presented at the workshop concerned survey data and the other half concerned administrative data; one paper addressed qualitative data. Altogether, the papers provide a comprehensive review of relevant types of data. The volume also contains four additional papers that were commissioned to complement the conference papers. One of them discusses methods for adjusting survey data for nonresponse. The other three papers focus on welfare leavers, a subpopulation of particular interest to Congress that a number of states have studied with grants from ASPE, as well as the importance of understanding the dynamics of the welfare caseload when interpreting findings from these studies.
After the conference, the papers were revised, following National Research Council procedures, to reflect the comments of discussants at the workshop, panel members, and outside reviewers. The additional commissioned papers also were revised in response to comments from panel members and outside reviewers. This volume contains the final versions of the papers.
In this introduction, we summarize each of the 14 papers in the volume. Together, they are intended as a guide and reference tool for researchers and program administrators seeking to improve the availability and quality of data on welfare and low-income populations for state-level, as well as national-level, analysis.
The volume contains six papers on survey issues. They address (1) methods for designing surveys taking into account nonresponse in advance; (2) methods for obtaining high response rates in telephone surveys; (3) methods for obtaining high response rates in in-person surveys; (4) the effects of incentive payments; (5) methods for adjusting for missing data in surveys of low-income populations; and (6) measurement error issues in surveys, with a special focus on recall error.
In their paper on “Designing Surveys Acknowledging Nonresponse,” Groves and Couper first review the basic issues involved in nonresponse, illustrating the problem of bias in means and other statistics, such as differences in means and regression coefficients, and how that bias is related to the magnitude of nonresponse and the size of the difference in outcomes between respondents and nonrespondents. They also briefly review methods of weighting and imputation to adjust for nonresponse after the fact. The authors then discuss the details of the survey process, including the exact process of contacting a respondent and how barriers to that contact arise, noting that welfare reform may generate additional barriers (e.g., because welfare recipients are more likely to be working and hence not at home). They also provide an in-depth discussion of the respondent’s decision to participate in a survey, noting the importance of the environment, the respondent, and the survey design itself, and how the initial interaction between survey taker and respondent is a key element affecting the participation decision. They propose a fairly ambitious process of interviewer questioning, which involves contingent reactions to different statements by the respondent, a process that would require expert interviewers. They conclude with a list of 10 principles for surveys of the low-income population for improvement in light of nonresponse.
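The dependence of bias on the nonresponse rate and on the respondent-nonrespondent difference can be made concrete with a little arithmetic. The sketch below is an illustration only, not code from the Groves and Couper paper; it uses the standard deterministic identity that the respondent mean differs from the full-sample mean by the nonresponse rate times the difference between respondent and nonrespondent means. The function names and the example figures (a 50 percent nonresponse rate, employment rates of 0.60 and 0.40) are hypothetical.

```python
def full_sample_mean(nonresponse_rate, mean_respondents, mean_nonrespondents):
    """Mean over the whole sample, mixing respondents and nonrespondents."""
    return ((1 - nonresponse_rate) * mean_respondents
            + nonresponse_rate * mean_nonrespondents)


def respondent_mean_bias(nonresponse_rate, mean_respondents, mean_nonrespondents):
    """Bias of the respondent-only mean relative to the full-sample mean.

    bias = nonresponse_rate * (mean_respondents - mean_nonrespondents)
    """
    return nonresponse_rate * (mean_respondents - mean_nonrespondents)


# Hypothetical example: half the sample does not respond, and respondents
# are employed at a higher rate than nonrespondents.
rate = 0.5
emp_resp = 0.60
emp_nonresp = 0.40

bias = respondent_mean_bias(rate, emp_resp, emp_nonresp)
truth = full_sample_mean(rate, emp_resp, emp_nonresp)
print(f"full-sample mean: {truth:.2f}, respondent mean: {emp_resp:.2f}, "
      f"bias: {bias:.2f}")
```

The bias vanishes either when everyone responds or when respondents and nonrespondents do not differ on the outcome, which is why a low response rate alone does not establish that an estimate is biased.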
Cantor and Cunningham discuss methods for obtaining high response rates in telephone surveys of welfare and low-income populations in their paper, first identifying “best practices” and then comparing those to practices used in some welfare leaver telephone surveys. The authors note the overriding importance of recognizing language and cultural diversity among respondents and the need to take such diversity into account in designing content and deploying interviewers. They then discuss specific issues in increasing response rates, including obtaining contact information in the presurvey process (e.g., from administrative records); obtaining informed consent to gather information needed for subsequent tracking; address-related problems with mail surveys; methods for tracing hard-to-locate respondents; dealing with answering machines; the importance of highly trained interviewers, echoing the emphasis of Groves and Couper; considerations in questionnaire design, including the critical nature of the introduction; and refusal conversion. Cantor and Cunningham then review a set of telephone surveys of welfare recipients and welfare leavers. They find that response rates often are quite low and that use of the telephone alone only rarely will obtain response rates greater than 50 percent, which is a very low number by the traditional standards of survey research. They suggest that higher, acceptable response rates will almost surely require substantial in-person followup, which can move the response rate up above 70 percent. The authors note that nonresponse is mainly an issue of inability to locate respondents rather than outright refusals, which makes tracing and locating respondents of great importance. They find that many welfare records are of poor quality to assist in tracing, containing inaccurate and out-of-date locator information, and they emphasize that expertise in tracing is needed in light of the difficulties involved. Refusal conversion is also discussed, with an emphasis again on the need for trained interviewers in using this method.
Finally, the authors discuss random-digit dialing telephone surveys of this population (as opposed to surveys based on list samples such as those from welfare records) and explore the additional difficulties that arise with this methodology.
The paper by Weiss and Bailar discusses methods for obtaining high response rates from in-person surveys of the low-income population. The principles are illustrated with five in-person surveys of this population conducted by the National Opinion Research Center (NORC). All the surveys drew their samples from administrative lists, provided monetary incentives for survey participation, and applied extensive locating methods. Among the issues discussed are the importance of the advance letter, community contacts, and an extensive tracing and locating operation, including field-based tracing on top of office-based tracing. The authors also provide an in-depth discussion of the importance of experienced interviewers for this population, including experience not only in administering an interview, but also in securing cooperation with the survey. The use of traveling interviewers and the importance of good field supervisory staff and site management are then addressed.
In their paper, Singer and Kulka review what is known about the effects of paying respondents for survey participation (“incentives”). Reviewing both mail and telephone surveys, the authors report that incentives are, overall, effective in increasing response rates; that prepaid incentives are usually more effective than promised incentives; that money is more important than a gift; and that incentives have a greater effect when respondent burden is high and the initial response rate is low. They also note that incentives appear to be effective in panel surveys, even when incentives are not as high in subsequent waves of interviews as they are in the initial wave. After discussing the evidence on whether incentives affect item nonresponse or the distribution of given responses (the evidence on the issue is mixed), the authors review what little is known about the use of incentives in low-income populations. The little available evidence suggests, again, that incentives are effective in this population as well. The authors conclude with a number of recommendations on the use of incentives, including a recommendation that payments to convert initial refusals to interviews be made sparingly.
Mohadjer and Choudhry provide an exposition of methods for adjusting for missing data after the fact—that is, after the data have been collected. Their paper focuses on traditional weighting methods for such adjustment and includes methods for adjustment for noncoverage of the population as well as nonresponse to the survey. The authors present basic weighting methods and give examples of how variables are used to construct weights. They also discuss the effect of using weights derived from the survey sample versus weights obtained from outside data sets on the population as a whole. For population-based weights, they discuss issues of poststratification and raking that arise. Finally, they provide a brief discussion of the bias-variance tradeoff in designing weights, which is intrinsic to the nature of weights.
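To give a sense of what raking to population margins involves, the following is a minimal sketch of iterative proportional fitting. It is not taken from the Mohadjer and Choudhry paper; the function name `rake`, the record layout, the variables `sex` and `region`, and all totals are hypothetical, and the sketch assumes every category named in a margin appears in the sample with positive weight and that the margins sum to the same population total.

```python
from collections import defaultdict


def rake(records, margins, max_iter=100, tol=1e-8):
    """Adjust weights so weighted totals match each set of population margins.

    records: list of dicts, each with a 'weight' key plus categorical keys.
    margins: {variable: {category: population_total}} (assumed consistent).
    Returns a list of adjusted weights in the same order as `records`.
    """
    weights = [float(r["weight"]) for r in records]
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            totals = defaultdict(float)
            for r, w in zip(records, weights):
                totals[r[var]] += w
            # Scale every record so this variable's margin is matched.
            for i, r in enumerate(records):
                factor = targets[r[var]] / totals[r[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # all margins matched; stop iterating
            break
    return weights


def weighted_totals(records, weights, var):
    """Weighted total in each category of `var`."""
    totals = defaultdict(float)
    for r, w in zip(records, weights):
        totals[r[var]] += w
    return dict(totals)


# Hypothetical sample of four respondents with equal base weights.
sample = [
    {"sex": "F", "region": "urban", "weight": 1.0},
    {"sex": "F", "region": "rural", "weight": 1.0},
    {"sex": "M", "region": "urban", "weight": 1.0},
    {"sex": "M", "region": "rural", "weight": 1.0},
]
population = {
    "sex": {"F": 60.0, "M": 40.0},
    "region": {"urban": 70.0, "rural": 30.0},
}
raked = rake(sample, population)
print(weighted_totals(sample, raked, "sex"))
print(weighted_totals(sample, raked, "region"))
```

Poststratification can be viewed as the single-margin case, in which weights are scaled once to match known totals for one classification. Raking cycles through several margins when only the marginal totals, and not the full cross-classification, are known for the population.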
Measurement error is discussed in the paper by Mathiowetz, Brown, and Bound. The paper first lists the sources of measurement error in the survey process, which include the questionnaire itself; the respondent; the interviewer; and the conditions of the survey (interviewer training, mode, frequency of measurement, etc.). The authors then review issues relating to the cognitive aspects of measurement error and provide an extended discussion of the problem of questions requiring autobiographical memory. Other topics discussed in the paper include the issue of social desirability of a particular response; errors in response to sensitive questions; and errors in survey reports of earnings and income. A number of existing studies of measurement error are reviewed, but none are focused on welfare or low-income populations per se or on populations with unstable income and employment streams. The authors point out how earnings reports need to be based on salient events and give examples in which such salience is absent. A detailed review is then provided of what is known about measurement error in reports of transfer program income, child support income, hours of work, and unemployment histories. Finally, the authors list a number of issues that should be addressed that can help reduce measurement error, including proper attention by cognitive experts to comprehension of the question by respondents, care for the process of retrieval when writing questions, the use of calendars and landmark events, and a number of other questionnaire design topics. Methods for asking socially sensitive questions also are discussed.
Administrative records can be a valuable source of information about the characteristics and experiences of welfare program beneficiaries and past beneficiaries. To comply with federally mandated time limits on receipt of TANF benefits, states will need to develop the capability to track recipients over time, something not usually done in the old AFDC system. Such longitudinal tracking capability should make program records more useful for analysis; however, differences in programs across states will likely make it harder to conduct cross-state analyses. Research use of administrative records, whether TANF records or records from other systems (e.g., Unemployment Insurance) that can be used to track selected outcomes for welfare and low-income populations, poses many challenges.
Four papers on administrative data covering a wide range of different topics are included in the volume. The four address (1) issues in the matching and cleaning of administrative data; (2) issues of access and confidentiality; (3) problems in measuring employment and income with administrative data compared to survey data; and (4) the availability of administrative data on children.
Issues in the matching and cleaning of administrative data are discussed by Goerge and Lee. The authors begin by noting the importance of “cleaning” administrative data in a comprehensive sense, namely, converting what are management files into analytic files suitable for research use. They also note the importance of matching records across multiple administrative data sets (i.e., record linkage), which provides more information on respondents. A number of issues are involved in the cleaning process, many of which involve methods for assessing data quality and other aspects of the variables available in the administrative data. A number of important issues in record linkage also are discussed, perhaps the most important being the availability and accuracy of matching variables. The authors discuss deterministic and probabilistic record linkage as well as data quality issues in such linkage. The paper concludes with a number of recommendations on the cleaning and linking of administrative data.
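The distinction between deterministic and probabilistic linkage can be sketched briefly. The example below is illustrative only and does not reproduce the procedures in the Goerge and Lee paper. The field names, the toy records, and the agreement probabilities (`m` for true matches, `u` for non-matches) are all made-up assumptions, and the scoring rule is one common log-likelihood-ratio formulation of probabilistic matching rather than the method of any particular agency.

```python
import math


def normalize(value):
    """Lowercase and strip punctuation/spaces so formatting differences vanish."""
    return "".join(ch for ch in str(value).lower() if ch.isalnum())


def deterministic_match(rec_a, rec_b, keys):
    """Deterministic linkage: declare a match only if every key agrees exactly."""
    return all(normalize(rec_a[k]) == normalize(rec_b[k]) for k in keys)


def match_score(rec_a, rec_b, m_u):
    """Probabilistic linkage score: sum of per-field agreement weights.

    m_u maps field -> (m, u), where m is the assumed probability that the
    field agrees for a true match and u the probability for a non-match.
    Agreement adds log2(m/u); disagreement adds log2((1-m)/(1-u)).
    """
    score = 0.0
    for field, (m, u) in m_u.items():
        if normalize(rec_a[field]) == normalize(rec_b[field]):
            score += math.log2(m / u)
        else:
            score += math.log2((1 - m) / (1 - u))
    return score


# Toy records (entirely fictitious) from two hypothetical administrative files.
welfare_rec = {"ssn": "000-00-0001", "last_name": "Rivera", "birth_date": "1975-03-02"}
ui_rec_same = {"ssn": "000000001", "last_name": "RIVERA", "birth_date": "1975-03-02"}
ui_rec_typo = {"ssn": "000000007", "last_name": "Rivera", "birth_date": "1975-03-02"}
ui_rec_other = {"ssn": "000000009", "last_name": "Chen", "birth_date": "1980-11-20"}

KEYS = ["ssn", "last_name", "birth_date"]
# Hypothetical (m, u) values; in practice these would be estimated from data.
M_U = {"ssn": (0.95, 0.001), "last_name": (0.90, 0.05), "birth_date": (0.95, 0.01)}

for other in (ui_rec_same, ui_rec_typo, ui_rec_other):
    print(deterministic_match(welfare_rec, other, KEYS),
          round(match_score(welfare_rec, other, M_U), 2))
```

In this toy case, a single miskeyed identifier defeats the deterministic rule, while the probabilistic score still favors a match because the other fields agree. A real application would also need a threshold, chosen with attention to false matches and missed matches, above which pairs are accepted.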
Brady, Grand, Powell, and Schink discuss access and confidentiality issues with administrative data in their paper and propose ways for increasing researcher access to administrative data. The authors begin by noting that the legal barriers to obtaining access to administrative data by researchers often are formidable. Although laws in this area generally are intended to apply to private individuals interested in identifying specific persons, researcher access often is denied even though the researcher has no interest in identities and often intends to use the research results to help improve administration of the program. The authors provide a brief overview of the legal framework surrounding administrative data, confidentiality, and privacy, making a number of important distinctions between different types of issues and clarifying the content of several pieces of legislation—federal and state—governing access and confidentiality. They then turn to a review of how 14 ASPE-funded state welfare leaver studies have dealt with these issues and whether general lessons can be learned. The authors conclude that while success in dealing with access and confidentiality problems has been achieved in many cases, the methods for doing so are ad hoc, based on longstanding relationships of trust between state agencies and outside researchers, and not buttressed and supported by an adequate legal framework. Twelve key principles are laid out for governing data access and confidentiality. Finally, the authors recommend more use of masking methods as well as institutional mechanisms such as secure data centers to facilitate responsible researcher access to and use of confidential administrative data.
Hotz and Scholz review the measurement of employment and income from administrative data and discuss why and whether measures taken from administrative data differ from those obtained from survey data. Employment and income are, of course, two of the key outcome variables for welfare reform evaluation and hence assume special importance in data collection. They find that there often are differences in administrative and survey data reports of employment and income and that the differences are traceable to differences in population coverage, in reporting units, in sources of income, in measurement error, and in incentives built into the data-gathering mechanism. The authors provide a detailed review of the quality of employment and income data from, first, the major national survey data sets; then from state-level administrative data taken from Unemployment Insurance records; and, finally, from Internal Revenue Service records. They review what is known about differences in reports across the three as well. The authors conclude with several recommendations on reconciling potentially different results from these data sources.
Administrative data on children are discussed in the paper by Barth, Locklin-Brown, Cuccaro-Alamin, and Needell. The authors first discuss the policy issues surrounding the effects of welfare reform on children and what the mechanisms for those effects might be. They identify several domains of child well-being that conceivably can be measured with administrative data, including health, safety (child abuse and neglect), education, and juvenile justice. In each area, they find that a number of different administrative data sets could be matched, in principle, with welfare records. They identify the exact variables measured in each data set as well. The authors find that good health measures often are present in various data sets, but they are often inaccessible to researchers, while child abuse and neglect data are more often available but have many data quality issues that require careful attention. Education and juvenile justice data are the least accessible to researchers and also contain variables that would only indirectly measure the true outcomes of interest. The authors find that privacy and confidentiality barriers impose significant limitations on access to administrative data on children, similar to the finding in the paper by Brady et al.
Qualitative data increasingly have been used in welfare program evaluations and studies. Although there is a fairly long history of the use of process analysis in formal evaluations, there is less history in using direct observation of study respondents or even using focus groups. Yet in attempting to learn how current or former welfare recipients are faring, qualitative data can provide information that neither survey nor administrative data offer.
The paper by Newman discusses the use of qualitative data for investigating welfare and low-income populations. Newman notes that qualitative data can assist in helping to understand the subjective points of view of families in these populations, provide information on how recipients understand the rules of the welfare system, uncover unexpected factors that are driving families’ situations, explore any unintended consequences of a policy change, and focus attention on the dynamic and constantly changing character of most families in the low-income population. The author reviews a range of methods, from open-ended questions in survey questionnaires to focus groups to detailed participant observation in the field, in each case listing the advantages and disadvantages of the method. Newman then discusses the use of qualitative data in several recent welfare reform projects to illustrate how the methods can be used. The author concludes with a recommendation that additional expertise in qualitative data be brought into state governments and that the use of these methods increase.
WELFARE LEAVERS AND WELFARE DYNAMICS
An initial focus of concern of policy makers has been the effects of PRWORA on people who left AFDC and successor TANF programs—“welfare leavers.” In response to a congressional mandate, ASPE provided grant funds to states and counties to analyze administrative records and conduct surveys of two cohorts of welfare leavers. In fiscal year 1998, ASPE provided grant funds to 14 jurisdictions (10 states, the District of Columbia, and 3 counties or groups of counties) to study welfare leavers. In fiscal year 1999 it provided funds to one state to also follow welfare leavers, and to six jurisdictions (five states and one county group) to study those who were either formally or informally diverted from enrolling for TANF—“divertees.”
In its interim and final reports (National Research Council, 1999, 2001), the panel commented on some problems with leaver studies. These problems include differences in welfare caseload trends across states, such as faster declines in welfare rolls in some states than others and earlier program changes in states that sought AFDC waiver provisions, both of which could affect the comparability of data for cohorts of welfare leavers defined at a point in time. Also, states do not define leavers in the same way; for example, some states count “child-only cases” as leavers and others do not. (In such cases, adult members of a family are not eligible for benefits but the children are.) The panel also emphasized the need for leaver studies to categorize sample cases by their previous welfare behavior, distinguishing among people who had been on welfare for a long period, those who had been on for only a short period, and cyclers (i.e., those alternating periods of welfare receipt with periods of nonreceipt). To illustrate the problems in welfare leaver studies and best practice in such analyses, the panel commissioned three papers.
The first paper on this topic, “Studies of Welfare Leavers: Data, Methods, and Contributions to the Policy Process” by Acs and Loprest, reviews existing welfare leaver studies, including those funded by ASPE and others. It describes the definitions, methods, and procedures used in each study and identifies their strengths and weaknesses. The paper also compares some findings of leaver studies across studies that use different methodologies to illustrate points about comparability.
The second paper, “Preexit Benefit Receipt and Employment Histories and Postexit Outcomes of Welfare Leavers” by Ver Ploeg, uses data from the state of Wisconsin to analyze welfare leavers. The analysis breaks the sample members into “long-termers,” “short-termers,” and “cyclers” and shows that this categorization is important for understanding outcomes for these groups. The paper also stratifies the sample by work experience prior to leaving welfare and finds that there are sizable differences in employment outcomes across groups with more work experience compared to those with less work experience and that such categorizations also can be useful in understanding outcomes of leavers.
The last paper in this section and the final paper in the collection, “Experience-Based Measures of Heterogeneity in the Welfare Caseload” by Moffitt, uses data from the National Longitudinal Survey of Youth to construct measures of heterogeneity in the welfare population based on the recipient’s own welfare experience. A number of classifications of women in the U.S. population are used to characterize the amount of time they have spent on welfare, the number of welfare spells they have experienced, and the average length of their welfare spells. The same long-termer, short-termer, and cycler distinctions are used in the paper as well. The analysis of the characteristics of these groups reveals that short-termers have the strongest labor market capabilities but, surprisingly, that cyclers and long-termers are approximately the same in terms of labor market potential. More generally, the only significant indicator of labor market capability is the total amount of time a recipient has been on welfare, not the degree of turnover or lengths of spells she experiences. The analysis suggests that welfare cycling is not a very useful indicator of a recipient’s labor market capability and that the nature of welfare cyclers and the reasons that cycling occurs are not well understood.
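The experience-based measures named above (total time on welfare, number of spells, and average spell length) are straightforward to compute from a monthly receipt history. The sketch below is a hypothetical illustration and does not reproduce the definitions used in the Ver Ploeg or Moffitt papers. In particular, the thresholds `cycler_min_spells` and `long_term_min_months` are arbitrary assumptions chosen only to make the example run.

```python
def spell_lengths(history):
    """Lengths of consecutive runs of receipt in a monthly True/False history."""
    lengths = []
    current = 0
    for on_welfare in history:
        if on_welfare:
            current += 1
        elif current:
            lengths.append(current)
            current = 0
    if current:  # history ends while still on welfare
        lengths.append(current)
    return lengths


def summarize(history):
    """Total months of receipt, number of spells, and mean spell length."""
    spells = spell_lengths(history)
    total = sum(spells)
    count = len(spells)
    return {
        "total_months": total,
        "num_spells": count,
        "mean_spell_length": total / count if count else 0.0,
    }


def classify(history, cycler_min_spells=3, long_term_min_months=24):
    """Assign a category using hypothetical thresholds (not from the source)."""
    stats = summarize(history)
    if stats["num_spells"] == 0:
        return "no receipt"
    if stats["num_spells"] >= cycler_min_spells:
        return "cycler"
    if stats["total_months"] >= long_term_min_months:
        return "long-termer"
    return "short-termer"


# Hypothetical monthly histories (True = received benefits that month).
long_history = [True] * 30
cycling_history = ([True] * 3 + [False] * 3) * 4
short_history = [True] * 5 + [False] * 10

for name, hist in [("long", long_history), ("cycling", cycling_history),
                   ("short", short_history)]:
    print(name, summarize(hist), classify(hist))
```

Because the categories depend entirely on where the cutoffs are drawn and on the length of the observation window, results from studies that define these groups differently are not directly comparable.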
National Research Council 1998 Providing National Statistics on Health and Social Welfare Programs in an Era of Change, Summary of a Workshop. Committee on National Statistics. Constance F. Citro, Charles F. Manski, and John Pepper, eds. Washington, DC: National Academy Press.
National Research Council 1999 Evaluating Welfare Reform: A Framework and Review of Current Work. Panel on Data and Methods for Measuring the Effects of Changes in Social Welfare Programs. Robert A. Moffitt and Michele Ver Ploeg, eds. Washington, DC: National Academy Press.
National Research Council 2001 Evaluating Welfare Reform in an Era of Transition. Panel on Data and Methods for Measuring the Effects of Changes in Social Welfare Programs. Robert A. Moffitt and Michele Ver Ploeg, eds. Washington, DC: National Academy Press.