Survey Design Options for the Measurement of Persons with Work Disabilities
Nancy A. Mathiowetz, Ph.D.
As noted in a recent publication, “Capturing the essential medical, physical and social aspects of disability by means of survey data is a difficult task” (Altman, 2001). The complexity stems, in part, from differences in conceptual models of the enablement–disablement process and alternative interpretations of various conceptual models. The incongruity between various conceptual models of disability and the Social Security Administration’s (SSA’s) model based on its statutory definition of work disability adds further complexity to the measurement process when one is particularly interested in estimating the pool of potential applicants or the number of those who would be classified as persons with work disabilities as a result of SSA’s benefits decision process. The former requires that the survey questions represent an accurate operationalization of the SSA statutory definition; the latter requires additional information related to the SSA decision process.
Given the complexity of the phenomena of interest, agreement with respect to the conceptual model does not imply consistency in the
operationalization of the concept. Alternative operationalizations are evident in the variety of types and numbers of questions used in various surveys to measure functioning and participation. Empirical evidence suggests that even minor variations in how one operationalizes the concept of disability can result in significant variations in the estimates of the population of persons with disabilities in the United States (see, for example, McNeil, 1993). Consideration of alternative design options for household-based survey measures of the population of persons with work disabilities requires that SSA invest in understanding how divergent measurement may affect estimates of the potential pool of applicants for SSA benefits as well as estimates of the population of potential beneficiaries.
The task of using household-based surveys for the measurement of persons with work disabilities is further complicated by the dynamic state of the field at the time of this writing. A number of research activities, both within the United States and internationally, related to the measurement of disability via the survey interview process will most likely result in major changes to question wording and questionnaire design over the next decade. Several federal statistical agencies, including, but not limited to, the Bureau of Labor Statistics and the Bureau of Justice Statistics, are testing questionnaires to meet legislative or executive branch mandates related to the production of statistics by disability status. The adoption of the second revision of the International Classification of Impairments, Disabilities, and Handicaps (renamed ICIDH-2: International Classification of Functioning, Disability, and Health) by the 54th World Health Assembly provides a classification system and framework for the development of disability measures in surveys. Much of the international research on disability measurement is focused on the development of valid and reliable instruments that map conceptually to the ICIDH-2, including measurement of the environment. Other research activities are attempting to address gaps in the current state of knowledge concerning the measurement error properties of disability statistics.
In light of the challenges facing SSA, this paper attempts to outline and discuss design options related to the measurement of persons with work disabilities and the measurement of persons eligible for SSA benefits. The paper examines a broad range of alternatives for SSA to consider, ranging from the development of its own data collection and measurement system to the use of other federal data collection efforts for ongoing monitoring. We begin by examining the disparities between the definition of persons with work disabilities used by SSA and the models
underlying the measurement of disability in other surveys. In addition, we review what is currently known with respect to the error properties of survey-based estimates of the population of persons with work disabilities.
CONCEPTUAL ISSUES TO CONSIDER IN THE MEASUREMENT OF PERSONS WITH WORK DISABILITIES
As noted above, one of the issues of concern in the measurement of persons with work disabilities involves differences in the conceptual models underlying the measurement. The Social Security Act defines disability as the “inability to engage in any substantial gainful activity by reason of a medically determinable physical or mental impairment which can be expected to result in death or can be expected to last for a continuous period of not less than 12 months” (Section 223(d)(1)). The SSA definition implies a direct relationship between an individual’s attributes—specifically related to pathology, impairment, and functional limitation—and work disability. In contrast, Nagi (1991, p. 317) and other contemporary theorists characterize disability as a “… relational concept; its indicators include individuals’ capacities and limitation, in relation to role and task expectation, and the environmental conditions within which they are to be performed.” As noted by Jette and Badley (2000, p. 17):
The fundamental conceptual issue of concern is that a health-related restriction in work participation may not be solely or even primarily related to the health condition itself or its severity. In other words, although the presence of a health condition is a prerequisite, “work disability” may be caused by factors external to the health condition’s impact on the structure and functioning of a person’s body or the person’s accomplishments of a range of activities.
Most measures of work disability currently in use in U.S. federal surveys assume (or imply) that work disability relates to an individual’s attributes with respect to functional limitations; almost all such questions leave it to the respondent to attribute his or her labor force participation to an underlying health condition. However, the movement in the measurement of persons with disabilities and persons with work disabilities is toward measures that incorporate an understanding and assessment of the external factors that influence participation by individuals in work. It is conceivable that as new measures for the assessment of persons with work disabilities are developed and adopted in ongoing federal data collection efforts—that is, measures that incorporate an assessment of the environment (including accommodations, adaptations, and barriers)—the
discrepancy between survey-based estimates of the population with work disabilities and the population eligible for SSA benefits will increase. To the extent that SSA may rely on survey-based measures of disability drawn from non-SSA-sponsored surveys, it will be imperative for SSA to understand how such measures relate (both conceptually and statistically) to the statutory definition of work disability.
METHODOLOGICAL ISSUES IN THE MEASUREMENT OF WORK DISABILITY IN SURVEYS
As was evident in the report of the Workshop on Survey Measurement of Work Disability (Mathiowetz and Wunderlich, 2000), the measurement of persons with work disabilities via household-based surveys is subject to various sources of error that may or may not result in bias in the estimate of interest. Although not unique to the measurement of persons with disabilities, the complexity of the concept as well as the very nature of the phenomena of interest suggests a need to be particularly vigilant with respect to the potential impact of errors of both observation and nonobservation on estimates of the population. There is little to no research examining the impact of nonobservation (both noncoverage and nonresponse) on estimates of the population of persons with disabilities or persons with work disabilities. One could speculate on the non-ignorable nature of nonresponse for such estimates, hypothesizing that persons with disabilities are less likely to participate in household-based surveys. For example, the nature of the disability (e.g., sensory) may limit participation for particular modes of data collection. Empirical investigations are needed to understand the extent to which errors of nonobservation bias survey-based estimates of persons with disabilities.
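The mechanics of the concern can be sketched with the standard deterministic decomposition of nonresponse bias, in which the bias of a respondent-only estimate equals the nonresponse rate times the difference between respondents and nonrespondents. The numbers below are purely illustrative assumptions, not empirical estimates of any survey.

```python
# Deterministic nonresponse bias decomposition for a prevalence estimate:
#   bias(y_bar_r) = W_nr * (y_bar_r - y_bar_nr)
# where W_nr is the nonresponse rate and y_bar_r, y_bar_nr are the mean
# outcomes (here, disability prevalence) among respondents and nonrespondents.

def nonresponse_bias(prev_respondents: float,
                     prev_nonrespondents: float,
                     nonresponse_rate: float) -> float:
    """Bias of the respondent-only prevalence relative to the full population."""
    return nonresponse_rate * (prev_respondents - prev_nonrespondents)

# Illustrative (hypothetical) values: if persons with disabilities respond
# less often, prevalence among nonrespondents exceeds that among respondents.
bias = nonresponse_bias(prev_respondents=0.12,
                        prev_nonrespondents=0.20,
                        nonresponse_rate=0.25)
print(f"bias = {bias:+.3f}")  # negative: the survey understates prevalence
```

Under these assumed values the respondent-based estimate understates prevalence by two percentage points; the point of the sketch is only that the bias grows with both the nonresponse rate and the respondent–nonrespondent gap, neither of which is currently well measured for disability estimates.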
In contrast to the sparse evidence on errors of nonobservation, some empirical data indicate that survey-based estimates of persons with disabilities may be plagued by problems of measurement error. The empirical evidence suggests that factors as diverse as the mode of data collection, the sponsorship of the survey, the nature of the respondent (i.e., whether the individual reports for him- or herself or is reported for by someone else in the household), the specific question wording, and questionnaire context, as well as the order of the questions, may affect the estimates of the population and the stability of those estimates (see Mathiowetz, 2000). However, to date, little empirical research has isolated the effects of specific design features. For example, questions administered as part of the National Survey of Health and Activities (NSHA) may yield very different estimates of the population when administered using a different mode of data collection or when administered as part of a different study (which may reflect a change in context, question order, and sponsorship).
This lack of research with respect to the impact of either errors of nonobservation or errors of observation is a critical consideration in thinking about alternative design features for the measurement of persons with work disabilities. For example, let us assume that SSA decides to conduct a NSHA-type survey every k years, with monitoring of the pool of eligible applicants based on a subset of the NSHA questions in the form of a topical module administered as part of the Survey of Income and Program Participation (SIPP) in the intervening years. Most likely, the two surveys—that is, the NSHA-type survey and the SIPP—will vary on a number of essential survey design features, including survey context, mode of data collection, and survey sponsorship, all of which could impact the level of reported work disabilities. The two surveys may also differ with respect to coverage of the U.S. population. Differences in the rate of nonresponse, or even the mix of the types of nonresponse (i.e., refusals versus noncontact), could lead to differential nonresponse error across the two surveys. Without a systematic program of research that addresses the relative effects of differential nonresponse as well as the effects of various design features on levels of measurement error, SSA will be without empirically based information with which to determine whether year-to-year variations in estimates are attributable to true change or to differences in the design of the two studies. Considerations of alternative survey designs for the measurement of persons with disabilities cannot ignore the potential impact of either errors of observation or errors of nonobservation on estimates.
ONGOING DEVELOPMENTS IN DISABILITY MEASUREMENT
Because notions of disability and models of influences on disability are constantly changing, any ongoing system to monitor the phenomena must be able to adapt and change over time. This can be accomplished only with ongoing monitoring of the scientific endeavors in the field and investment in new methods of measurement. Much of the present research in the measurement of disability in surveys is focused on developing question items that map conceptually to the International Classification of Functioning, Disability, and Health (ICF, formerly ICIDH and ICIDH-2) (World Health Organization, 2001).
The ICF model advocates the measurement of disability on a continuum as opposed to the binary categories of disabled and nondisabled that have predominated in the survey measurement of disability. In addition, the model depicts disability as an interaction between a person, his or her health condition, and the environment in which the person lives, an integration of medical and social models of disability (biopsychosocial model). The nature of the physical and social environment can either limit or facilitate various levels of activity and participation by an
individual. As such, the framework differs from theoretical models that depict disability as a process beginning with impairment and ending with social role or behavioral restrictions or models that focus on disability as merely a functional limitation, that is, the restriction in physical functional activity and task activity associated with the impairment (Altman, 2001). In addition, the use of neutral terminology (as opposed to negative terminology such as handicap or disability) is emphasized in the ICF framework.
The ICF focuses on nine domains: (1) learning and applying knowledge, (2) general tasks and demands, (3) communication, (4) mobility, (5) self-care, (6) domestic life, (7) interpersonal interactions and relations, (8) major life areas, and (9) community, social, and civic life. Within these nine domains, both capacity (functioning independent of the environment) and performance (which depends on the environment) are to be measured. For example, an individual may have a latent allergy (body function) that manifests itself only when the person is exposed to the allergy agent, and thus may or may not affect performance. Performance includes both execution of actions by an individual (activity) and involvement in life situations such as work (participation).
The release of the ICF in the spring of 2001 has resulted in a number of research activities related to the design of questionnaires that can be mapped to the ICF framework. Much of this research focuses on question wording to measure activity (and the use of assistance in the performance of activities) and participation (both extent of participation and satisfaction with participation). There is a great deal of research interest related to the development of a single reliable and valid question that could be proffered for use in censuses internationally. In addition, questionnaire design research has focused on the construction of both short- and long-form questionnaires with known measurement error properties. The movement from dichotomous response options to continuous response classification has led to questions as to the impact of cutpoint decisions on estimates of the “disabled” population as well as their impact on the distribution of the characteristics of the population.
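The sensitivity of prevalence estimates to cutpoint decisions can be made concrete with a small simulation. Everything below is hypothetical: the continuous "severity" scores are drawn from an arbitrary distribution, and the cutpoints are invented for illustration rather than taken from any instrument.

```python
import random

random.seed(1)

# Hypothetical continuous severity scores for a simulated population.
# A real instrument would produce scale scores; this stand-in distribution
# is meant only to show how the choice of cutpoint moves the estimate.
severity = [random.gauss(0.0, 1.0) for _ in range(100_000)]

def prevalence(scores, cutpoint):
    """Share of the population classified as 'disabled' at a given cutpoint."""
    return sum(s >= cutpoint for s in scores) / len(scores)

for cut in (1.0, 1.2, 1.5):
    print(f"cutpoint {cut:.1f}: prevalence = {prevalence(severity, cut):.3f}")
```

Even modest shifts in the cutpoint change the estimated size of the classified population substantially, and, because severity correlates with other characteristics, the composition of that population shifts as well, which is precisely the concern raised above.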
SOCIAL SECURITY ADMINISTRATION’S DISABILITY SURVEYS: HISTORICAL PERSPECTIVE
A key component of good fiscal management for the Social Security Administration is having sufficient information to understand and predict growth in the pool of persons eligible for disability benefits as well as understand the factors that impact the application process, including motivation to apply for benefits. Medical models of disability have historically been insufficient to explain unexpected growth in the size of the
applicant pool. Factors extrinsic to the benefits programs, for example, cyclical changes in the economy, as well as social and cultural issues, have in the past resulted in changes in acceptance rates and unexpected increases in program expenditures. One means by which to understand the magnitude and characteristics of the pool of eligibles, as well as the intrinsic and extrinsic factors that lead to applying for benefits, is to develop an ongoing surveillance system related to work disability.
The idea of a measurement system related to understanding the incidence and prevalence of those eligible for benefits and those applying for benefits is not new. Between 1966 and 1978, the SSA sponsored a number of data collection efforts designed to measure the prevalence of persons with work disabilities. The first of these surveys was the 1966 Social Security Survey of Disabled Adults. The survey consisted of an area probability sample drawn from seven different frames, including frames of Social Security disability beneficiaries, denied applicants for disability benefits, and disabled recipients of public assistance (Haber, 1973). The survey used a two-stage design; the first stage was a screening interview of 30,000 households to identify adults ages 18 to 64 with limitations in their ability to work. Personal interviews were then conducted with those identified in the screening, approximately 8,300 individuals. Individuals were classified according to the extent of their self-reported capacity for work: severely disabled (unable to work altogether or unable to work regularly); occupationally disabled (able to work regularly, but unable to do the same work as before the onset of disability or unable to work full time); and secondary work limitation (able to work full time, regularly, and at the same work, but with limitations on the kind or amount of work performed). The study provided information on the prevalence of persons with work disabilities among those ages 18 to 64, as well as information about access to and utilization of health services, income, and demographic characteristics.
The second major survey of disability was the 1972 Survey of Disabled and Nondisabled Adults. The survey included both disabled and nondisabled adults of working age and focused on revisions of the estimates of prevalence as well as factors associated with the development and duration of disability (Bye and Schechter, 1982). Those interviewed in 1972 were reinterviewed two years later to examine changes in both disability status and economic status, as well as the relationship between changes in disability and economic status and entitlement under both the Social Security Disability Insurance (SSDI) and the Supplemental Security Income (SSI) programs.
The large growth in the number of SSDI beneficiaries between 1966 and 1975 was the impetus for the 1978 Survey of Disability and Work. As noted in the documentation for the survey (Bye and Schechter, 1982, p. 2):
The rate of growth of the DI (disability insurance) program gives rise to the question of what accounts for the increase. Investigations of changes in labor market conditions and changes in the application of program eligibility criteria address this problem at a macrolevel. This survey allows for a complementary investigation of the reasons for growth with the individual as the unit of analysis. The policy focus of the survey is on the decision to apply for disability benefits.
Between 1969 and 1979, the SSA sponsored the Retirement History Survey, a longitudinal survey designed to understand the conditions under which persons decided to take Social Security benefits before reaching age 65. Although not a disability survey per se, detailed information concerning health and work limitations was collected during the six waves of data collection, and early analysis of the data indicated the importance of health problems as precursors of early retirement (Bixby, 1976).
In addition to surveys sponsored by the Social Security Administration, scholars of the disability process and the disability application process have relied on longitudinal data collection efforts such as the National Longitudinal Study (NLS) and the Health and Retirement Survey (HRS) to understand factors associated with early withdrawal from the labor force. As noted by Sheppard (1977, pp. 163–164):
The NLS project, and the type of analysis it makes possible, has a value not associated with the usual cross-sectional project in that it provides an opportunity to make predictions regarding subsequent work or life status. It is also important to make the point that, despite the criticisms that have been made regarding the utility of self-reported health status, the individual’s own judgment of his or her health status or work capacity at one point in time is a useful and reliable predictor of subsequent labor force or life status (emphasis in original).
Despite the richness of the data resulting from these various survey efforts, the survey efforts do not permit the analyst to understand fully both the individual factors and the environmental factors that result in a person’s shifting from the status of potential applicant to actual applicant. Although previous research has permitted examining macrolevel relationships between economic changes and changes in the size of the applicant pool (e.g., Yelin et al., 1980; Bound and Waidmann, 2000), understanding the contributions of both individual and environmental characteristics to microlevel decisions to apply requires data that are longitudinal and capture information related to an individual’s decision process.
ONGOING MEASUREMENT OF PERSONS WITH WORK DISABILITIES: THE DEVELOPMENT OF A WORK DISABILITY MEASUREMENT SYSTEM
Developing a disability measurement system is not dissimilar to the design of a complex survey consisting of multiple components (e.g., the National Health and Nutrition Examination Survey [NHANES], which includes a household-based survey and a medical examination coupled with laboratory testing); for each, one should begin with a clear statement of the objectives of the system and a description of the measures of interest. Once the objectives are established, system designers can focus on the necessary components and operation of the system to meet those objectives (e.g., data sources and frequency of collection). Questions concerning the population to be studied, the frequency and period of data collection, the information to be collected, determination of the provider or providers of information (e.g., household respondents, abstracts from administrative records), and decisions concerning analysis, frequency of reporting, and dissemination of information should be addressed in the design of the system.
The utility of a measurement system is a function of the extent to which the data are used to make decisions, set policy, or implement changes. An assessment of a system’s utility should be evaluated in light of the objectives of that system; for example, to what extent does the system permit the detection of changes in the rate of application for disability benefits?
In addition to assessing the utility of a measurement system, other attributes of a well-designed system include its simplicity (in both structure and operation), flexibility, and sensitivity and specificity (to accurately detect cases and to distinguish true positives from false positives); the representativeness of the population being studied; and the predictive value of the system.
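The case-detection attributes named above have standard operational definitions. The sketch below computes them from a confusion matrix of screener classifications against true eligibility status; the counts are invented solely to show the arithmetic, not drawn from any SSA data.

```python
# Standard case-detection metrics for a screening instrument, computed from
# a (hypothetical) confusion matrix of screener results versus true status.
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),                # true cases correctly flagged
        "specificity": tn / (tn + fp),                # non-cases correctly passed
        "positive_predictive_value": tp / (tp + fp),  # flagged cases that are real
    }

# Illustrative counts only: for a relatively rare condition, even a fairly
# accurate screener can yield a modest positive predictive value.
m = screening_metrics(tp=90, fp=180, fn=10, tn=9720)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, sensitivity is high (0.900) but only a third of flagged cases are true cases, which is why the predictive value of the system deserves evaluation alongside sensitivity and specificity when the condition of interest is rare in the household population.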
A critical element of a measurement system is to clearly define and identify a “case.” The disability definition for entitlement to benefits is the same for both the Title II Disability Insurance Program and the Title XVI Supplemental Security Income program, although other requirements differ. As noted earlier, disability is defined under the two programs as “inability to engage in substantial gainful activity because of any medically determinable physical or mental impairment lasting at least 12 months.” Of interest in a disability surveillance system is not simply measurement of prevalence and the socioeconomic conditions linked to disability, but also understanding both the individual and the environmental factors that lead to changes in the SSA benefits application process. The system will need measures of the prevalence of the eligible pool as well as
measures that predict application. Key to such a system will be sufficient data to understand macro- and microlevel factors that distinguish participating and nonparticipating eligibles. A work disability measurement system will have to include means for modeling the decision process from both the demand side (the individual) and the supply side (the Social Security Administration).
DESIGN OF A WORK DISABILITY MEASUREMENT SYSTEM
The design of a work disability measurement system must consider the analytic needs of the system and the impact of alternative design options on meeting those analytic goals as well as the impact of various sources of survey error (should the design include the use of household- or provider-based surveys). Among the design issues the system will have to address are the following:
data source or sources;
cross-sectional versus longitudinal design;
periodicity of data collection;
mode of data collection;
self versus proxy response status; and
specific wording of question, response option presentation, and overall context.
Data Source or Sources Among the various data sources that could be included, alone or in combination, in the design of a disability surveillance system are household-based survey data, provider-based survey data, administrative record data, and physical examination data. Among the options with respect to household- or provider-based survey data are stand-alone surveys that permit rich and deep national data on the size of the disabled population (e.g., similar to the NSHA, which is currently being sponsored by the SSA), survey modules administered as part of some preexisting data collection effort (e.g., a supplement to the Current Population Survey or the SIPP), or the incorporation of a limited number of questions on existing national surveys, for example, the National Health Interview Survey (NHIS) or the Behavioral Risk Factor Surveillance System. Each of these options with respect to household- or provider-based surveys has implications for the error properties of the resulting estimates, including coverage, sampling, nonresponse, and measurement error. In addition, consideration must be given to the costs associated with obtaining data from alternative sources. The use of administrative record data potentially suffers from similar sources of error, including
operational definitions of disability that are incongruent with those of the Social Security Administration.
Cross-Sectional Versus Longitudinal Design As noted above, a longitudinal design provides analytic capabilities that are not possible with repeated cross-sectional designs, especially with respect to the decision to apply for benefits, including both the individual factors that influence the decision and the impact of environmental and macrolevel (e.g., economic) changes on it. Longitudinal designs require that additional decisions be made concerning the length of the panel (i.e., the number of years individuals are followed), the frequency of data collection, and the means for following individuals who move.
Use of a panel survey design, with repeated measurements with the same individuals, facilitates more efficient estimation of change over time (compared to the use of multiple cross-sectional samples). However, panel designs may be subject to higher rates of nonresponse (cumulated across waves of the data collection) or panel conditioning bias, an effect in which respondents alter their reporting behavior as a result of exposure to a set of questions during an earlier interview.
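The efficiency gain from re-interviewing the same individuals follows directly from the variance of a difference of means: positive within-person correlation across waves shrinks the variance of the change estimate relative to independent cross-sections. A sketch with assumed, illustrative values:

```python
def var_change(sigma1: float, sigma2: float, n: int, rho: float = 0.0) -> float:
    """Variance of (mean at wave 2 - mean at wave 1) for samples of size n.

    rho is the within-person correlation across waves: 0 for independent
    cross-sections, positive for a panel re-interviewing the same people.
    """
    return (sigma1**2 + sigma2**2 - 2 * rho * sigma1 * sigma2) / n

# Illustrative values (assumed, not drawn from any survey):
cross_section = var_change(sigma1=1.0, sigma2=1.0, n=1000, rho=0.0)
panel = var_change(sigma1=1.0, sigma2=1.0, n=1000, rho=0.7)
print(f"cross-sectional design: {cross_section:.5f}")
print(f"panel design:           {panel:.5f}")  # smaller variance for the panel
```

Under an assumed wave-to-wave correlation of 0.7, the change estimate from the panel has less than a third the variance of the cross-sectional comparison, which is the efficiency argument for panels; the countervailing costs are the cumulative nonresponse and conditioning effects noted above.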
Periodicity If survey data are collected, how often should data collection occur? What are the ramifications of more frequent or less frequent data collection for the utility of the data? How is periodicity affected if one decides to utilize a longitudinal design? For repeated cross-sectional data collection?
Mode of Data Collection For survey data collection, a decision as to the mode or modes of data collection will have to be addressed. Little is known about the effect of mode of data collection on the measurement error properties of self-reports of disability and impairments. Selection of the mode or modes of data collection involves a complex decision concerning costs, response rate objectives, and measurement error. With respect to costs, face-to-face data collection is significantly more expensive than other modes. It is for this reason that several federal surveys involving panel designs of households have moved toward mixed modes, with the initial interview conducted face to face and subsequent interviews conducted either by telephone or face to face. Face-to-face data collection is often considered the preeminent mode for data collection, due in part to the opportunity to gather data that are not feasible via other modes (e.g., physical measurements, interviewer observation) and in part to the perception that face-to-face data collection continues to achieve higher rates of response than other modes. However, one must consider that mode comparisons of
response rates are confounded by survey sponsorship, with the federal government in the United States conducting or sponsoring most of the face-to-face data collection.
The one consistent finding with respect to the effect of mode of data collection is that, to the degree the information is considered sensitive or socially undesirable, one is more likely to collect accurate data via self-administered modes of data collection. For example, although the National Household Survey of Drug Abuse is conducted as a face-to-face interview, questions concerning illicit drug use and other sensitive behaviors are reported via self-administration. The choice of multiple modes of data collection may be desirable from the perspective of reducing coverage bias (e.g., dual-frame sampling designs) or improving response; however, such a design decision to reduce errors of nonobservation may come at the expense of an increase in measurement error.
Self and Proxy Response Status Should only self-response be accepted for household surveys of disability? If so, what are the ramifications for nonresponse bias? If proxy responses are accepted, what impact does this design choice have on the measurement error properties of the reporting of disability?
The use of proxy reporters—that is, asking individuals within a sampled household to provide information about other members of the household—is another design decision that is often framed as a trade-off among costs, sampling errors, and nonsampling errors. The use of proxy informants to collect information about all members of a household can increase the sample size (and hence reduce the sampling error) at a lower marginal data collection cost than increasing the number of households. The use of proxy respondents also facilitates the provision of information for those who would otherwise be lost to nonresponse because of an unwillingness or inability to participate in the survey interview. However, the cost associated with the use of proxy reporting may be an increase in the rate of errors of observation associated with poorer-quality reporting for others compared with the quality that would have been obtained under a rule of self-response.
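The trade-off described above can be framed as a mean squared error comparison: proxy reporting buys a larger effective sample (lower sampling variance) at the price of added reporting bias. All numbers below are assumptions chosen for illustration, not estimates from any study.

```python
def mse(variance: float, bias: float) -> float:
    """Mean squared error of an estimator: sampling variance plus squared bias."""
    return variance + bias**2

# Assumed prevalence p = 0.15; variance of a proportion is p(1-p)/n.
p = 0.15

# Self-response only: smaller person-level sample, no proxy reporting bias.
mse_self = mse(variance=p * (1 - p) / 2_000, bias=0.0)

# Proxy rule: (by assumption) doubles the person-level sample at the same
# field cost, but adds a small overreporting bias to the estimate.
mse_proxy = mse(variance=p * (1 - p) / 4_000, bias=0.01)

print(f"self-response MSE: {mse_self:.6f}")
print(f"proxy-rule MSE:    {mse_proxy:.6f}")
```

With these particular assumed values the squared bias term dominates, so the proxy rule yields the larger total error despite its variance advantage; the general point is that even a small reporting bias can swamp the sampling-error gains, which is why the self/proxy decision cannot be made on cost grounds alone.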
The limited literature comparing self- and proxy reports in the measurement of disability has focused on the reporting of activities of daily living (Mathiowetz and Lair, 1994; Rodgers and Miller, 1997). Persons for whom data are obtained by proxy are often classified as having more functional limitations than those for whom the data are obtained by self-response; research is inconclusive as to whether this discrepancy is a function of overreporting on the part of proxy informants, underreporting on the part of self-respondents, or both.
Specific Wording of Question, Response Option Presentation, and Overall Context Although there is empirical evidence to indicate that estimates of the population differ as a result of different question wording, the presentation of response alternatives, the order of questions, and the overall context of the questionnaire, little is known concerning the measurement error properties of alternative approaches, nor is there empirical literature that addresses the marginal effects of various question design features.
As is evident from the design choices discussed above, each choice impacts the error structure of the estimates of persons with disabilities and the analytic questions that can be addressed with the resulting data. Also evident is the lack of information with respect to the specific impacts of design choices on the reporting of impairments and disabilities; this point was one of several made in the Workshop on Survey Measurement of Work Disability (Mathiowetz and Wunderlich, 2000).
One could consider a number of permutations of the options outlined above in designing a work disability measurement system; these options could be arrayed along lines of richness of the data, quality of the data, and costs. For example, consider a system with the following attributes:
continuous, longitudinal multimode household-based data collection (so as to facilitate participation among those who are unable or unwilling to answer via a single mode);
medical examination for those meeting a particular threshold based on the household data and a subset of those who are classified in the category adjacent to the threshold; and
links to administrative records.
Such a design would facilitate the analysis of change over time in the size of the pool of eligibles and applicants, support investigation of the individual and environmental factors that influence application for benefits, and permit simulation of the impact of alternative decision processes, provided the household survey, medical examination, and administrative records contained the information necessary for such modeling. In contrast, one could consider a design characterized by a small number of disability questions included as part of repeated cross-sectional surveys. Such a design would limit analysts to monitoring the size of the pool of eligibles and possibly, if cross-walk analytic capabilities had been developed, the size of the pool of applicants. However, such a design does not support an understanding of how individual, environmental, and macrolevel changes affect the application process; with such a design one can observe a correlation between macrolevel changes and changes in the size of the applicant pool but cannot understand the relationship between the two at the individual level. Between these two extremes lie a large number of possible variations.
Underlying the hypothetical continuum of design options is a second continuum related to the costs of alternative design combinations; the analytic capabilities associated with the richest design come at the cost of higher expenditures for data collection. Regardless of the choices made with respect to mode, frequency, cross-sectional versus longitudinal design, and other design features, further research to understand the error properties associated with alternative design features is necessary to more fully inform the decision process with respect to the cost–error trade-offs.
The choice of data collection design options is not unique to the Social Security Administration. Other federal agencies responsible for constructing a social indicator series or providing data for public policy or funds management have struggled with similar dilemmas. For example, the Agency for Healthcare Research and Quality (AHRQ, formerly the Agency for Health Care Policy and Research) faced a similar design issue with respect to the provision of information concerning health care utilization, expenditures, and health insurance coverage. During the 1970s and 1980s the agency relied on periodic household data collection efforts, supplemented with provider records and administrative data, as the basis for producing estimates (e.g., the 1977 National Medical Care Expenditure Survey, the 1986 National Medical Care Utilization and Expenditure Survey). These detailed data, lending themselves to years of alternative analyses, formed the basis of long-range policy guidance. However rich these data were, such a design did not facilitate research related to understanding shifts in health care utilization or expenditure patterns between collections. In recent years, the design has shifted toward continuous data collection, with a longitudinal panel (the Medical Expenditure Panel Survey; see www.ahrq.gov/data/mepsinfo). The shift toward a longitudinal data collection effort with an ongoing (continuous) rotating panel design both increased the analytic capabilities of the data and reduced the gaps in data needed for public policy.
Partnerships with Other Federal Agencies
As noted above, one of the choices for SSA is whether to sponsor its own ongoing surveys or to enter into partnership with other federal agencies to obtain a small set of measures.
In short, what are the administrative, financial, and technical staffing burdens of mounting an ongoing survey, and what is the scope of informational needs? If there are many features of the population that are not now being well described, then a separate SSA survey may easily be justified as a small fraction of the funds allocated to fulfill its mission.
However, if a smaller set of measures would sufficiently measure the population of interest (both the pool of eligibles and the pool of applicants), as well as address the other analytic goals of SSA, then partnership with another federal agency or agencies may be a cost-effective option within the work disability measurement system.
The candidate surveys for ongoing monitoring include the American Community Survey, the American Housing Survey, the Behavioral Risk Factor Surveillance System, the Current Population Survey, the Medical Expenditure Panel Survey, the National Crime Victimization Survey, the National Health Interview Survey, the National Health and Nutrition Examination Survey, the National Household Survey of Drug Abuse, and the Survey of Income and Program Participation. Three criteria were used for selection of the candidate surveys discussed here: (1) each represents an ongoing federal data collection effort; (2) the sample size is sufficient, on an annual basis, to support SSA data requirements; and (3) the survey instrument currently includes or is planning to include measures of disability as part of the questionnaire. Some candidate surveys did not meet all three criteria but were included for consideration due to some unique design feature of the study. For example, the annual samples for the National Health and Nutrition Examination Survey and the Medical Expenditure Panel Survey (MEPS) are relatively small as compared to some other surveys (n = 5,000 and n = 15,000 persons annually, respectively); however, each of their designs benefits from a complementary component. In the case of the NHANES, the design includes a medical examination. In the case of the MEPS, the design includes data from medical care providers and providers of health insurance. Similarly, the National Household Survey of Drug Abuse does not presently include any measures of functional limitation or disability; however, the design includes both an interviewer-administered questionnaire and a self-administered set of questions that may be beneficial in the assessment of disability.
The relevant questions to be addressed in choosing a partner survey include the following:
How large a sample is interviewed each year? What standard errors are likely to be obtained for key disability prevalence statistics?
Will the addition of disability measures to the interview be consistent with the measurement goals of the original survey? Are there possibilities of context effects that could damage the accuracy of prevalence estimates?
Are there existing measures in the survey that might be used as explanatory variables for disability status indicators? Can the survey offer SSA other informational benefits beyond being a vehicle to produce disability prevalence statistics?
Is the survey of high quality? What evidence is there about coverage, nonresponse, and measurement error properties of key statistics?
How frequently can estimates be updated? Will monthly prevalence estimates be generated, annual estimates, etc.?
Is the mode of administration of the survey compatible with the measures chosen from NSHA?
What restrictions, if any, will SSA staff have on access to microdata from the surveys? Can SSA analysts use the data for other analyses of importance to SSA or will they be given only statistics produced from the survey data?
Will the mission of the sponsoring agency be aided by a partnership with SSA in measuring disability status? With the obligation of many federal household surveys to provide indicators of disability, can SSA expertise in work disability be viewed as a desirable complement to the sponsor’s staff skills?
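The sample size and standard error questions above can be made concrete with a standard variance approximation for survey proportions. The sketch below is illustrative only: the assumed prevalence, design effect, and annual sample sizes are hypothetical, not figures from any of the candidate surveys.

```python
import math

def prevalence_se(p, n, deff=1.0):
    """Approximate standard error of an estimated prevalence p from a
    sample of size n, inflated by a design effect deff (deff > 1
    reflects clustering and weighting in complex survey designs)."""
    return math.sqrt(deff * p * (1 - p) / n)

# Hypothetical scenario: a 10 percent disability prevalence and an
# assumed design effect of 1.5, measured at several annual sample sizes.
for n in (5_000, 15_000, 60_000, 100_000):
    half_width = 1.96 * prevalence_se(0.10, n, deff=1.5)
    print(f"n = {n:>7,}: 95% CI half-width = {half_width * 100:.2f} percentage points")
```

Roughly, quadrupling the sample size halves the standard error, which is one reason pooling across panels or years can be an attractive alternative to fielding a much larger single sample.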
A partnership between two or more federal agencies may benefit all parties involved. For example, collaborative efforts could help build consensus concerning the measurement of disability in federal surveys. Additional funds from SSA could also support increases in sample size and further questionnaire development and refinement, and could expand the analytic utility of any one data collection effort.
The ideal partner survey would have a sufficiently large sample to provide SSA with prevalence estimates stable enough to protect policymakers from erroneous impressions. It would have very low coverage and nonresponse errors. It would be conducted frequently, giving SSA the ability to model seasonal effects in the size of the pool and to estimate the impact of economic shocks. It would contain other measures of utility to SSA in addressing other important management questions: Is disability prevalence changing in the same way over time across all demographic subgroups? What are the major health and demographic correlates of disability status?
The chief threat to the feasibility of this partnering option for ongoing monitoring is that most federal household surveys already use long and complex instruments, filled with measures of great value to existing constituencies. Seeking to add measures to these instruments faces zero-sum conflicts with the sponsors' existing obligations. The single most important sign for optimism is that several of the surveys face mandates to begin measurement of disability status in order to learn how the disabled subpopulation differs from others on the key topics covered by the surveys.
The discussion that follows outlines the characteristics of several large, ongoing federal data collection efforts, some of which do include measurement of impairments, functional limitations, and work limitations and disability. Each of these potential partner surveys has strengths and weaknesses that have to be assessed in light of the questions enumerated above.
Agency for Healthcare Research and Quality
Medical Expenditure Panel Survey The household component of the Medical Expenditure Panel Survey is designed as a continuous, overlapping panel design, in which members of each panel are interviewed for a two-year period concerning health care use, expenditures, sources of payment, and insurance coverage. Approximately 6,000 households are selected from those responding in the prior year to the National Health Interview Survey; household members are then interviewed five times over a 24-month field period, yielding information on approximately 15,000 persons for each panel. To produce estimates for any one particular calendar year, the data can be pooled across two distinct nationally representative samples, yielding an effective sample size of approximately 30,000 persons annually. The MEPS sample design oversamples those with family income less than 200 percent of the poverty level, working-age adults predicted to have high health care expenditures (based on information obtained in the NHIS interview), and adults 18 years of age and older classified as having a functional limitation, measured in terms of activities of daily living (ADLs) and instrumental activities of daily living (IADLs). In addition to the household panel survey, the MEPS design includes a survey of medical providers identified by MEPS respondents; data are collected from these medical providers to verify and supplement information provided by the household respondents. A second supplemental data collection involves contacting employers and other providers of health insurance identified by the household respondents so as to collect information on insurance characteristics that household respondents cannot usually provide.
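The overlapping-panel structure described above can be sketched schematically. The helper below is a toy illustration of how a two-year panel design covers any given calendar year with two panels in the field at once; the panel start years and counts are hypothetical, not actual MEPS panel numbers.

```python
def panels_covering(year, first_panel_year, n_panels, years_per_panel=2):
    """Return the (hypothetical) panel numbers whose field period
    includes the given calendar year. Panel p begins in
    first_panel_year + (p - 1) and runs for years_per_panel years."""
    return [
        p for p in range(1, n_panels + 1)
        if first_panel_year + (p - 1) <= year < first_panel_year + (p - 1) + years_per_panel
    ]

# With roughly 15,000 persons per panel, a calendar year covered by two
# overlapping panels yields a pooled sample of roughly 30,000 persons.
print(panels_covering(1999, first_panel_year=1996, n_panels=6))  # panels in the field in 1999
```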
Bureau of the Census
The U.S. Bureau of the Census conducts two surveys of interest, the American Community Survey and the Survey of Income and Program Participation.
American Community Survey The American Community Survey (ACS) is a new initiative of the Bureau of the Census, designed to eventually replace the long-form decennial census. The design of the survey closely resembles the decennial census, with self-administration of mail-delivered questionnaires. The sample consists of a rolling sample of addresses, with approximately 3 million households sampled annually. At present, the questions on disability replicate those included in the long form of the year 2000 decennial census. Drawing on the Canadian experience in conducting the Health and Activity Limitation Surveys (HALS), the ACS could be used as a first-stage screening instrument for the identification of individuals likely to be impaired or disabled; follow-up, in-depth interviews could be targeted at those individuals identified via the screening questions in the ACS, as well as at a subsample of those not identified as disabled, so as to capture false negatives via the longer instrument.
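The two-phase screening design suggested above has a simple estimator: count confirmed cases among those who screen positive, and use a follow-up subsample of screened negatives to estimate the false-negative rate. The function below is a minimal sketch with made-up counts; it ignores survey weighting and variance estimation, both of which a real design would require.

```python
def screened_prevalence(n_screened, n_screen_pos, n_pos_confirmed,
                        n_neg_subsample, n_neg_confirmed):
    """Two-phase prevalence estimate: everyone receives the short
    screener; the in-depth follow-up goes to all screened positives
    and to a subsample of screened negatives (to recover false
    negatives missed by the screener)."""
    n_screen_neg = n_screened - n_screen_pos
    ppv = n_pos_confirmed / n_screen_pos      # confirmation rate among positives
    fnr = n_neg_confirmed / n_neg_subsample   # estimated false-negative rate
    return (n_screen_pos * ppv + n_screen_neg * fnr) / n_screened

# Made-up example: 100,000 screened, 12,000 screen positive, 9,000 of
# them confirmed in follow-up; of 5,000 screened negatives reinterviewed,
# 150 are confirmed as disabled.
print(round(screened_prevalence(100_000, 12_000, 9_000, 5_000, 150), 4))  # → 0.1164
```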
Survey of Income and Program Participation The Survey of Income and Program Participation (SIPP) is a multipanel longitudinal survey of adults that measures their economic and demographic characteristics. Participants are interviewed once every 4 months; the duration of each panel ranges from 2.5 to 4 years. The SIPP questionnaire includes a core set of questions administered every wave and a set of topical modules, which are administered periodically. One of the topical modules that has been administered in previous panels concerns disability and functional limitations. The redesigned topical module administered in 1997 and 1999 covered a broad range of questions concerning disability and functional limitations, including sensory limitations, use of mobility aids, ADLs, IADLs, and upper- and lower-body functional limitations.
Bureau of Justice Statistics
National Crime Victimization Survey As a result of Public Law 105-301, the Bureau of Justice Statistics (BJS) is required to produce victimization rates by developmental disability status beginning in the year 2003. To meet this requirement, BJS has begun to develop and test a 20-question module dealing with health conditions, impairments, and disabilities, covering a broad range of disabilities, not just developmental disabilities. The questions have undergone testing in a cognitive laboratory and will be field-tested this spring among a population of persons with developmental disabilities in California. These questions would be added to the National Crime Victimization Survey (NCVS), a rotating panel design survey in which participants are interviewed every 6 months over a 3.5-year period. Similar to the design of the Current Population Survey, the sample unit for the NCVS is the housing unit; participants who move during the life of
the panel are not followed. Approximately 50,000 households are interviewed every six months, with information collected on approximately 100,000 persons ages 12 and older annually.
Bureau of Labor Statistics
Current Population Survey The Current Population Survey (CPS) is a rotating panel design in which households are interviewed monthly for four months, not interviewed for eight months, and then interviewed monthly for an additional four months. The questionnaire consists of a core set of questions concerning labor force participation and, depending on the month of interview, a periodic or topical module. For example, detailed information concerning sources of income is collected for all participants who are interviewed during the month of March. The current CPS questionnaire obtains information concerning disability only when the respondent volunteers that he or she is disabled in response to the question concerning whether he or she worked last week for pay. In addition, respondents who are currently not employed are asked whether they have a disability that prevents them from accepting any kind of work during the next six months. Data are collected from approximately 60,000 households (on approximately 94,000 persons ages 16 and older) every month.
In response to Executive Order 13078, which requires the Bureau of Labor Statistics in conjunction with other federal agencies to produce accurate and reliable employment rate data for people with disabilities, the Bureau of Labor Statistics is evaluating a set of questions for possible inclusion in the CPS. About 20 questions were tested in cognitive labs and are currently being field-tested in the National Comorbidity Survey.
Centers for Disease Control and Prevention/National Center for Health Statistics
Behavioral Risk Factor Surveillance System The Behavioral Risk Factor Surveillance System (BRFSS) is a state-based surveillance system active in all 50 states and the District of Columbia. Data are collected by telephone from adults ages 18 and older on a monthly basis by individual states. Sample sizes vary by state and year but must be large enough to permit state-level estimation for measures included in the core module. The BRFSS has three components: (1) a core questionnaire used in all states; (2) standardized modules chosen for inclusion by individual states; and (3) questions developed by each state. Beginning in the year 2000, the core module included the same two disability questions used in the National Health Interview Survey. One of the standardized supplemental modules
(“Quality of Life”) includes six questions on functional limitations and impairments. Disability measures can also be found in several additional standardized modules.
National Health Interview Survey The National Health Interview Survey is a cross-sectional survey conducted throughout the calendar year (nationally representative replicate samples are introduced every two weeks) that collects information about the amount and distribution of illness in terms of limited activities, chronic impairments, and health care services received by persons of all ages. All persons in the household are asked two questions concerning disability: (1) Are you limited in any way in any activities because of physical, mental, or emotional problems? (2) Do you now have any health problem that requires you to use special equipment, such as a cane, a wheelchair, a special bed, or a special telephone? Sampled adults (one per household) are asked a series of questions concerning functional limitations and the degree of difficulty associated with going out, participating in social activities, and participating in leisure activities in the home. Over the course of a year, data are collected on approximately 98,000 persons (core questionnaire), with additional information obtained from approximately 32,000 sampled adults and 14,000 sampled children.
National Health and Nutrition Examination Survey The redesigned National Health and Nutrition Examination Survey collects information on health and nutritional status of adults and children in the United States through household-based interviews as well as physical examinations. Although the NHANES was a periodic survey that began in the 1960s, in 1998 the study was redesigned so as to provide continuous monitoring of the population. The annual survey consists of interviews with approximately 5,000 persons per year. The household questionnaire includes questions concerning limitations in activities for children and for adults—limitations related to work, mobility, cognition, and functional activities. A medical examination also provides information on physical limitations as well as assessment of mental health and cognitive function.
Housing and Urban Development
American Housing Survey The American Housing Survey (AHS) consists of a national biennial sample and a rolling annual metropolitan sample conducted by the Census Bureau for the Department of Housing and Urban Development. National data are collected every other year from a fixed sample of housing units supplemented by a new construction sample. The national sample consists of approximately 55,000 housing units. In addition to the national sample, a metropolitan sample for each of 46 selected metropolitan areas is collected about every four years, with an average of 12 metropolitan areas included each year. Each metropolitan area sample covers approximately 4,800 or more housing units. The disability questions focus on mobility within the housing unit, limitations in activities of daily living, and sensory impairments.
Substance Abuse and Mental Health Services Administration
National Household Survey of Drug Abuse The National Household Survey of Drug Abuse (NHSDA) is an annual survey of approximately 67,000 persons concerning drug and alcohol use. The survey consists of an interviewer-administered as well as a self-administered section using computer-assisted interviewing techniques. The sample design consists of state-level cross-sectional samples, thereby facilitating state-level estimation. The current questionnaire does not include measures of functional limitations or disability.
Considering alternative design options for the ongoing measurement of persons with work disabilities requires careful attention to alternative sources of error, the impact of those sources on the estimates of interest, the analytic objectives of the data collection effort, and costs. As is evident from the preceding discussion, the empirical literature is, to a large extent, silent with respect to the impact of various sources of error on estimates of persons with work disabilities. Alternative designs will vary in the richness of the analytic capabilities of the resulting data; such capabilities will have to be balanced against issues of respondent burden and costs.
One issue is clear with respect to the design of an ongoing data collection effort to monitor the size of the applicant pool for SSA benefits. The lack of empirical data to inform the design at the present time emphasizes the need for SSA to undertake ongoing research as an integral part of the design of any data collection effort. An ongoing methodological research program, coupled with whatever design is implemented, will provide assessments of the error properties of the current design and inform future design decisions. The agenda for research in survey measurement outlined in the Institute of Medicine Workshop on the Measurement of Persons with Work Disabilities may provide a starting point for such a research effort.
References
Altman B. 2001. Definitions of disability and their operationalization and measurement in survey data: An update. Research in Social Science and Disability 2:77–100.
Bixby L. 1976. Retirement patterns in the United States: Research and policy interaction. Social Security Bulletin 3–19.
Bound J, Waidmann T. 2000. Accounting for Recent Declines in Employment Rates Among the Working-Age Disabled. Ann Arbor, MI: Population Studies Center, University of Michigan.
Bye B, Schechter E. 1982. 1978 Survey of Disability and Work. SSA Publication No. 13-11745. Washington, DC: U.S. Department of Health and Human Services.
Haber L. 1973. Social planning for disability. Journal of Human Resources (February):33–55.
Jette A, Badley E. 2000. Conceptual issues in the measurement of work disability. In: Mathiowetz N, Wunderlich GS, eds. Survey Measurement of Work Disability: Summary of a Workshop. Washington, DC: National Academy Press.
Mathiowetz N. 2000. Methodological issues in the measurement of work disability. In: Mathiowetz N, Wunderlich GS, eds. Survey Measurement of Work Disability: Summary of a Workshop. Washington, DC: National Academy Press.
Mathiowetz N, Lair T. 1994. Getting better? Change or error in the measurement of functional limitations. Journal of Economic and Social Measurement 20:237–262.
Mathiowetz N, Wunderlich G, eds. 2000. Survey Measurement of Work Disability: Summary of a Workshop. Washington, DC: National Academy Press.
McNeil J. 1993. Census Bureau Data on Persons with Disabilities: New Results and Old Questions about Validity and Reliability. Paper presented at the 1993 Annual Meeting of the Society for Disability Studies, Seattle, Washington, 1993.
Nagi S. 1991. Disability concepts revisited: Implications for prevention. In: Pope A, Tarlov A, eds. Disability in America: Toward a National Agenda for Prevention. Washington, DC: National Academy Press.
Rodgers W, Miller B. 1997. A comparative analysis of ADL questions in surveys of older people. Journal of Gerontology 52B:21–36.
Sheppard H. 1977. Factors associated with early withdrawal from the labor force. In: Wolbein SL, ed. Men in the Pre-Retirement Years. Philadelphia: Temple University. Pp. 163–215.
World Health Organization. 2001. International Classification of Functioning, Disability, and Health. Geneva, Switzerland: World Health Organization.
Yelin EH, Nevitt MC, Epstein WV. 1980. Toward an epidemiology of work disability. Milbank Memorial Fund Quarterly/Health and Society 58(3):384–415.