

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




OCR for page 61
ASSESSING POLICIES FOR RETIREMENT INCOME

4 Data Needs

Within the next few years, policy debates about the retirement income security of current and future generations of Americans are likely to require a range of modeling capabilities with which to evaluate and project the likely effects of alternative policy proposals. However, as is clear from the preceding chapter, there are important gaps and uncertainties in what is known about the behaviors and processes that affect retirement income security. These gaps stem from deficiencies in available data, which hamper or preclude the development of robust analytical models and parameter estimates from them. In some cases, notably for employers, there are insufficient data with which to describe the distribution of relevant employer and employee characteristics, much less to support analysis of behavioral change over time.

These deficiencies need to be remedied and the knowledge base further developed before it will be possible to construct reasonably adequate projection models with broad capabilities. Moreover, existing retirement-income-related projection models and the associated databases have many limitations and do not generally provide an adequate platform on which to develop improved models once new data and research knowledge become available (see Chapter 5 and Appendix D; see also Hollenbeck, 1995). Thus, there is a great deal of work to do to prepare for the policy debate.

With very tight budget constraints, the question is one of priorities. We conclude that agencies should devote the bulk of their limited resources over the next few years to data collection and analysis rather than making significant investments in large-scale projection models. This conclusion is based on our assessment that some of the gaps in needed data and basic research are so critical that projection models, no matter how elaborate or elegant, cannot compensate for them. An example is the failure of existing research to adequately explain observed savings patterns in the population.

Moreover, past experience suggests that it takes more time to collect new data and analyze them than it does to build a projection model to use data and research in estimating the likely consequences of policy changes. There are more than a few instances in the history of policy analysis when models were built in a span of weeks or months. As an example, the prototype of the Carter administration's welfare reform projection model, KGB, was completed in a few weeks (see Citro and Hanushek, 1991: 107-114). It is very rare that needed new data can be obtained and analyzed sufficiently in so short a time, particularly if the data set is rich enough to be useful. A small-scale, quick-response survey of employers' health care costs was completed for use in the recent health care reform debate within 10 months from initial design to final output (Ponikowski, Scheible, and Wiatrowski, 1994), but its scope was very limited. More detailed information on employers' health care plans and costs that would have been useful, from a large survey for which the design work had begun in spring 1993, was still not available by the end of 1996 (Hing et al., 1995).

THE LESSON FROM HEALTH CARE REFORM

The experiences and reflections of policy analysts who provided estimates for the 1993-1994 health care reform debate underscore the panel's conclusion about giving priority to investments in data and research. Box 4-1 describes the major players in health care reform estimation and the models and databases they used.[1] More lead time and prior investment would have facilitated the development of usable projection models for estimating the likely effects of alternative health care reform plans.
Indeed, some timely investments that were made in model building were helpful (e.g., the extension of the TRIM2 model to simulate employer-provided health care benefits). Conversely, inexperience with building health care projection models, particularly with a database not previously used for this purpose, was a handicap. That was the case, for example, for the Agency for Health Care Policy and Research (AHCPR), which based its new AHSIM model on the 1987 National Medical Expenditure Survey (NMES).

However, the model builders themselves pointed to major difficulties that stemmed from the absence of critical data and research; see Box 4-2.[2] Existing data were so inadequate that it was difficult to develop an agreed-upon "baseline" scenario (that is, a representation of the current distribution of health insurance coverage, utilization of services, costs, and other characteristics of consumers, providers, and insurers), let alone simulate the likely effects of alternative reforms relative to the baseline. Bilheimer and Reischauer (1996:149), speaking from the Congressional Budget Office (CBO) experience, flatly concluded: "To construct a comprehensive picture of the health care system is impossible with today's databases. What is known must be pieced together from several inadequate or dated surveys and sources."

Also lacking was up-to-date research with which to estimate behavioral responses to changes in the health care system. Bilheimer and Reischauer (1996:152) noted that "such studies can credibly illuminate only the effects of marginal changes in the current environment. The effects of large, systemic changes that major health care reform proposals would generate are far outside the boundaries of knowledge that can be gleaned from existing economic research or even from social experiments." Nonetheless, they identified several areas in which better data about the current system would have made it possible to develop more credible estimates of the effects of reform proposals (see Box 4-2; see also Bandeian and Lewin, 1994).

In the absence of key data and research, rough estimates based on very inadequate information or simply guesses were used for values of behavioral parameters, and no projection model, however complex or elegant, could compensate for the lacking information. Different models incorporated widely different assumptions in key areas, and consequently, there were significant differences in estimates of the likely effects for the same reform plan (see Office of Technology Assessment, 1993, 1994). Differences in databases, for example between the March 1994 Current Population Survey (CPS) and the 1987 NMES aged to 1993-1994, also contributed to differences in estimates. Moreover, in the heat of debate, it proved difficult, if not impossible, to develop new sources of needed information on a timely basis.

[1] Information for this discussion and Box 4-1 comes from Bandeian and Lewin (1994), Bilheimer and Reischauer (1996), Citro and Hanushek (1991, esp. Chap. 5), Nichols (1996), Office of Technology Assessment (1993, 1994), Sheils (1996), and interviews with analysts.
[2] See footnote 1.
Subsequently, and anticipating future health care policy debates, AHCPR and the National Center for Health Statistics (NCHS) are working to implement a major reorganization and expansion of health-related surveys that could meet many of the information requirements identified by participants in the 1993-1994 effort (Hunter and Arnett, 1996).

The picture is much the same for retirement-income-related policy analysis, namely, that key descriptive and analytical data with which to develop credible projections of the likely effects of current and alternative policies are missing or incomplete. As with health care reform, even the best data and analysis are unlikely to resolve the uncertainty associated with major policy changes, such as privatization of Social Security (which would resemble a system of universally mandated Individual Retirement Accounts), because there is no historical experience on which to base any models.[3] For example, an important issue about privatization is whether it will increase or decrease personal saving. One can argue that privatization will educate people about saving and what it can do for them and thereby lead millions of people who now save little or nothing to save much more, in addition to their mandatory privatized accounts. But one can also plausibly argue that people will be more confident of actually obtaining payments from their dedicated personal accounts than they are of receiving Social Security benefits and thus will curtail other forms of saving (see Mitchell and Zeldes, 1996).

[3] However, research on the experience of other countries with privatization schemes may help develop projections for a U.S. system.

Nonetheless, as with health care reform, filling key gaps in data and research knowledge can go a long way to make it possible to develop credible projections of the likely effects of many retirement-income-related policy alternatives.

We urge that priority be given to strengthening the base of data and research for retirement income modeling through improvement of existing data sets whenever possible and through new data collection when necessary, and including appropriate levels of funding for analytical research and validation.
The remainder of this chapter addresses: the dimensions of databases that should be considered in designing and evaluating cost-effective retirement-income-related data collection systems, whether new or modified; issues in continuing existing panel surveys of middle-aged and older people in order to provide sufficient longitudinal observations for analysis of consumption, savings, and retirement behavior of individuals; issues in developing new and improved cross-sectional and panel data for employers and their workers in order to understand labor demand and employer decisions about pensions and other benefits; issues in linking administrative and survey data, which can be a cost-effective means of obtaining high-quality measures of key variables with minimal added expenditure; and issues of data validation, internally and in comparison with other sources.

DIMENSIONS OF DATABASES

Databases differ on a number of dimensions, including source, reporting unit and universe, type and frequency, scope, size, data collection method, accuracy or validity, uncertainty, ease of use, level of aggregation, and cost; see Box 4-3. There are trade-offs to consider among these dimensions. For example, the level of uncertainty in survey responses can be reduced by increasing the sample size; however, such a decision will increase costs. Similarly, an expansion of the number and detail of survey questions will make a survey more useful for a wider range of purposes, but increase its cost and the burden it places on respondents. Such an expansion may also make the data more difficult for analysts to use.

There are also trade-offs with regard to the use of administrative records instead of survey data. Administrative records are usually thought to be more accurate than survey responses. They are also relatively inexpensive to use for analysis because the costs of data collection and processing have already been incurred for administrative purposes. However, such records usually lack detailed content, and their content may change from year to year to reflect changes in program data requirements. Also, administrative records are not without errors (e.g., Social Security and Medicare records may have inaccurate information on whether individuals are still alive). Indeed, when comparing information for the same variable in an administrative records data set with a survey estimate, it is important to take account of likely errors in both sources and of differences in definitions and other features that could affect the comparison. Finally, administrative records are often inaccessible to researchers because of concerns about maintaining confidentiality for individuals and other reporting units.

Data sources are rarely satisfactory for both analytical and projection modeling purposes on every dimension. In fact, analytical and projection models often use different types of data. For example, analytical models of individual behavior generally require rich longitudinal data from panel surveys, but models that project individual outcomes rarely use panel surveys as their primary database because of small sample sizes and restricted universes.[4] Yet if the projection model database does not contain a similarly rich set of variables as were used to estimate key behavioral relationships in an analytical model, it will not be possible to take advantage of the most advanced behavioral models. Instead, the behavioral relationships will have to be reestimated with a reduced variable set, or such procedures as statistical matching (see Cohen, 1991b) will have to be used to impute needed variables to the primary database. (We discuss this issue further in Chapter 5.)

PANEL DATA ON INDIVIDUALS

An underlying theme throughout our report and the papers we commissioned (Hanushek and Maritato, 1996) is the need to understand how people reach their retirement years. What enters into decisions about working as people age? What are the implications of different employment paths for pension plan participation and the level of benefits received in retirement? How do government and employer policies affect personal savings behavior and the ultimate wealth accumulations that influence both retirement decisions and well-being in retirement?

Questions such as these emphasize two key issues that have implications for data collection and analysis. First, many of the antecedents of retirement outcomes are present long before any actual retirement decisions.
Second, behavior that is related to policy often has a long time horizon, with individuals looking many years into the future as they make decisions. To obtain suitable data for analysis, it is essential to follow individuals over many years in order to understand their retirement behavior and outcomes.

This central fact leads us to emphasize the development of panel surveys that obtain longitudinal data by interviewing the same individuals over time. Panel surveys, which have become increasingly common to study individual behavior, permit investigation of behavior that evolves and that has implications over long periods. Moreover, panel surveys provide a variety of ways for dealing with the heterogeneity across individuals that can complicate analyses based solely on a cross-section of individuals. Finally, panel surveys can often permit corrections for measurement and observational errors because consistency checks for individuals over time can aid in separating errors from true changes for individuals.

[4] An exception is a recently developed public assistance model, STEWARD (Simulation of Trends in Employment, Welfare, and Related Dynamics), which directly uses data from the National Longitudinal Survey of Youth (NLSY) to simulate the effects of welfare reform proposals on program participation (Jacobson and Czajka, 1994).

Of course, the need to follow the same individuals over long periods implies that a panel survey is likely to be expensive: certainly more expensive than a one-time cross-sectional survey of equivalent size and perhaps more expensive than a repeated cross-sectional survey.[5] Also, for cost reasons, it may be difficult in a panel survey to refresh the sample frequently enough to address such questions as whether patterns of behavior remain the same for newer cohorts or to maintain representation of a changing population (e.g., to represent immigrants).

The trade-offs often suggest the need for cross-sectional data collection. For example, we argue below for collecting data to understand employer behavior that is relevant for retirement income security, but we believe that the first step is to improve cross-sectional data. Although a panel may later be appropriate, the initial efforts, which include learning about what data to collect and how and what the sampling frame should be, would most appropriately be thought of as a cross-sectional effort. Also, there is a need for regularly updated descriptive information on trends in the characteristics of employers, work forces, and benefits that more efficiently comes from repeated cross-sections than from panels.[6]

Similarly, repeated, nationally representative cross-sectional surveys are needed to provide important data on trends in the population that are relevant to tracking and understanding retirement outcomes (e.g., trends in ages at retirement). Nonetheless, the central longitudinal data with which to analyze individual behavior and individual decisions should almost certainly be gathered through panels of individuals.
Although cross-sectional surveys can use retrospective questions to collect longitudinal information, such as employment and earnings histories (and in some cases this is done), the quality of retrospective information is much less, compared with panel surveys, because of recall and other errors, which may be large (see, e.g., Kennickell and Starr-McCluer, 1995). Also, cross-sectional surveys are limited in the amount of retrospective information that they can collect due to considerations of respondent burden. Panel surveys, in contrast, can obtain a wealth of information with which to understand different life courses and retirement outcomes.

[5] Whether a panel survey is more or less expensive than a repeated cross-sectional survey with the same number of sample members is affected by many factors, such as frequency of interviews, costs of obtaining an interview (a panel survey may have higher costs to locate sample members but lower costs to obtain an interview once the sample member is located), and others.
[6] Panel surveys will provide consistent time series for a population as well if a new panel is introduced on a frequent basis, such as every year; however, costs will be prohibitive unless the size or length (or both) of each panel is reduced, which will, in turn, reduce the usefulness of each panel for longitudinal analysis.

and employers at very low marginal cost. The major difficulty concerns how to provide access to such data for research and modeling purposes when their use raises concerns about maintaining the confidentiality of respondent information.[20]

Records on Individuals

Greater access to Social Security Administration earnings and benefits records could advance many important areas of retirement-income-related analysis and modeling. As a stand-alone database, SSA records have the potential to improve U.S. data on mortality at older ages and to study the relationship of socioeconomic status (as measured by earnings levels) to mortality.[21] Such studies could be carried out by SSA staff or by researchers who are sworn in as SSA employees to prevent disclosure of confidential data (as has been done for some Census Bureau studies). Given the importance of mortality projections for projecting retirement income security, we urge that priority be given to mortality research with SSA records.

More problematic from the perspective of confidentiality protection are proposals to link SSA records with survey responses. Some studies have been done, but they have been limited. Exact-match files of SSA records with the March 1973 and 1978 CPS, developed by the Census Bureau, were made publicly available (the 1973 file included an exact match with IRS records), as were exact-match files of SSA records with the Retirement History Survey. However, no exact-match files of SSA records with CPS data for years later than 1979 have been developed for public use. The Census Bureau has developed exact-match files of Social Security records with the 1984, 1990, and 1991 panels of the Survey of Income and Program Participation (SIPP), but these files are made available only to SSA analysts with strict restrictions on use.

The Census Bureau recently released a public-use, exact-match file of the March 1991 CPS with selected data from IRS administrative tax records. In this file, techniques of data switching and the addition of noise were used to mask the data so that no sensitive information that could identify specific individuals was released. More extensive matches of IRS data with CPS and SIPP files have been used to evaluate the quality of income reporting in the March CPS and SIPP and for research on improved weighting schemes to reduce the variance of SIPP estimates, but these files are only available internally to Census Bureau staff.

[20] See Duncan, Jabine, and deWolf (1993) for a review of confidentiality and access issues for federal statistical data and promising avenues for addressing the difficulties.
[21] If SSA tracked marital status of all beneficiaries, then SSA records could also support needed analysis of the relationship of marital status to mortality.

The Department of Labor sponsored a 1977-1978 Survey of Private Pension Benefit Amounts that linked private employer pension plan records on beneficiaries with SSA earnings and benefits records (Office of Pension and Welfare Benefit Programs, 1985). This survey used the Form 5500 database to sample private pension plans and obtain information from plan administrators on benefits paid to individual plan participants. The matched records of pension and Social Security benefits and earnings were used to analyze the contribution of employer pensions to retirement income security (e.g., to calculate earnings replacement rates). The response rate from plan administrators was low (about 50%), and large defined contribution plans were underrepresented. However, the matched data were viewed as more accurate than household survey estimates of pension retirement benefits, which are typically underreported. No public-use files were made available from the survey, and it would presumably be difficult to do so if it were to be repeated.

Legislative restrictions are one reason that publicly available exact-match files of SSA and survey data have not been developed in recent years. Another reason is that statistical agencies have become more concerned with questions of privacy and confidentiality of data and the potentially adverse effects on survey response rates if people believe that their replies are not held in strict confidence.

Nevertheless, there is a strong need for exact-match files. Calculations of expected Social Security benefits require either complete histories of covered earnings or summary variables, such as average indexed monthly covered earnings over a worker's span of employment, that in turn derive from earnings histories. Such histories are difficult to obtain retrospectively in surveys and would require decades of data collection to obtain prospectively.
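To see why a summary variable such as average indexed monthly earnings cannot be produced without the underlying earnings history, consider the following minimal sketch in Python. It is a simplified illustration rather than the Social Security Administration's actual procedure: the function name, the wage index values, and the default 35-year computation period are assumptions made for the example, and refinements such as the taxable maximum are ignored.

```python
from typing import Dict


def average_indexed_monthly_earnings(
    covered_earnings: Dict[int, float],
    wage_index: Dict[int, float],
    indexing_year: int,
    computation_years: int = 35,  # assumed length of the computation period
) -> int:
    """Simplified, hypothetical AIME-style summary of an earnings history."""
    indexed = []
    for year, earnings in covered_earnings.items():
        if year < indexing_year:
            # Earlier earnings are scaled by wage growth up to the indexing year.
            value = earnings * wage_index[indexing_year] / wage_index[year]
        else:
            # Earnings in or after the indexing year count at nominal value.
            value = earnings
        indexed.append(value)
    # Keep only the highest years; missing years act as zeros because the
    # divisor below is fixed by the computation period, not by the data.
    top = sorted(indexed, reverse=True)[:computation_years]
    return int(sum(top) // (computation_years * 12))


# Hypothetical wage index and earnings history, for illustration only.
example_index = {1990: 100.0, 1991: 110.0, 1992: 120.0}
example_history = {1990: 12000.0, 1991: 13200.0, 1992: 14400.0}
example_aime = average_indexed_monthly_earnings(
    example_history, example_index, indexing_year=1992, computation_years=2
)
```

Because every year of covered earnings can enter the calculation, a survey that records only current or recent earnings cannot reproduce the result, which is the gap that linked administrative histories are meant to fill.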
Earnings histories, including earnings above the payroll tax ceiling (available in SSA records beginning in 1979), are also helpful in calculating expected benefits from the types of employer pension plans that calculate benefits on the basis of several years of highest earnings with the employer or that specify employer contributions as a percentage of earnings. Finally, benefit histories are useful to evaluate and augment survey responses of Social Security income.

Plans are now being implemented to make available on a restricted basis exact-match files of HRS/AHEAD and SSA records that will provide very valuable information for analysis purposes. (Links will also be made with HCFA Medicare data and possibly with state Medicaid data.) A three-pronged strategy will be followed to protect confidentiality. First, linked data files with complete earnings and benefits histories will be made available on a limited access basis only to researchers who sign nondisclosure agreements that include penalties for violation. Second, public-use files will include only summary variables derived from the earnings histories. Third, estimated Social Security entitlements that have been computed under a variety of assumptions will be made available to HRS users under restricted conditions (Mitchell, Steinmeier, and Olson, 1996).

We support the preparation of exact-match files that link SSA and other administrative records with HRS/AHEAD and urge that arrangements be made to perform these linkages on a regular basis. We also encourage the Census Bureau and SSA to consider the development of SIPP-SSA exact-match files that can be made publicly available by following the strategy of HRS and AHEAD, namely, to provide summary variables derived from the earnings histories that facilitate the calculation of expected Social Security benefits. (Iams and Sandell [1996], SSA researchers who are using matched SIPP-SSA files for Social Security benefit modeling, make a similar recommendation.) There are plans to include SSA information on Social Security benefit type, and whether the respondent has died, in publicly available SIPP files. We support these efforts and also urge consideration of developing SIPP files for public release that include derived variables from SSA earnings records.

Records on Employers

Administrative records for employers, such as financial statements that are abstracted in Compustat and the Form 5500 data series, provide useful information for analytic purposes. These particular data sets, unlike SSA records, are derived from public documents, but problems can arise when they are merged with other data for which confidentiality protection is promised (e.g., BLS or Census Bureau surveys). Employers are sensitive about the release of data that could be useful to competitors, and it can be very difficult to mask such variables as employer size sufficiently to prevent disclosure and at the same time maintain the analytical value of the data. Indeed, microdata from employer surveys, let alone matched survey and administrative records data, are often not made publicly available at all.

Sometimes agencies are willing to retabulate confidential data at the request of researchers. For example, BLS has linked Form 5500 data with the EBS and run analyses for outside researchers. However, the researchers were not themselves given access to the microrecords, and they found this mode of data access very limiting (MacDonald, 1995).
One possible strategy to provide greater access to matched employer data is to adopt the approach proposed for exact-match files of SSA earnings histories with HRS/AHEAD. Under this approach, researchers could gain access to the complete data sets under very strict conditions of use. At the same time, public-use files could be developed in which key administrative records variables are summarized in a manner that is most relevant for research needs and other steps are taken (e.g., limited geographic identification) to prevent disclosure. If this approach is adopted for matched employer data, it would be important for agencies to consult with researchers to determine the appropriate summarized variables.

The Census Bureau is pursuing another very promising approach for research access to its employer data files, including the LRD, which have not been available for use except at the Bureau's headquarters. This approach may provide a model for other agencies. Several years ago, the Census Bureau, in collaboration with the National Bureau of Economic Research, a private organization, established a secure Research Data Center at its Boston regional office. Researchers may come to the center, be sworn in as special Census Bureau agents, and use the data sets on site. Census Bureau employees must review any output that researchers take with them to ensure that it does not identify specific respondents. Although more limiting than use of microdata at one's own institution, this arrangement is far preferable for researchers in the Boston area to traveling to the headquarters in the Washington, D.C., area. The success of the Boston data center has led the Census Bureau to set up a second center at Carnegie Mellon University in Pittsburgh, and the agency is exploring researchers' interest in having similar centers in other major cities around the country.

Recommendations

11. Matched files of panel survey responses and key administrative records should be regularly produced for retirement-income-related policy analysis and projection purposes. Examples include exact matches of survey records with Social Security earnings histories and benefit records, Medicare and Medicaid records, and the National Death Index. The added information in matched files is obtainable at low marginal cost and is essential for analysis of retirement and savings decisions and the effect of medical care use and expenditures on retirement security.

12. Agencies should collaborate on the development and oversight of matched data sets for individuals and employers, with input from researchers on content. They should also vigorously explore creative solutions for providing research access to exact-match files that safeguard the confidentiality of individual responses.
Possible solutions include: (1) developing public-use files that contain summary variables derived from the administrative records portion of the matched file; (2) requiring researchers to sign nondisclosure agreements with significant penalties for violations; and (3) providing researchers with access to matched files on site at secure data centers.

DATA VALIDATION

Validation of databases that are used in behavioral and projection models is as important as validation of the models themselves. Sampling errors in data inputs are one source of uncertainty of model estimates; more important, nonsampling errors can introduce both uncertainty and bias into model estimates. Data validation is essential to identify the types and magnitudes of such errors. It is also essential for survey methodological research, which should be part of every data collection program to determine procedures for improving data quality at the outset by improving questionnaire design and data collection procedures.

There are many sources of nonsampling errors in both surveys and administrative records. One source is unit nonresponse, that is, failure by a reporting unit to provide any information at all. Panel surveys are subject to cumulative unit nonresponse over time, or attrition, as people become tired of cooperating with the survey or move and cannot be traced. Other sources of error are nonresponse to specific items, overreporting (e.g., a false positive report of pension coverage), underreporting (e.g., reporting an amount less than actually received for an income source), and misclassification (e.g., reporting a defined benefit pension plan as a defined contribution plan or vice versa). Yet another source of error in surveys is undercoverage of the population, because the sampling frame does not include all people or employers in the universe or for other reasons. For example, household surveys of the general population almost always have low coverage rates of such groups as young minority men.22

Surveys and administrative records systems use several methods to try to compensate for nonsampling errors, such as adjustment of survey weights for population undercoverage and attrition, imputation for item nonresponse, and editing for misclassification or inconsistency in reporting. However, these procedures are not likely to maintain all of the underlying relationships and may themselves be a source of bias.

Validation Methods

Validation involves estimating overall error rates and the contribution of individual sources of error to them, including the contribution of weighting, imputation, and editing procedures. The problem is to determine appropriate benchmarks for comparison. There are several approaches to validation; see Box 4-10 for examples of their use.
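The coverage-rate comparison and the weight adjustment for undercoverage described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the group labels, population totals, and function names are hypothetical, and real weighting procedures involve many more adjustment cells and steps.

```python
from typing import Dict


def coverage_rates(survey_est: Dict[str, float], benchmark: Dict[str, float]) -> Dict[str, float]:
    """Coverage rate per group: weighted survey population estimate / benchmark population."""
    return {group: survey_est[group] / benchmark[group] for group in benchmark}


def adjust_weight(weight: float, group: str, rates: Dict[str, float]) -> float:
    """Scale a respondent's weight so that the group's weighted total matches the benchmark."""
    return weight / rates[group]


# Hypothetical weighted survey totals and benchmark population totals.
survey_totals = {"group_a": 800.0, "group_b": 950.0}
benchmark_totals = {"group_a": 1000.0, "group_b": 1000.0}

rates = coverage_rates(survey_totals, benchmark_totals)
new_weight = adjust_weight(100.0, "group_a", rates)
print(rates, new_weight)
```

The adjustment makes weighted totals agree with the benchmark, but it implicitly assumes that the people the survey missed resemble those it reached within each group, which is one way such procedures can themselves introduce bias.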
Reinterviews. Asking a sample of respondents the same question in a reinterview cannot establish which answer is correct, but it can indicate whether the responses are robust in the sense that there is a high level of consistency between the answers given originally and in reinterviews.

Use of Alternative Question Wording. Experimentation with different question wording, or other aspects of questionnaire design (such as the order in which questions are asked), may determine that the responses are sensitive to such variations. In the past 10 years, federal statistical agencies have made increasing use of techniques from cognitive psychology to study in greater depth the ways in which respondents react to and interpret specific question wording. The results of such methods, which include one-on-one sessions in which a researcher probes the respondent after each question to ask what he or she had in mind when answering, have often shown startling differences in perceptions between respondents and survey personnel (see Jabine et al., 1984).

22 Coverage rates are developed by comparing survey population estimates by age, race, and sex to census population estimates updated by births, deaths, and estimated net immigration; Medicare records are used for the elderly.

[Box 4-10, examples of the use of validation methods: text not recoverable from the machine-read source.]

Aggregate Comparisons of Two or More Surveys. Comparing aggregate estimates from one survey with aggregate estimates from another survey that is believed to be superior can provide an overall measure of data quality. For example, as discussed above, estimates of household wealth from such surveys as HRS or SIPP have been compared with estimates from the SCF. Another example is comparing estimates of retiree pension and health care benefits from the March CPS income supplement with estimates from the detailed supplements that have been conducted occasionally on these income sources (most recently in September 1994; see Pension and Welfare Benefits Administration, 1995b). However, aggregate comparisons do not generally shed light on the sources of error in survey estimates. Also, they need to be carefully made to ensure that definitions of the reporting universe and data items are comparable between the surveys being compared.

Aggregate Comparisons of Surveys with Administrative Records Data. Survey and administrative records comparisons are often viewed as a preferred method of measuring overall data quality, on the assumption that the administrative records estimates represent "truth." For example, validation studies of the quality of income data in such surveys as the March CPS and SIPP have used estimates from IRS tax records, food stamps and other program records, and the National Income and Product Accounts (NIPA) as benchmarks.

However, such comparisons often require extensive adjustments of the administrative sources, which cannot always be completely made, for consistency of coverage and definitions with the survey data. Thus, comparing NIPA and survey income estimates requires adjusting the NIPA estimates to exclude income of institutionalized people, Armed Forces members overseas, and others who are not covered in household surveys (including nonprofit institutions in some cases).
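The universe adjustment described above amounts to simple arithmetic, sketched below: subtract from the benchmark the income of groups outside the household survey universe, then express the survey aggregate as a share of the adjusted benchmark. All amounts and names are hypothetical and are not actual NIPA or survey figures.

```python
from typing import Dict


def adjusted_benchmark(total: float, out_of_scope: Dict[str, float]) -> float:
    """Remove benchmark income belonging to groups the survey does not cover."""
    return total - sum(out_of_scope.values())


def survey_to_benchmark_ratio(survey_total: float, benchmark_total: float) -> float:
    """Share of the comparable benchmark aggregate captured by the survey."""
    return survey_total / benchmark_total


# Hypothetical aggregates, in billions of dollars.
benchmark_income = 1000.0
excluded = {
    "institutionalized_people": 30.0,
    "armed_forces_overseas": 10.0,
    "nonprofit_institutions": 10.0,
}
survey_income = 855.0

comparable = adjusted_benchmark(benchmark_income, excluded)
ratio = survey_to_benchmark_ratio(survey_income, comparable)
print(comparable, round(ratio, 3))
```

Without the adjustment the ratio would be computed against the unadjusted total and would overstate the survey's shortfall. Any exclusion that cannot be estimated leaves a residual of unknown size in the comparison.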
In another example, comparing the percentage of private wage and salary workers who participate in employer pension plans between the Form 5500 data series and the periodic supplements to the CPS on pensions requires several adjustments (see Beller and Lawrence, 1990). The two series do not include exactly the same types of pensions; also, the Form 5500 series includes nonvested participants who left their jobs less than 1 year previously, and it double counts workers with more than one job in which they are covered.

Finally, administrative sources are not always error free. For example, there is evidence that earnings are underreported to assistance program caseworkers, which suggests that household surveys are not necessarily inaccurate when they find higher proportions of public assistance recipients with earnings than shown in case records. Also, Medicare records are not an entirely accurate representation of the older population, given the problem of phantom enrollees (records for people who have already died).

Microlevel Comparisons of Survey and Administrative Records. Exact-match files make it possible to carry out detailed validation studies that decompose overall error levels into specific sources of error, including overreporting, underreporting, misclassification, and erroneous imputation for nonresponse. Again, care needs to be taken to assure comparability of universes and data items: for example, not everyone is required to file a tax return.

Because of confidentiality restrictions, the opportunity for microlevel error analyses has generally been limited to federal statistical agency staff. One analysis by outside researchers is Herzog and Rubin (1983), who studied the quality of March CPS Social Security benefit imputations with the publicly available 1973 CPS-SSA-IRS exact-match file. David et al. (1986) carried out a similar study of earnings imputations with a 1981 CPS-IRS exact-match file that they used while working at the Census Bureau as special sworn agents.

Validation Needs

To improve the capability for accurate modeling and analysis of retirement-income-related policies and behaviors, validation studies of key data sources should be carried out on a regular basis. Such studies can provide important feedback to data collection agencies to improve data quality at the source. They are also needed to enable researchers and policy analysts to determine appropriate strategies to compensate for data problems in their models. For these purposes, it can be useful to develop data quality profiles that are regularly updated as new information becomes available. Quality profiles bring together the results of validation studies for a particular survey or administrative records system into a comprehensive document that describes sources of error and their magnitudes, where known, and that identifies areas for which more validation work is needed (see, e.g., Jabine, King, and Petroni, 1990, which is a quality profile for SIPP).

Several kinds of data validation studies could be useful for retirement-income-related databases.
Comparing CPS, SIPP, and HRS Reports of Pension Participation. SSA recently completed a comparison of the May 1993 CPS pension supplement with 1993 data from the 1992 SIPP panel, finding that participation (coverage) estimates in the two surveys are almost identical (Iams, 1995). A similar analysis should be performed for all three surveys for the HRS age cohort.

Comparing Household Survey Reports of Pension Participation with Estimates from Employer Administrative Records. Aggregate comparisons, such as the study by Beller and Lawrence (1990) of the CPS pension supplements and the Form 5500 data series, should be carried out on a regular basis. More work is needed to improve the validity of such comparisons to account, for example, for worker participation in more than one plan and in plans of more than one employer.

Microlevel comparisons of household survey reports of pension plan provisions with employer records are possible and should be carried out for sample members of HRS, although the quality of the analysis may be affected by the relatively low rate of employer response. About 25 percent of sample members' employers did not respond to the request for Summary Plan Descriptions, and another 10 percent provided inadequate information with which to code relevant pension plan features. This level of employer response is typical of the experience of other surveys that have requested the descriptions, such as the 1989 SCF and 1989 NLS-Mature Women (see Juster and Suzman, 1995:44-45).

Comparing Household Survey Data on Income and Assets Across Surveys and with Administrative Records. Comparisons should be regularly performed of household survey reports with other surveys (e.g., comparing wealth estimates from HRS/AHEAD or SIPP with the SCF) and with NIPA and other administrative records sources (e.g., income tax records). Such comparisons, particularly with administrative records, require considerable care.

With regard to pension income, a major issue is the treatment of the rapidly growing phenomenon of lump-sum pension distributions, which are treated differently in different surveys and records. Lump-sum distributions are included in the NIPA accounts and in income tax returns; according to the income concept of the March CPS, lump sums are not to be reported; SIPP has a separate category to report lump sums of all types; and HRS/AHEAD has questions on several types of lump sums, including pension distributions. Comparisons of March CPS, SIPP, IRS, and NIPA data suggest that some CPS and SIPP respondents may be reporting lump-sum pension amounts as regular income, but the extent to which this happens is not clear (Coder and Scoon-Rogers, 1994:21-24; see also Schieber, 1995).
Careful analysis of pension income reporting in the March CPS and SIPP in comparison with HRS/AHEAD for the HRS/AHEAD age range could be helpful, as could cognitive research with respondents to determine their knowledge of types of pension income and, in particular, whether they distinguish lump sums from pension distributions that are spread out over time. To make household surveys more useful for retirement-income-related analysis, it would clearly be desirable to obtain as complete reporting as possible of both regular and lump-sum pension amounts.

To the extent that these and other validation studies identify serious data quality problems, behavioral and projection models will need to be adjusted or their results qualified in an appropriate manner. For example, some microsimulation projection models have a provision to adjust March CPS income data for comparability with NIPA estimates. Such adjustment procedures must be carefully worked out, not only to be sure that the NIPA estimates are in fact comparable with CPS income concepts, but also to preserve key relationships among income amounts and other variables.
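One simple form such an adjustment could take is proportional scaling by income source, sketched below. This is an assumed technique chosen for illustration, not the procedure of any particular microsimulation model; the record layout, benchmark totals, and function names are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, float]  # holds "weight" plus one key per income source


def adjustment_factors(records: List[Record], benchmarks: Dict[str, float]) -> Dict[str, float]:
    """Ratio of the benchmark total to the weighted survey total for each source.

    Assumes every benchmarked source has a nonzero weighted survey total.
    """
    factors = {}
    for source, target in benchmarks.items():
        weighted_total = sum(r["weight"] * r[source] for r in records)
        factors[source] = target / weighted_total
    return factors


def apply_factors(records: List[Record], factors: Dict[str, float]) -> List[Record]:
    """Scale each benchmarked income amount; weights and other fields pass through unchanged."""
    return [{k: v * factors.get(k, 1.0) for k, v in r.items()} for r in records]


# Hypothetical survey records and benchmark aggregates.
records = [
    {"weight": 10.0, "wages": 100.0, "pensions": 0.0},
    {"weight": 10.0, "wages": 50.0, "pensions": 40.0},
]
benchmarks = {"wages": 1650.0, "pensions": 500.0}

factors = adjustment_factors(records, benchmarks)
adjusted = apply_factors(records, factors)
print(factors)
```

Scaling preserves the ranking of records within each source, but it changes each record's mix of income across sources and leaves respondents who reported nothing at zero. Both effects show why the preservation of key relationships has to be checked rather than assumed.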

Recommendation

13. Budgets for retirement-income-related surveys should include sufficient resources for regular evaluation of data quality. Evaluation methods include reinterviewing subsamples of respondents to measure consistency of reporting; experimentation with alternative question wording to identify possible reporting problems; and comparing survey estimates with administrative records to determine the completeness and accuracy of survey reporting, taking care to adjust for differences in definitions and other aspects of the two sources.
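The last evaluation method named in the recommendation, comparison of survey reports with administrative records, can be illustrated at the microlevel with a small classification routine. This is a sketch under stated assumptions: the category labels, the tolerance, and the matched pairs are hypothetical, and the administrative amount is treated as the benchmark even though, as the chapter notes, administrative sources are not always error free.

```python
from collections import Counter
from typing import Iterable, Tuple


def classify(survey_amount: float, record_amount: float, tolerance: float = 0.0) -> str:
    """Classify one matched record, treating the administrative amount as the benchmark."""
    if survey_amount > 0 and record_amount == 0:
        return "false_positive"  # income source reported in the survey but absent from records
    if survey_amount == 0 and record_amount > 0:
        return "false_negative"  # income source in the records but not reported
    if survey_amount > record_amount + tolerance:
        return "overreported_amount"
    if survey_amount < record_amount - tolerance:
        return "underreported_amount"
    return "consistent"


def error_profile(pairs: Iterable[Tuple[float, float]], tolerance: float = 0.0) -> Counter:
    """Count matched records falling into each error category."""
    return Counter(classify(s, r, tolerance) for s, r in pairs)


# Hypothetical (survey amount, administrative amount) pairs for one income source.
matched = [
    (100.0, 100.0),
    (80.0, 100.0),
    (0.0, 100.0),
    (50.0, 0.0),
    (120.0, 100.0),
    (0.0, 0.0),
]
profile = error_profile(matched, tolerance=5.0)
print(profile)
```

A tabulation of this kind separates the sources of error that an aggregate comparison lumps together, which is the main advantage of exact-match files for validation work.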