Microsimulation Modeling of Health Care, Retirement Income, and Tax Policies
Our discussion to this point has focused largely on the problems of modeling income support programs such as AFDC and food stamps. Microsimulation techniques are also appropriate and have been used extensively in other social welfare policy areas. Many of the issues that we raise with regard to data inputs, model design, and computing technology are common across policy areas, although each area presents some unique features and problems. In this chapter we briefly discuss special issues in microsimulation modeling of health care, retirement income, and tax policies. Because we did not review the models and data in these areas in as great depth as those in the income support area, we make few specific recommendations; instead, we raise issues that we believe are particularly important to address.
One general question that arises is the relative weight to give to investments in microsimulation models for different policy areas. Because we do not pretend to have any particular expertise in foreseeing the future mix of policy issues, we cannot offer unequivocal advice on this question. We note, however, that health care policy is an area of growing importance because of the escalating costs of providing health services and the evidence of glaring gaps in the health care system, such as the large population not covered by private or public health insurance. Moreover, as we indicate below, available data, research, and models for health care policy analysis exhibit many inadequacies relative to the information needs.
Yet we believe it would be unwise to concentrate investment resources on any one set of issues. Welfare policy provides a cautionary example in this regard. After the collapse of the Carter administration's push for the Better Jobs
and Income Program in 1977-1978 and the subsequent focus of the Reagan administration on restricting welfare benefits, one might well have concluded that capabilities for modeling significant welfare reform initiatives would be of little importance. That conclusion would have been wrong, as evidenced in our review of the policy debate that led to the Family Support Act (FSA) of 1988 (see Chapter 2). Indeed, failure to invest in improvements to models and data handicapped the ability of policy analysts to use microsimulation techniques to develop estimates for many of the proposed welfare policy changes considered in the FSA debate. We do not expect that the FSA represents the last word in welfare policy, either. Hence, we see a continued need to scrutinize income support models and data and to determine ways in which they can be improved.
Similarly, we see the need to scrutinize models and data for retirement income and tax policies, as well as for health care policies. Indeed, the problem is not so much to pick policy areas for investment—all of them are and will continue to be important—but to discern particular aspects of each broad area that are likely to assume priority in the policy debate. For example, the need in the FSA debate was for models that could link the AFDC program with new initiatives such as child support enforcement, job training, and transitional assistance programs, tasks for which the existing models and data were not well suited.
Although there are many sources of information that can help agencies anticipate future policy proposals, there is no crystal ball that will furnish them with infallible forecasts for guiding their investments in policy analysis tools. The difficulties of predicting the policy agenda underscore the importance of investments that are aimed at improving the overall capabilities of microsimulation models (and other policy analysis tools) for flexible, timely, and cost-effective responses to changing policy concerns. To achieve this goal, whether for income support, health care, or any other policy area, databases need to be broad in scope, models need to follow good design principles and practices, and agencies need to find ways to further fruitful interactions between policy research and modeling.
HEALTH CARE POLICIES
Some of the reasons that health care policy issues are of continuing and increasing concern to decision makers are evident from the following selected indicators:
Total public and private spending for health care in the United States, which currently amounts to more than $600 billion, increased from 7.3 percent of the gross national product in 1970 to 11.6 percent in 1989. Over this same period, the proportionate share of national health care costs assumed by the public sector increased from 33 to 40 percent; the costs of the federal Medicare program rose by 300 percent (in real terms) to $102 billion, and the costs of
federal-state public assistance programs for health care (chiefly, Medicaid) rose by 215 percent (in real terms) to $67 billion.1
While the consumer price index (CPI) increased overall by 200 percent from 1970 to 1988, the medical care component of the CPI increased by 300 percent over the same period (Bureau of the Census, 1990b:Table 762).
In the fourth quarter of 1988, an estimated 31.5 million people, or 13 percent of the total population, were not covered by health insurance of any kind, either private insurance, Medicare, or Medicaid (Nelson and Short, 1990:3).
Over the period 1984-1989, total diagnosed AIDS cases increased from 4,000 to 78,000; federal spending for AIDS rose from $0.06 to $1.3 billion; and state spending for AIDS rose from $0.01 to $0.2 billion (Bureau of the Census, 1990b:Table 188).
A recent microsimulation study projected that, over the next 30 years (1986-1990 to 2016-2020), the elderly (people age 65 and older) will increase from 31 to 50 million; those elderly receiving long-term nursing home or home care services will increase from 6.3 to 10.4 percent of the total; total public and private expenditures for nursing home services will increase by 197 percent, to $98 billion; and total expenditures for home health care services will increase by 154 percent, to $22 billion (Rivlin and Wiener, 1988:10-11) (all dollar amounts in constant 1987 dollars).
These indicators and others underscore the policy interest in the health care area.2 One pressing set of issues revolves around how to manage and contain what appear to be runaway costs for medical services. These costs are driven by many factors—ranging from the development of expensive new medical technology and treatments, to demographic and socioeconomic changes in the population, to the demand incentives for medical care that result from public and private health insurance programs. Another equally pressing set of issues revolves around how to ensure that people who need health care services will have access to them at a reasonable cost.
Before proceeding, we should make clear that our discussion of health-related policy modeling is limited to cost and coverage issues pertaining to the provision of health care services. Another topic that we did not take up but that deserves serious consideration is the use of policy analysis tools for modeling the relationships of health care interventions and other determinants of health status to health outcomes in the population and for estimating the cost-effectiveness
of alternative treatments and the feedback effects on overall costs of the health care system. It is clear that expenditures on health care may not always translate into improvements in health. (Thus, although the United States spends a higher proportion of its GNP on health care than other industrialized countries, it ranks lower than many countries in such health indicators as infant mortality and life expectancy; see Bureau of the Census, 1990b:Tables 1440, 1444.) Hence, it is important in assessing alternative health care policies to consider not only the costs in terms of payments for services, but also the costs and benefits in terms of the effects on health outcomes. This topic presents challenging modeling issues that we have not investigated, but it seems likely that microsimulation techniques, with their ability to model complex relationships and individual circumstances, could make useful contributions.
We note in this regard that demographers have recently developed complex stochastic models of disease processes and disability states, using hazard techniques with longitudinal data such as the Framingham Heart Study and the 1982-1984 National Long-Term Care Survey (see, e.g., Manton and Stallard, 1990; Manton, Woodbury, and Stallard, 1990, 1991). (The National Institute on Aging has supported much of this work.) They have used the resulting equations to analyze a number of policy-relevant issues. For example, they have analyzed the implications for the age structure of the population and total versus active life expectancy (i.e., years free of disability) of alternative assumptions about the elimination of particular diseases or risk factors (e.g., smoking) in the population. They have also examined the savings in nursing home and home health care costs that might be achieved by eliminating such diseases as Alzheimer's. To date, researchers working in this field have largely applied their estimated parameters to cell-based models, such as life tables, to analyze alternative scenarios. It may well be that putting this type of risk-factor analysis in a microsimulation framework and, further, effecting a linkage with microsimulation models of health care financing and coverage issues could have potential payoffs for analysis of health policy issues.3
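The cell-based scenario analyses described above can be illustrated with a small multistate life-table sketch. The Python fragment below computes total and active life expectancy for a synthetic cohort, then repeats the calculation with disability incidence cut in half to mimic the elimination of a risk factor. All transition rates are illustrative placeholders, not estimates from the studies cited.

```python
# Toy increment-decrement life table: states are active, disabled, dead.
# Rates are hypothetical Gompertz-style curves, not estimated values.

def life_expectancies(mort_active, mort_disabled, incidence,
                      start_age=65, end_age=110):
    """Return (total, active) life expectancy at start_age for one cohort."""
    active, disabled = 1.0, 0.0
    total_years = active_years = 0.0
    for age in range(start_age, end_age):
        # count survivors at the start of each year as full person-years
        # (a real life table would apply midyear adjustments)
        total_years += active + disabled
        active_years += active
        newly_disabled = active * (1 - mort_active(age)) * incidence(age)
        active = active * (1 - mort_active(age)) * (1 - incidence(age))
        disabled = disabled * (1 - mort_disabled(age)) + newly_disabled
    return total_years, active_years

mort_a = lambda age: min(1.0, 0.005 * 1.09 ** (age - 65))   # active-state mortality
mort_d = lambda age: min(1.0, 0.0125 * 1.09 ** (age - 65))  # higher mortality when disabled
inc = lambda age: min(1.0, 0.01 * 1.07 ** (age - 65))       # disability incidence

base_total, base_active = life_expectancies(mort_a, mort_d, inc)

# Scenario: a risk-factor intervention halves disability incidence
scen_total, scen_active = life_expectancies(
    mort_a, mort_d, lambda age: 0.5 * inc(age))

print(f"baseline: total LE {base_total:.1f}, active LE {base_active:.1f}")
print(f"scenario: total LE {scen_total:.1f}, active LE {scen_active:.1f}")
```

Researchers such as Manton and colleagues estimate the underlying transition rates with hazard models fit to longitudinal data; embedding such equations in a microsimulation would replace the aggregate cohort above with individual records.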
Microsimulation Modeling for Health Care Policy
Microsimulation has played a role in analysis of health care cost and coverage issues since the technique was first introduced to the political process. The RIM model was used in the late 1960s to estimate the costs and distributional effects
of alternative national health insurance programs (Orcutt et al., 1980:85). A Medicaid module was developed for the MATH model in the late 1970s (Pappas, 1980). A microlevel database of households from the 1976 Survey of Income and Education and the 1976 Survey of Institutionalized Persons formed the population component of the ASPE Health Financing Model that was used to estimate alternative national health insurance programs during the Carter administration (Office of the Assistant Secretary for Planning and Evaluation, 1981).
However, throughout the 1970s and early 1980s, microsimulation techniques were applied less frequently to health care policy analysis than they were to other social welfare policy areas. None of the microsimulation models developed for health care policy gained the kind of widespread use enjoyed by models such as MATH and TRIM2 for analysis of tax and transfer programs or models such as DYNASIM2 and PRISM for analysis of public and private pension systems.
Recently, the use of microsimulation techniques for health care policy analysis has shown signs of new life. A module in TRIM2 to simulate the costs and distributional effects of expanding Medicaid coverage was recently redesigned and updated (see Holahan and Zedlewski, 1989); in addition, work is under way, sponsored by the Department of Labor, to add capabilities to TRIM2 to model employer-provided health insurance benefits. A major expansion of the PRISM model was effected to simulate alternative financing programs for long-term care of the elderly (Kennell and Sheils, 1986; Kennell et al., 1988; Rivlin and Wiener, 1988). DYNASIM2 has also been used to look at long-term care issues. The developers of PRISM recently built the Health Benefits Simulation Model, a comprehensive model for the household sector designed to simulate health insurance coverage, health services use, total health care spending, and sources of payment among the noninstitutionalized population (see Chollet, 1990). CBO has developed a microsimulation model for simulating changes in Medicare benefits, based largely on Medicare administrative records, that was used to estimate the costs and distributional effects of alternative ways to insure against catastrophic health care costs under the Medicare program (Congressional Budget Office, 1988).4 CBO has also developed models of Medicare and Medicaid eligibility. The Health Care Financing Administration (HCFA) has sponsored work by the Actuarial Research Corporation to apply static aging techniques to update the 1977 National Medical Care Expenditure Survey (NMCES) and to use the resulting database to simulate policy issues,
such as the effects of extending Medicaid coverage. (Wilensky, 1982, originally proposed basing a comprehensive microsimulation model for health care issues on the 1977 NMCES.)
All of these health care policy microsimulation modeling efforts have dealt with the household sector and primarily with issues of expanding health insurance coverage and the associated costs to the federal and state governments for reimbursing medical care charges. Microsimulation-based models have also been developed to examine issues that affect the supply side of the health care market. For example, CBO developed a model, based on Medicare administrative records of payments to physicians, augmented with county-level data, that has been used to estimate the effects of changing fee schedules for different types of medical specialties and geographic areas (Congressional Budget Office, 1990). (Legislation to alter fee schedules, which is to take effect in 1992, was recently passed.) CBO has also developed a model, based on data for each of the hospitals in the United States, that is designed to estimate the cost and distributional effects of changing various provisions of the Medicare prospective payment scheme used to reimburse hospital costs. For example, the CBO model could determine effects by geographic area, type of ownership of the hospital (private, nonprofit, etc.), and hospital size (number of beds). The Health Care Financing Administration has a similar prospective payment scheme model.
In looking back over the past 20 years, however, it is clear that microsimulation models have played a distinctly subordinate role in health care policy analysis. Moreover, many of the microsimulation models that exist were developed on an ad hoc basis for special purposes and are neither well documented nor used outside the agency that developed them.5
Cell-based models, often with links to macroeconomic models, have played a much more prominent role in health care policy analysis. The Health Resources Administration supported development of a cell-based health care sector simulation model in the 1970s (Yett et al., 1980). The model, which never went beyond the prototype stage, represented an ambitious effort to relate demand and supply relationships in the health care market. It included submodels for projecting the population of consumers, classified by demographic and income categories; the supply of physicians, classified by age, specialty, and a few other characteristics; quantities and prices of physician services; quantities and prices of hospital services; and the supply of nonphysician health personnel.
The ASPE Health Financing Model was essentially a large cell-based
model. The microdatabase, developed from merging the Survey of Income and Education and the Survey of Institutionalized Persons, was projected forward in time by using static aging procedures and then aggregated into several thousand population cells, classified by family income; employment of adult members of the family; primary insurance coverage; family size; and age, sex, and disability status of the person. Data on health care utilization from the National Health Interview Survey were matched to the population data on a cell-by-cell basis. Finally, data on health care expenditures by type of medical service, source of payment, and population group were assembled from a variety of administrative sources and, in turn, matched with the population cells to generate the medical expenditure profile in the simulation year under current law. Simulations of alternative profiles altered the cell values to reflect a number of direct and indirect effects of policy changes, such as changes in the level of utilization of services in response to changes in cost-sharing and other patient payment provisions. Often, the population microdata records were retabulated to support simulation of alternative policies that affected different subgroups, but the core of the model operated on a cell basis.
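The cell-based mechanics just described can be sketched in a few lines of Python. The cells, per-capita expenditure profile, and demand elasticity below are all hypothetical; the point is only the structure: aggregate microdata into cells, attach expenditure profiles, and simulate a policy by adjusting cell values.

```python
# Sketch of cell-based simulation mechanics: aggregate microdata into
# cells, attach per-capita expenditure profiles, then simulate a policy
# by adjusting cell values. All numbers below are hypothetical.
from collections import defaultdict

# Toy microdata records: (income class, insurance status, survey weight)
people = [
    ("low", "medicaid", 1200.0), ("low", "none", 800.0),
    ("middle", "private", 5000.0), ("high", "private", 3000.0),
]

# Step 1: aggregate weighted person records into population cells
cells = defaultdict(float)
for income, insurance, weight in people:
    cells[(income, insurance)] += weight

# Step 2: attach per-capita expenditures to each cell (in the ASPE
# model these profiles came from survey and administrative sources)
per_capita = {("low", "medicaid"): 900.0, ("low", "none"): 400.0,
              ("middle", "private"): 700.0, ("high", "private"): 850.0}
baseline = {c: n * per_capita[c] for c, n in cells.items()}

# Step 3: simulate higher coinsurance for the privately insured by
# applying an assumed demand elasticity to cell expenditures
price_increase = 0.20   # 20% rise in out-of-pocket price (hypothetical)
elasticity = -0.2       # assumed demand elasticity (hypothetical)
adjust = 1 + elasticity * price_increase
simulated = {c: spend * (adjust if c[1] == "private" else 1.0)
             for c, spend in baseline.items()}

print(f"baseline total:  ${sum(baseline.values()):,.0f}")
print(f"simulated total: ${sum(simulated.values()):,.0f}")
```

As the text notes, a model of this kind can retabulate the underlying microdata to form different cells for different policies, but the simulation itself operates on the cell aggregates.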
The Macroeconomic-Demographic Model of Health Care Expenditures, developed for the National Institute on Aging by Lewin/ICF, Inc., is a large cell-based model for projecting health care costs over the long term (Anderson, 1990; Cartwright, 1989).6 The model includes a macroeconomic growth model with two goods—investment and consumer goods—and two factor inputs—labor and capital services. Interacting with the macro model are large cell-based models of population growth, the labor market, pension benefits, family formation, consumer expenditures, housing demand, health care expenditures, and health care benefits. As an example of the model's size, health care expenditures are estimated for 3,136 family groups, classified by family size, sex of head, age of head, race of head, geographic region, urban or rural residence, and whether or not covered by private health insurance. The equations used to estimate expenditure shares, labor supply, and other parameters in the model were developed by using microdata, but the model itself operates on a cell basis.
Issues in Modeling Health Care Policy Alternatives
The limited application of microsimulation modeling in the health care policy area and, indeed, the failure of any particular model(s), regardless of type, to gain widespread use for health care policy analysis result from the complexity
of the issues involved and the magnitude of the modeling task. Moreover, the complexities and broad scope of health care policy issues have led to fragmented efforts to develop needed databases and research knowledge, which, in turn, have handicapped model development efforts.
Why is health care policy so daunting? For one thing, many actors are involved in health care, including:
patients and prospective patients, in both the household and the institutionalized population;
informal caregivers, such as relatives;
doctors, hospitals, nursing homes, and other service providers;
state and federal insurers and regulators; and
private profit and nonprofit insurers of several types, ranging from traditional insurers that reimburse on a fee-for-service basis to health maintenance organizations and other prepaid medical service plans.
The interconnections of the various actors exhibit a bewildering variety that makes it difficult to assemble relevant information or even determine an appropriate unit of analysis. For example, an elderly patient, during the course of one illness, may be treated by several different specialists in one or more hospitals or other service centers and may obtain reimbursement from both Medicare and a private Medigap policy and also pay some costs directly. For some health care policy questions, it may be important to have sufficient information to analyze the spell of illness (or another broad measure) instead of working with a narrower unit of analysis such as doctor visits or hospital days. Yet, in our example, none of the service providers or insurers is likely to have complete information about the patient or about the procedures or costs involved in treating the illness (e.g., Medicare reimburses doctors and hospitals separately).
Mirroring the complexity of the health care sector, there are many federal agencies involved in health care data collection, research, and policy analysis, often with overlapping mandates and interests that complicate coordination and progress toward a common set of goals. Agencies of the Department of Health and Human Services (HHS) with important roles in this regard include ASPE, the Health Care Financing Administration, the Agency for Health Care Policy and Research, the National Institute on Aging, and the National Center for Health Statistics, among others.
It is certainly possible to model a particular class of actors and consider policy questions that directly affect that class—for example, simulating the extension of Medicaid coverage to a broader population or the effects of higher coinsurance rates on Medicare beneficiaries. However, unlike the case with a program
such as AFDC, it is clearly perilous to ignore first-round behavioral responses to health care policy changes. For example, research has shown that people alter their demand for medical services in response to changes in coverage, coinsurance rates, and other provisions of health insurance plans that affect the price of services.7 Moreover, demand responses can involve large numbers of people and thus have large effects on program costs. In contrast, although features of income support programs can certainly affect aspects of behavior such as labor supply, research has demonstrated relatively small effects for eligible groups, and these groups represent relatively small proportions of the total population.8
It appears likely to be perilous as well to ignore second-round effects in modeling health care policy changes. For example, changes in insurance coverage may ultimately lead to changes in hospital pricing policies, such as increasing prices for services that tend to be used more heavily by insured patients (see Grannemann, 1989:8-9). As another example, changes in public health insurance benefits are likely to affect the market for private health insurance and thereby influence total and public-sector health costs in unforeseen ways.9 As still another example, it is widely acknowledged that physicians have a large influence over levels and costs of medical services because of such factors as the limited information available to patients on prices and services. Hence, it is important to consider physician behavior in evaluating alternative health care financing and payment policies.
The research knowledge that could support modeling the complex health care sector has many inadequacies. There are many gaps and deficiencies in understanding the behavior of consumers and providers of health care, including a lack of evidence on which to base estimates of physician response to changes in payment schedules. There are some studies showing that the decisions of physicians to participate in Medicaid are responsive to the level of Medicaid reimbursement rates. Other studies suggest that state restrictions on Medicaid payments may reduce the number of visits to physicians in private offices but increase visits to other, more expensive, ambulatory care providers, such
as hospital emergency rooms. However, the evidence for these conclusions is very limited (see Grannemann, 1989:15). In general, there is insufficient evidence to choose among models of physician behavior, for example, that physicians tend to act according to the "target-income hypothesis," whereby they alter prices or quantity of services in order to achieve a target level of income; or that physicians in metropolitan areas are in a position to raise prices more freely compared with physicians in rural areas, because of differences in consumer access to information (Grannemann, 1989:13). Yet in simulating the effects of the most recent changes in Medicare physician payment schedules, it would have been very desirable to have good estimates of the extent to which physicians would shift their activities in favor of more generously reimbursed procedures; of how their responses, in turn, would affect usage and costs under Medicare; and of what the impact would be, in the long run, on physician specialty choices.10
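In CBO's analysis of the new Medicare fee schedule, for example, physicians were assumed to offset 50 percent of a potential income loss through increased service volume and to give back 35 percent of a potential gain through reduced volume. That first-round assumption reduces to simple arithmetic; the dollar figures in this sketch are hypothetical.

```python
# Target-income offset assumption from CBO's fee-schedule analysis:
# 50% of a potential loss is recouped via higher volume; 35% of a
# potential gain is given back via lower volume. Amounts are illustrative.

def net_receipts_change(potential_change):
    """Net change in a physician's receipts after the assumed volume response."""
    if potential_change < 0:
        return 0.5 * potential_change    # half of the loss is offset
    return 0.65 * potential_change       # 35% of the gain is offset

# A surgeon facing a $40,000 potential loss nets a $20,000 loss;
# an internist with a $20,000 potential gain nets about $13,000.
print(net_receipts_change(-40_000))
print(net_receipts_change(20_000))
```

The long-run responses the text describes—shifts in procedure mix and specialty choice—lie outside this first-round calculation, which is precisely the limitation the panel notes.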
Similarly, there are competing models of hospital behavior—including models based on profit maximization, utility maximization, or physician control (see Grannemann, 1989:7-11)—but the available evidence has not established the superiority of any particular model of hospital behavior. For yet another area, very little is known about the potentially great impact that current research on the effectiveness of alternative medical treatments could have on delivery of health care services. Such research may well lead public and private insurers to deny payment for services that are deemed ineffective or not worth their cost. Such actions would presumably lead to responses on the part of providers and patients that, in turn, would affect usage and costs of medical services (Grannemann, 1989:36).
There is no lack of information sources pertinent to health care, including many surveys and administrative records.11 However, there is no database, such as the March Income Supplement to the Current Population Survey (CPS) or the Survey of Income and Program Participation (SIPP), that provides, on a regular basis, the majority of the variables needed for evaluating alternative health care policies (although the CPS and SIPP, particularly the latter, do provide relevant information). This is true even in models confined to particular sectors and types of issues, such as insurance coverage of the population. Hence, although microsimulation models typically face the task of generating a suitable database by combining variables from multiple sources through techniques such as imputation or statistical matching, models for health care (including cell-based models) face an unusually daunting database construction task.12

The new payment system, instead of reimbursing doctors' "customary, prevailing, and reasonable" charges, assigns specific amounts to specific procedures based on assumptions about their relative value. The result of the system chosen is expected to raise receipts for internists and general practitioners and lower those for surgeons (Congressional Budget Office, 1990). The CBO analysts who produced estimates for the reimbursement changes did not attempt to include long-range responses in their microsimulation model. They did include a first-round behavioral response, with the assumption that physicians would strive to meet a target income. Specifically, they assumed that 50 percent of a physician's potential income loss due to the new Medicare fee schedule would be offset by an increase in volume of services, and 35 percent of a physician's potential gains would be offset by a decline in volume (Congressional Budget Office, 1990:Appendixes; see also Chollet, 1990).

See Gilford (1988: Appendix C) for descriptions of some of the major health data sets; see Panel on Statistics for an Aging Population (1986) for an inventory of data sets related to the health of the elderly.
There are several examples of the database problem. The current TRIM2 Medicaid module is designed to answer fairly narrow questions about the subgroups of the noninstitutionalized population that will benefit from extending coverage of the Medicaid program to more low-income Americans and the government budgetary implications of such an extension.13 Yet it requires the use of several data sources. The principal sources include the March CPS (supplemented with other variables, such as deductible expense imputations based on Consumer Expenditure Survey data) and the Medicaid "tape-to-tape" data from state administrative records.14 The March CPS serves as the primary database for simulating Medicaid eligibility on the basis of AFDC and SSI, while the Medicaid data provide a basis for estimating enrollment (participation) rates and medical care utilization and expenses.
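The two-stage structure of such a module—simulating eligibility from survey records, then applying enrollment rates estimated from administrative data—can be sketched as follows. The income limit, participation rate, and household records are all hypothetical.

```python
# Two-stage sketch of a Medicaid-style module: (1) simulate eligibility
# from household survey records, (2) apply a participation (enrollment)
# rate estimated from administrative data. All parameters are hypothetical.
import random

random.seed(7)  # fixed seed so the stochastic enrollment draw is repeatable

def eligible(record, income_limit=9000.0):
    """Categorical eligibility via AFDC/SSI receipt, else an income test."""
    return record["afdc"] or record["ssi"] or record["income"] < income_limit

def simulate_enrollment(records, participation_rate=0.8):
    """Flag simulated enrollees among the eligible population."""
    enrollees = []
    for r in records:
        if eligible(r) and random.random() < participation_rate:
            enrollees.append(r)
    return enrollees

households = [
    {"income": 6000.0, "afdc": True,  "ssi": False},
    {"income": 8000.0, "afdc": False, "ssi": False},
    {"income": 25000.0, "afdc": False, "ssi": False},
    {"income": 4000.0, "afdc": False, "ssi": True},
]
enrolled = simulate_enrollment(households)
print(f"{len(enrolled)} of {len(households)} records simulated as enrolled")
```

In the actual TRIM2 module, the eligibility step draws on the March CPS (with imputed variables such as deductible expenses), while the participation rates and utilization and expense profiles come from the Medicaid tape-to-tape data.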
The ICF/Brookings Long-Term Care Financing Model (LTCFM), which is built on top of the PRISM model, is designed to answer questions about costs and coverage of alternative ways of financing long-term nursing home and home health care for the elderly. The LTCFM submodel uses the basic PRISM model, with some modifications, to project the numbers and characteristics of the elderly with regard to family structure, employment, income, assets, and
private health insurance coverage. In addition, the submodel then simulates disability status of the elderly, their use of and expenditures for nursing home and home care services, and their accumulation and spending down of assets to gain Medicaid eligibility. Major databases used for the LTCFM submodel include the 1983 Survey of Consumer Finances, the 1977 National Nursing Home Survey, the 1982 National Long-Term Care Survey, and the March 1982 CPS. (The submodel was recently revised to include data from the 1984 SIPP panel, the 1982-1984 National Long-Term Care Survey, and the 1985 National Nursing Home Survey.) Many other data sources and studies were consulted in determining such critical assumptions of the model as the relationship of mortality and morbidity (the model assumes that people will enjoy longer but not necessarily healthier lives) and the projected rate of increase in nursing home prices.
Finally, the Health Benefits Simulation Model (HBSM) is designed as a comprehensive model of health care policies as they affect the noninstitutionalized population. The primary database for HBSM is the 1980 National Medical Care Utilization and Expenditure Survey (NMCUES), aged to approximate current levels of income, other socioeconomic characteristics, employer health insurance coverage, health services use, and average length of hospital stays. The aging process uses data from the March CPS, the National Health Interview Survey, and a Lewin/ICF survey of employer health insurance plans. Control totals from the National Health Accounts are used to calibrate the total health expenditures by service type and payment source estimated from the NMCUES.15 Data about characteristics of employer-provided insurance plans from the Lewin/ICF survey are appended to the NMCUES records through a statistical matching process. Other data sources that are used to impute needed variables include the 1986 Employee Benefit Survey of the Bureau of Labor Statistics, the 1977 National Medical Care Expenditure Survey (NMCES), and the March CPS.
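The statistical matching step mentioned above can be illustrated with a minimal nearest-neighbor match: each household record receives the plan characteristics of the donor record closest to it on shared covariates. The variables, records, and plan payload here are hypothetical; a production match would standardize the covariates and constrain donor reuse.

```python
# Minimal nearest-neighbor statistical match: append a donor file's
# payload variable to recipient records using shared covariates.
# All records and variable names are hypothetical.

def statistical_match(recipients, donors, keys):
    """Attach each nearest donor's payload to the recipient records."""
    matched = []
    for r in recipients:
        # squared Euclidean distance on shared covariates
        # (a real match would standardize covariates so no one
        #  variable, such as earnings, dominates the distance)
        best = min(donors, key=lambda d: sum((r[k] - d[k]) ** 2 for k in keys))
        merged = dict(r)
        merged["plan"] = best["plan"]  # variable carried over from the donor
        matched.append(merged)
    return matched

# Household survey records (recipient file)
households = [{"age": 34, "earnings": 28_000}, {"age": 58, "earnings": 61_000}]
# Employer-plan survey records (donor file) carrying a plan payload
plans = [{"age": 30, "earnings": 25_000, "plan": "basic"},
         {"age": 60, "earnings": 65_000, "plan": "comprehensive"}]

for rec in statistical_match(households, plans, keys=("age", "earnings")):
    print(rec["age"], rec["plan"])
```

The match rests on the usual conditional-independence assumption: variables carried over from the donor file are assumed to relate to the recipient records only through the shared covariates.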
As this brief review highlights, there is no single, regularly repeated survey or administrative records system that provides data on most of the needed variables for health care microsimulation models. The series of surveys of medical care utilization and expenditures—the 1977 NMCES, the 1980 NMCUES, and the 1987 National Medical Expenditure Survey (NMES)—come closest to filling the need. The 1977 NMCES obtained data over an 18-month period from about 14,000 households that were queried about income, employment, family structure, sources of health insurance, disability status, health services use and expenditures, and amounts for health care paid out-of-pocket and by third parties. Additional data were obtained for these households from surveys of their physicians and other health care providers, their employers, and their
health insurance carriers. The 1980 NMCUES was a similar but more limited survey, conducted over 15 months for a sample of 10,000 households. The sample comprised a nationally representative component of 6,000 households supplemented with samples drawn from Medicaid records in four states; see Subcommittee on Federal Longitudinal Surveys (1986). The 1987 NMES was similar to the original NMCES, with a sample of 14,000 households with oversampling of blacks, Hispanics, elderly people, low-income people, and people with functional limitations. Like NMCES, the NMES also included ancillary surveys of employer health insurance plans and medical providers listed by NMES households. Finally, the 1987 NMES included an institutionalized component, comprising 13,000 residents of nursing and personal care homes, psychiatric facilities, and facilities for the mentally retarded.
Yet the NMCES-NMCUES-NMES family of surveys does not satisfy the data needs for health care modeling: the sample sizes are small and, with the exception of NMES, cover just the household sector, excluding institutionalized people; the surveys have been conducted infrequently; and there have been long lags in making the data available for public use—data have just recently become available from the 1987 round.16 Other useful surveys have also been conducted infrequently. For example, the two most recent rounds of the National Nursing Home Survey were in 1977 and 1985; the next round is scheduled for 1991 or 1992.
Other surveys are limited in other ways. For example, the National Health Interview Survey is a large household survey that is conducted on an annual basis; however, it focuses on self-reported health conditions and use of health care services, and obtains only limited data on other variables that are needed for modeling such as income and health care expenditures.
Outdated information can present serious problems for health policy modeling, given rapid technological advances and changes in treatment that affect demand and costs. For example, the cost estimates originally developed by CBO for the prescription drug component of the 1988 Medicare Catastrophic Coverage Act, by using data from the 1977 NMCES and 1980 NMCUES, were much lower than those prepared after passage of the act by using data from the 1987 NMES (Congressional Budget Office, 1989c). Given the uncertainties in the original estimates, Congress mandated that CBO reestimate the costs of the prescription drug component when the 1987 data became available. Comparison of the 1980 and 1987 data showed that prescription drug usage by the elderly and the average price paid had increased much faster than CBO had originally projected. CBO's original estimates had projected $6 billion in government outlays for covering prescription drugs and $8 billion for insurance premiums
paid by the elderly for the period 1990-1994; the revised estimates were $12 billion for the government and $9 billion for the elderly.
In terms of administrative records, there is a plethora of sources, including records of Medicare claims and payments maintained by HCFA; records of Medicaid eligibility, claims, and payments maintained by the states; records maintained by private insurers; hospital admission and discharge records; and so on. There are a number of useful files from these sources, including the HCFA Continuous Medicare History Sample, which is a continuously updated 5 percent sample of all Medicare claims records, beginning with 1974. However, users face many problems in gaining access to administrative records, in working with their large volume, and in developing integrated files of needed data from multiple sources. The task is particularly daunting when state data are needed, as in modeling changes to the Medicaid program.17
There are a number of heartening indications of improvements in needed data for health care modeling. For example, HCFA is planning to inaugurate an ongoing Current Beneficiary Survey (CBS) of the Medicare population, which will consist of continuously updated samples of about 12,000 (mainly elderly) people, each of whom will be interviewed three times a year over several years. The CBS is intended to provide larger samples of the elderly than do most other surveys and to obtain additional information about beneficiary characteristics that the Medicare claims records cannot provide (Health Care Financing Administration, 1990).18 As another example, the National Center for Health Statistics is planning to coordinate its surveys of providers, including hospitals and nursing homes, and conduct them on a more frequent basis. As yet other examples, the March CPS, which added questions on health insurance coverage in 1980, added questions in 1988 to improve measures of health insurance coverage for children and for nonworking adults (such as retirees) covered by a former employer, while the Health Interview Survey recently added more detailed questions on household income.
Suggestions and Recommendations
There is clearly strong and growing concern about the need for improved models and data for evaluating alternative health care policies. And there has been visible progress on some fronts, notably in initiating programs to obtain better data sets for modeling and research purposes. At the same time, modeling efforts remain, for the most part, fragmented and limited in scope. There also appears to be a good deal of confusion about priorities and directions for health care policy modeling. This confusion is not surprising, given the wide range of health care policy issues and the diverse actors whose behavior must be taken into account. At present, various interagency coordination bodies are concerned with health data needs, including committees on data for the elderly and the disabled. However, there is no coordinating body that is focused specifically on data and research requirements for developing improved models for health care policy analysis.19 We do not pretend to have answers regarding priority directions for microsimulation modeling of health care areas. We do have some general views, and we propose an approach to help in the task of setting priorities.
We believe that microsimulation techniques have a useful role to play in answering many detailed questions about the effects of alternative health care policies. Yet we are also aware that microsimulation models can be costly and time consuming to develop and apply, particularly when first-round and even second-round behavioral responses have to be modeled in addition to direct effects. In our view, it is imperative for HHS to set up a department-wide coordinating and steering body to determine priorities both for microsimulation model development in the health care area and for needed data collection and research studies that will lead to improved models. This steering group should be established at a level that will give weight to its recommendations and should be led by an agency (or agencies) with a mission that is oriented primarily to policy analysis (rather than data collection or research). Given the sizeable needs of microsimulation models for data and research results, we believe that
the efforts of such a coordinating body will be fruitful for many other types of health care policy research and analysis as well.
Recommendation 8-1. We recommend that the U.S. Department of Health and Human Services establish a high-level, department-wide coordinating and steering body to set priorities for development of microsimulation models and related data collection and research needed for improved analysis of alternative government policies and programs for health care.
Overall, we believe that it would be unwise, at this juncture, for HHS to plunge forward with the development of large-scale microsimulation models for health care policies. There are too many data and research gaps to be filled in first. Moreover, the complexity of the modeling task in health care argues strongly that new model development (or significant expansion of existing models) should be based on the kinds of emerging innovative computer hardware and software technologies (described in Chapter 7) that promise to facilitate model validation and experimentation.
When the decision is made to proceed with model development, the general principles of model design and implementation presented in Chapter 6 will need to be rigorously applied. In particular, microsimulation models for health care policy should be developed with superior capabilities to facilitate validation. Long-term care and other health-related policy issues involve long time horizons, for which it is important to conduct extensive sensitivity analyses, as well as to prepare variance estimates, in order to establish a sense of the range of reasonable projections.20 Modifying physician or hospital reimbursement schemes, as well as many other health care benefit changes, can be expected to have first- and second-round behavioral effects, again necessitating careful and thorough validation.
For health care models, especially, model development strategies must not become overly grand. It is obviously important to have models that can take account of multiple actors and indirect as well as direct effects of policy
changes. However, we do not believe it is possible for any one model to achieve sufficient breadth and depth to serve the full range of health care policy analysis needs. As argued in Chapter 6, model developers must consciously trade off the scope of a model versus the detail of model components, and this principle is particularly important to apply for health care models. Clearly, the HHS coordinating body, and others involved in health care model development, must think broadly in conceptualizing the requirements for improved microsimulation models and must develop a comprehensive plan to guide their specifications for needed data and research. However, the actual modeling capabilities that ultimately emerge do not have to be—probably should not be—tied to a single model. It is likely to be more cost-effective to develop separate models for different sets of actors (such as patients or physicians), designed in such a way that they can interchange outputs and be used in a coordinated way.
Given, however, that investments in actual model development on a large scale are deferred until developments in data, research, and computing technology bear fruit, the question for the coordinating body becomes one of how to choose priorities for new research and collection of new or modified data. These tasks require careful deliberation, given their high costs in comparison with those for secondary data analysis.
There are many gains to be made from incremental strategies that seek to take advantage of existing data sources.21 For example, there is a wealth of administrative data on health care that could be linked with survey data. We understand that HCFA's plans for the Current Beneficiary Survey of the Medicare population include exact matches of the survey data with Medicare claims records. We urge that such linked data sets be made available for policy analysis and research purposes. We also note the utility of including questions for key variables, such as income, employment status, and employer firm size, in multiple surveys to facilitate relating them for modeling purposes.22 Recently, ASPE encouraged the National Center for Health Statistics (NCHS) to improve the income questions in the National Health Interview Survey so that those data can be more readily used in conjunction with other data sets. The Interagency Forum on Aging-Related Statistics has also drafted guidelines for including income items in surveys about the elderly (Income Working Group, 1990).
Although more frequent data collection entails costs, we believe it is important to consider collecting key data sets more frequently. For example, given the important role of private as well as public health insurance and rapid changes in medical technology and treatments that have major effects on health care usage and costs, we believe that conducting surveys of the NMCES-NMCUES-NMES type more frequently should be given serious consideration. Provider-based surveys, such as the National Nursing Home Survey, are also important to conduct on a reasonably frequent basis (as NCHS is currently planning to do).
Finally, just as a strategy of designing one grand health care policy model is not likely to be cost-effective, neither is a strategy of planning for one comprehensive health care survey, given the many different data needs and respondent burden. Hence, health care policy modelers will continue to confront the necessity to use multiple data sources and to relate them through the use of techniques such as imputation and statistical matching. Given this situation, we urge that HHS consider, periodically, sponsoring very comprehensive small-scale surveys that can be used to validate the quality of the imputations and matches performed by health care policy models (see Chollet, 1990, on this point).
It is clear that additional research is required to develop robust estimates of both direct and indirect effects of alternative health care policies. (Again, our discussion pertains to cost and coverage issues but not to the important topic of effects on health outcomes.) The question is where to start, given the range of actors and responses involved. Grannemann (1989) proposed several criteria for ranking research topics, including:
the estimated magnitude of the associated costs, in terms of the proportion of total health care costs accounted for by the actor whose behavior will be analyzed;
the estimated magnitude of the behavioral response, in terms of the proportion of medical care services that will be affected;
the extent to which new information is likely to improve prior assumptions about outcomes, that is, to reduce the uncertainty in the estimates of behavioral parameters; and
the congruence of the assumed behavioral effect with priority policy concerns.
Grannemann developed a matrix of scores on each of these dimensions for five types of actors—hospitals, physicians, nursing homes, insurers, and patients—and several dimensions for each actor (such as price and volume of
services for physicians, and usage, insurance purchase, and program participation for patients). For example, hospital services scored higher on the cost dimension than physician or nursing home services; patient use decisions and utilization review by insurers, which can apply to all services, scored even higher. On the dimension of magnitude of the impact, physician and hospital pricing scored high, while community-based alternatives to nursing homes scored low, reflecting the limited ability such programs have shown to affect patient and family choices regarding nursing home use. On the dimension of policy relevance, physician payment and utilization review (potentially resulting from effectiveness research) scored high, while hospital price responses, which would apply mainly to private-sector payers, scored low.
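A scoring matrix of the kind Grannemann constructed can be sketched in code. The actors, dimensions, numeric scores, and equal weighting below are illustrative assumptions for exposition, not Grannemann's actual values:

```python
# Illustrative Grannemann-style matrix: each (actor, dimension) pair is
# scored on four criteria; a simple aggregate ranks research priorities.
# All scores here are invented stand-ins (1 = low, 3 = high).
topics = {
    ("hospitals", "pricing"):           {"cost": 3, "response": 3, "uncertainty": 1, "relevance": 1},
    ("physicians", "pricing"):          {"cost": 2, "response": 3, "uncertainty": 2, "relevance": 3},
    ("physicians", "volume"):           {"cost": 2, "response": 2, "uncertainty": 3, "relevance": 3},
    ("insurers", "utilization review"): {"cost": 3, "response": 2, "uncertainty": 2, "relevance": 3},
    ("patients", "use decisions"):      {"cost": 3, "response": 2, "uncertainty": 2, "relevance": 2},
    ("nursing homes", "community alternatives"):
                                        {"cost": 1, "response": 1, "uncertainty": 2, "relevance": 2},
}

def priority(scores):
    """Aggregate the criterion scores with equal weights; an actual
    priority-setting exercise would debate the weights themselves."""
    return sum(scores.values())

# Rank topics from highest to lowest aggregate score.
ranked = sorted(topics, key=lambda t: priority(topics[t]), reverse=True)
for actor, dimension in ranked:
    print(f"{actor}: {dimension} -> {priority(topics[(actor, dimension)])}")
```

The value of such an exercise lies less in the arithmetic than in forcing explicit, comparable judgments on each criterion for each actor.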
Overall, Grannemann identified six areas as priority targets for behavioral research: physician pricing responses, insurer responses to utilization review, patient use decisions, patient program participation, determinants of volume of physician services, and hospital admission decisions. Of course, one need not agree with Grannemann's list of factors or his assigned scores. Nonetheless, the approach of laying out formal criteria for selecting priority areas for research on behavioral responses to health care policies has a good deal of merit, and we recommend it as a high-priority task for the HHS coordinating group charged with planning the development of improved microsimulation models for policy analysis related to health care.
RETIREMENT INCOME POLICIES
Provision of adequate retirement income has been a concern of the federal government going back at least to the days of the New Deal. Over the years, the policy debate in this area has addressed both the benefits and the financing of the federal social security system. On the benefit side, the debate has considered coverage or entitlement (e.g., coverage was extended to federal workers just a few years ago); the level of the benefit replacement ratio (i.e., the percentage of earnings prior to retirement replaced by social security benefits); the retirement age at which benefits begin; the extent to which earnings after retirement reduce benefits; and many other issues. On the financing side, the debate has considered the social security tax rate, the level of income subject to tax, and the extent of income taxation of benefits. The policy debate has also addressed issues of private pension coverage and benefits, and how the private pension system meshes (or does not mesh) with social security and civil service retirement.23
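The benefit replacement ratio mentioned above is a simple quotient; a minimal illustration, with hypothetical dollar amounts, follows:

```python
# The replacement ratio: the share of pre-retirement earnings replaced
# by social security benefits. The figures below are hypothetical.
def replacement_ratio(annual_benefit, pre_retirement_earnings):
    return annual_benefit / pre_retirement_earnings

# A worker who earned $30,000 before retiring and receives $12,000 a
# year in benefits has a 40 percent replacement ratio.
ratio = replacement_ratio(12_000, 30_000)
print(f"{ratio:.0%}")
```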
The stakes in this area are high. To cite some figures: in fiscal 1988, payments to retirees under the federal social security system totaled $157 billion, and federal civil service and military pensions totaled another $36 billion (Bureau of the Census, 1991:Table 588). The sum of payments under social security (to retirees, survivors, and the disabled) and other federal payments to retirees and the disabled totaled $271 billion in fiscal 1988 and had climbed to $306 billion in fiscal 1990—almost one-quarter of total federal outlays (Congressional Budget Office, 1991:150, 152).
Microsimulation Modeling for Retirement Policy
The Social Security Actuary has for many years maintained cell-based models to project the people expected to be eligible for social security benefits and those expected to pay social security taxes and to estimate the balances between expenditures and receipts in the social security trust fund.24 These projections are typically made for as many as 75 years into the future because of the interest in assessing likely changes in the trust fund balances due to such factors as increased or decreased fertility and labor force participation rates.
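The mechanics of such a cell-based projection can be suggested with a toy example. The growth rates, per capita tax and benefit amounts, and starting populations below are invented for illustration, and the age-sex cells of a real actuarial model are collapsed into single aggregates for brevity:

```python
# Toy cell-based projection in the spirit of the Actuary's models:
# carry forward counts of covered workers and beneficiaries and
# compute annual trust fund receipts, outlays, and balance.
# All rates and dollar figures are invented.
def project_trust_fund(years, workers, beneficiaries,
                       worker_growth=0.005, beneficiary_growth=0.015,
                       avg_tax=4_000.0, avg_benefit=9_000.0,
                       balance=0.0):
    """Project receipts, outlays, and fund balance year by year."""
    path = []
    for year in range(years):
        receipts = workers * avg_tax
        outlays = beneficiaries * avg_benefit
        balance += receipts - outlays
        path.append((year, receipts, outlays, balance))
        # Apply the assumed demographic growth rates to each "cell."
        workers *= 1 + worker_growth
        beneficiaries *= 1 + beneficiary_growth
    return path

# A 75-year horizon from stylized starting populations.
path = project_trust_fund(75, workers=130e6, beneficiaries=40e6)
_, _, _, final_balance = path[-1]
print(f"Final balance: ${final_balance / 1e9:,.0f} billion")
```

Because beneficiaries grow faster than workers in this sketch, the annual surplus shrinks over the horizon, which is exactly the kind of long-run imbalance the Actuary's projections are designed to reveal.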
Another largely cell-based retirement policy model is the Macroeconomic-Demographic Model (MDM), which combines a neoclassical macroeconomic growth model with cell-based models for population growth, family formation, labor markets, and pension benefits. The MDM model, which was developed by Lewin/ICF, Inc., for the President's Commission on Pension Policy and the National Institute on Aging, has been used to analyze the impact of the aging of the U.S. population on retirement income systems over the long term and the interactions of private pensions with social security. More recently, health expenditure modeling components were added to the MDM model (see Anderson, 1990).
Microsimulation models have also played an important role in evaluating alternative pension policies, particularly when the questions raised involve complex issues affecting particular population groups or the intersection of public and private pension systems. Because of the need to see how changes in retirement income policies interact with demographic and employment changes over the long term, simulations of pension programs have almost always used dynamic models that can project detailed individual and family histories over periods of 20-40 or more years. The two most heavily used models in this area are DYNASIM2 and PRISM.
The original DYNASIM model was completed in 1975, but it proved too complex for regular use. DYNASIM2 represents a revision, completed in the early 1980s, that was designed to be more streamlined and cost-effective. The redesign also focused DYNASIM2 on retirement issues, although the model has been used for other applications as well, such as an analysis of the implications of alternative rates of teenage pregnancy for government transfer program costs and an analysis of the demand for long-term care services over the period 1990-2030. The DYNASIM2 model has been applied in many different retirement-related analyses: for example, it was used to project the impact of earnings sharing proposals, which the Actuary's model could not handle.25 Agencies using DYNASIM2 for analysis of retirement income programs include the Congressional Budget Office, Department of Labor, and Department of Health and Human Services.
The PRISM model, which was first developed in 1980 to evaluate alternative national retirement income policies combining public and private pension coverage for the President's Commission on Pension Policy, has been used extensively since that time. It was developed specifically as a retirement income model and consequently does not have some of the features of DYNASIM2. For example, PRISM models childbirth, but it does not create records for newborns because they are not needed for retirement simulations. Likewise, PRISM does not include an educational attainment module, because it works with the population age 20 and older as of 1979, which has already completed sufficient education to support a simulation of the retirement age population through the year 2025. In the mid-1980s, PRISM was expanded to include a subsystem for modeling financing options for long-term care of the elderly (see Kennell and Sheils, 1986; Rivlin and Wiener, 1988).
Dynamic strategies for aging the initial database are at the heart of the DYNASIM2 and PRISM models. As noted in Chapter 6, dynamic aging is theoretically superior to static aging because it more fully maintains the covariances between age and other variables by applying transition probabilities for a large number of variables.26 In practice, dynamic aging is a much more complex process than static aging, and the quality of the resulting database depends heavily on the quality of the many transition probabilities that are used. (To keep the time paths for major variables in line with other accepted projections of these variables over time, dynamic models usually include a step to calibrate each year's results to assumptions used in other forecasts.) Although static aging is less complex, it is not likely to become a viable strategy for
modeling public and private pension systems because of the need for rich detail throughout the simulation period and not just at the end year. Detailed income, earnings, and employment histories are needed to evaluate alternative retirement income proposals, such as changing the replacement ratio or requiring portability of private pensions after a specified time period.27 Dynamic models also afford the capability to change assumptions about transition probabilities, as well as independent control totals, so that different scenarios can be played out over time.
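One year of dynamic aging with a calibration step can be sketched as follows. The states, transition probabilities, and employment control total below are invented; a real model conditions its probabilities on many more variables and calibrates many more outcomes:

```python
import random

# Sketch of one year of dynamic aging: each microrecord draws its next
# state from a transition probability, then record weights are
# calibrated so an aggregate matches an external control total.
# Probability of being employed next year, given current status
# (invented values; real models condition on age, sex, education, etc.).
P_EMPLOYED = {"employed": 0.92, "not_employed": 0.15}

def age_one_year(records, rng):
    """Advance each record one year and draw its new labor force state."""
    for r in records:
        r["age"] += 1
        p = P_EMPLOYED[r["status"]]
        r["status"] = "employed" if rng.random() < p else "not_employed"

def calibrate(records, target_employment_rate):
    """Rescale weights so the weighted employment rate matches an
    outside control (e.g., a macro model's employment forecast)."""
    total_w = sum(r["weight"] for r in records)
    emp_w = sum(r["weight"] for r in records if r["status"] == "employed")
    emp_factor = target_employment_rate * total_w / emp_w
    other_factor = (1 - target_employment_rate) * total_w / (total_w - emp_w)
    for r in records:
        r["weight"] *= emp_factor if r["status"] == "employed" else other_factor

rng = random.Random(1)  # fixed seed for a reproducible illustration
records = [{"age": 40, "status": "employed", "weight": 1.0} for _ in range(1000)]
age_one_year(records, rng)
calibrate(records, target_employment_rate=0.90)
```

Repeating these two steps year after year, with separate transition modules for fertility, marriage, disability, and so on, is the essence of how models like DYNASIM2 and PRISM build their longitudinal projections.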
The high degree of complexity of dynamic models means that they tend to be resource-intensive to develop, update, and apply. They are also subject to many sources of error because of their large numbers of parameters and, hence, difficult and resource-intensive to validate. Moreover, in applications that require long projection periods, as is the case for most retirement income issues, dynamic microsimulation models confront the inescapable problem that the quality of their projections deteriorates over time. Not only are errors in multiple sources likely to compound, but people are likely to change their behavior in ways that cannot be foreseen. DYNASIM2 and PRISM typically use aggregate population and economic growth assumptions from projections of cell-based or macroeconomic models to control their own projections, but, of course, errors inevitably affect the quality of the outside projections as well, particularly as the projection period lengthens.
Clearly, dynamic microsimulation models need to be reassessed periodically to determine if there are ways to take advantage of new computing technologies and alternative designs to enhance their cost-effectiveness and, even more important, to facilitate the task of validation and communication of uncertainty in the projections to decision makers. The kinds of validation studies that are needed include external validity studies that compare model estimates with measures of truth, sensitivity analyses, and studies to estimate variance from sampling error in the primary database and other sources (see Chapters 3 and 9). For a review of studies of the quality of the labor force projections from DYNASIM2, see Cohen (Chapter 7 in Volume II). Wertheimer et al. (1986) compared DYNASIM2 projections with those produced by the cell-based MDM. In that study, MDM outputs were used to control the DYNASIM2 projections, and the comparisons looked at variables not explicitly linked in the control process. Overgeneralizing from a very complex analysis, the models exhibited minor differences in projections of social security beneficiaries, somewhat greater differences in projections of social security benefits, and substantial
differences in projections of private pension recipients and benefits. However, few such studies have been done.
Suggestions and Recommendations
We have not reviewed in detail the components of the DYNASIM2 and PRISM models and therefore do not have specific recommendations for functions that may need improvement. Burtless (1989) comments that DYNASIM2 and PRISM simulate the effects of retirement policy changes on labor supply in terms of the effect on the decision to retire, but not in terms of the effects on work hours or earnings before or after retirement. For example, DYNASIM2 assumes that labor force participation ends once a worker retires. PRISM provides for partial retirement, but the level of pension income does not affect work hours or earnings. Burtless (1989) makes a more general observation that dynamic models, although they draw heavily on behavioral research for their transition probabilities, include very few feedback effects of behavioral changes in response to changes in government programs and policies. Because these kinds of effects may also be of policy interest, it could be useful to consider expanding the models' capabilities in this direction.
We see two fundamental requirements for dynamic models of retirement income policies—first, the need for linked survey and administrative data that are periodically updated to provide the initial database; and second, the need for continual modification from updated research results of the transition probabilities that are used to project the individual data records in terms of marital, childbearing, labor force, and other behaviors.
Models such as DYNASIM2 and PRISM rely on social security administrative records of earnings histories linked with the cross-sectional demographic and socioeconomic information in the March CPS as the foundation of their longitudinal databases for simulating retirement income policies. (DYNASIM2 uses the March 1973 CPS-SSA exact-match file; PRISM uses the March 1978 CPS-SSA exact-match file, which in turn, is matched with the March 1979 CPS and the May 1979 CPS.) Because confidentiality concerns and resource constraints have forestalled development and public release of any CPS-SSA exact-match files in the 1980s, models such as DYNASIM2 and PRISM are working with increasingly outdated initial databases. They have to simulate more and more years of historical data before they can even begin projecting their databases into the future.
We believe that it is vitally important for resources to be found to produce updated exact-match files of social security and survey data and for ways to
be devised to make such files available for modeling and analysis of retirement security options—a policy area that will be of continuing importance as the population ages. Performing matches with both the March CPS and the SIPP should be considered.28 SIPP has demonstrated excellent performance in obtaining social security numbers from respondents, and, indeed, the survey was designed with the expectation that administrative data would be used in conjunction with the interview data.29
SIPP also regularly includes needed data on investments in individual retirement accounts (IRAs) and private and public pension plan coverage. (The CPS has also periodically included pension supplements.) SIPP has also experimented with obtaining detailed information about employer-provided benefits, including pensions. This information is needed to simulate provisions such as vesting and comparisons of defined benefit and defined contribution plans. The SIPP experiment entailed getting a signed release from a sample of respondents to allow the Census Bureau to query their employers. The response rate for the releases was only 44 percent, but the rate for queried employers was very high, 92 percent (Jabine, King, and Petroni, 1990:40-41). Currently, DYNASIM2 imputes pension plan variables, and PRISM matches data from a sample of pension plan sponsors.
Ideally, matched survey and administrative data would be publicly available, as in the past. If legal constraints prohibit public release, other ways should be devised to provide research access to the data. A very limiting option would be to allow analysts to be sworn in as special census employees to use the data at the Census Bureau. A preferable option might be to release the data files to the policy analysis agencies under special security provisions and allow agency analysts and their contractors to use the data at a secure facility.
Recommendation 8-2. We recommend that the Census Bureau perform a new exact match of social security earnings histories with the March CPS as soon as possible. The Census Bureau should develop a program for periodically conducting matches of social security earnings histories with both the March CPS and SIPP records. Ways should be found to make the matched data files available for research and modeling use.
In addition to good initial databases, models that are used for simulating
retirement income policies need good estimates of a wide range of behaviors to use in projecting the database forward in time. For example, DYNASIM2 simulates changes in labor force status annually as a function of previous labor force status, age, race, sex, education, region, disability status, marital status, presence of children, and spouse's earnings by using 16 probit equations (for different demographic groups) estimated from pooled 1968-1981 data from the Panel Study of Income Dynamics.
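The probit form of such a transition equation can be illustrated simply. The coefficients and covariates below are invented for exposition and are far sparser than the pooled PSID specifications actually estimated:

```python
import math

# Sketch of a probit-style annual labor force transition. The
# coefficients are invented; DYNASIM2 fits separate equations for
# different demographic groups with many more covariates.
def norm_cdf(z):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Invented coefficients on a deliberately tiny set of covariates.
BETA = {"intercept": -0.2, "in_lf_last_year": 1.4, "age_over_60": -0.8}

def p_in_labor_force(in_lf_last_year, age):
    """Probability of labor force participation this year."""
    z = (BETA["intercept"]
         + BETA["in_lf_last_year"] * int(in_lf_last_year)
         + BETA["age_over_60"] * int(age > 60))
    return norm_cdf(z)

# A 45-year-old who worked last year has a high probability of working
# this year; a 62-year-old who did not has a much lower one.
p_stay = p_in_labor_force(True, 45)
p_enter = p_in_labor_force(False, 62)
```

In the simulation, each person's next state is then drawn by comparing such a probability against a uniform random number, record by record and year by year.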
The need for the kind of research that we recommend in Chapter 6 to narrow the range of estimates of behavioral parameters—such as the propensity to move in and out of the labor force—is clearly critical to the quality of dynamic models for retirement income policies. Also critical is that the parameters be reestimated periodically, to reflect any changes in relationships that may have occurred. Alternative specifications that take advantage of improved methodology should also be investigated periodically. Although DYNASIM2 and PRISM can implement updated coefficients for parameters such as fertility rates or female labor force participation rates, it is harder to implement alternative modules with different functional forms, such as alternative labor force participation equations. Yet many of the basic routines in these models are based on research that was conducted some years ago and, hence, may not reflect methodological advances, let alone behavioral changes, since then. Greater flexibility in this regard would also greatly enhance the ability to conduct sensitivity analyses of model components.
Of course, up-to-date and better-fitting estimates do not solve the problem that relationships may change in unforeseen ways during the course of a projection period.30 Detailed sensitivity analysis involving a number of socioeconomic and demographic assumptions for each policy alternative may be the best way to indicate the likely uncertainty due to a range of plausible scenarios.31 However, not having updated parameters available when it is known, or believed highly likely, that important behavioral relationships have altered—due to some combination of policy and population changes—places an extra burden on a model. As just one example, the retirement probabilities incorporated in the DYNASIM2 model were estimated by Burkhauser and Quinn (1981) from the Retirement History Survey for people who were working at the beginning of the survey in 1969. To the extent that program and population changes have altered the labor force distribution of people of this age subsequent to 1969,
the transition probabilities estimated by Burkhauser and Quinn may no longer apply (Burtless, 1989).32
Academic research of the type that can best support dynamic microsimulation modeling of retirement income and other policies depends critically on the availability of regularly updated information from panel surveys that provide repeated measurements on the same individuals over time. It is important that the federal statistical and policy agencies support panel surveys, which represent the most appropriate mechanism to provide the information needed to study individual responses to changing events and to calculate transition probabilities.33 We believe that the costs of continued investment in panel surveys will be more than repaid by the benefits from improved understanding of behavior. In turn, a better research knowledge base for modeling and other types of policy analysis should make it possible to improve the information available for public policy debate in many critical areas.
Microsimulation Modeling for Tax Policies
Our review of the use of microsimulation techniques for public policy analysis has focused thus far on assistance programs, that is, the expenditure side of the government ledger. However, the earliest, most continuous, and perhaps most widespread use of microsimulation modeling has been in the area of tax policy analysis, the revenue side of the ledger. The complexities of the federal individual income tax code, coupled with the diverse economic circumstances of the U.S. population, virtually require that tax policy analysis be conducted at the micro level.34
Historically, there has been a need to apply microsimulation techniques to calculate the revenue effects of proposed changes in tax policies, and to answer questions about the fairness of tax policies for different population subgroups. In the 1980s, the constraints of the growing federal budget deficit and the Gramm-Rudman-Hollings Act increased the demand for estimates of the impact of changes to the tax code. Typically, the only way to produce the needed revenue
and distributional estimates is by means of microsimulation models applied to detailed microlevel databases. Most recently, the debate that engulfed the Congress and the President in considering the fiscal 1991 federal budget—the debate over which kinds of taxes to raise (or lower) and by how much, and which income groups ought to bear the tax burden—required microsimulation modelers on the staff of the Joint Committee on Taxation to work virtually around the clock for days on end (Russakoff, 1990).
The tax policy microsimulation models that are in use today have a long history, beginning with work conducted in the early 1960s by researchers at the Brookings Institution. Joseph Pechman and his colleagues at Brookings, with the encouragement of analysts in the Treasury Department, developed microlevel databases using tax return information. Subsequently, they developed even more elaborate databases for tax policy modeling called MERGE files. The first MERGE file contained statistically matched records from the Statistics of Income (SOI) public-use sample of individual tax returns and the 1967 Survey of Economic Opportunity. Subsequent MERGE files matched the March CPS with tax return records (see Pechman, 1965, 1985; see also Minarik, 1980).
The Treasury Department in the mid-1960s brought tax modeling capability in-house, and the Office of Tax Analysis (OTA) still maintains a comprehensive tax policy microsimulation model that is based on a statistical match of the detailed SOI sample with the March CPS. The full OTA model, including both SOI and March CPS data, is updated every 2 or 3 years; OTA analysts also use more recent SOI data for tabulations and ad hoc simulations. OTA is currently running version 11 of the model, which includes data from the 1985 SOI and the 1986 March CPS (Cilke and Wyscarver, 1990). The next full OTA model, which will use the 1989 SOI and the March 1990 CPS, is scheduled for completion in 1992.
The Joint Committee on Taxation (JCT), which is responsible for preparing revenue estimates of all congressionally proposed tax law changes, initially relied on analyses conducted on its behalf by OTA. Subsequently, the JCT staff ran the OTA model on the Treasury Department's computer. In the mid-1970s, the JCT brought the OTA model in-house and, since then, has modified some of the aging and imputation routines.
Many other public and private agencies maintain tax policy microsimulation models, including more than half the states; a number of private research organizations concerned with the effects of federal and state taxation policies, such as the Brookings Institution and the National Bureau of Economic Research; several private economic consulting firms; and the governments of most Western nations. And, in addition to OTA and JCT, other U.S. agencies with tax models include:
the Congressional Budget Office, which has conducted several studies of the effects of tax policies on the distribution of family and personal income using microsimulation techniques (see, e.g., Kasten and Sammartino, 1990);
ASPE, which invested substantial resources in the early and mid-1980s to enhance the federal and state income tax modules in TRIM2 (see Weinberg, 1987, for an analysis of the distributional effects of the 1986 Tax Reform Act based on running the TRIM2 model) and also recently made extensive use of the TRIM2 federal tax module to simulate proposed changes to the child care tax credit; and
the Census Bureau, which uses its model to estimate the distribution of after-tax income rather than for policy analysis purposes (see Bureau of the Census, 1988).
Although U.S. tax policy models share a number of features with income support models, they also have several distinguishing characteristics. Currently used tax policy models fall into the class of cross-sectional microsimulation models that simulate policy effects on a population database at a given time and use static aging techniques to project the database forward in time.35 In general, the models are very elaborate tax liability calculators that incorporate very few explicit behavioral responses. Most models assume that taxpayers will choose whether or not to itemize deductions on the basis of which alternative reduces their tax liability, but other types of behavioral response, to the extent that they are considered, are usually handled outside the model.36
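The itemization choice that the models do simulate can be sketched in a few lines. The following is an illustrative calculator (the bracket amounts, rates, and function names are our own invented examples, not the actual code of any agency model): each filing unit is assigned whichever deduction route minimizes its liability.

```python
def tax_liability(taxable_income, rate_schedule):
    """Apply a graduated rate schedule (brackets here are hypothetical)."""
    tax, lower = 0.0, 0.0
    for upper, rate in rate_schedule:
        if taxable_income <= lower:
            break
        tax += (min(taxable_income, upper) - lower) * rate
        lower = upper
    return tax

def simulate_unit(agi, itemized_deductions, standard_deduction, rate_schedule):
    """Itemize only if doing so yields the lower liability."""
    tax_itemizing = tax_liability(max(agi - itemized_deductions, 0.0), rate_schedule)
    tax_standard = tax_liability(max(agi - standard_deduction, 0.0), rate_schedule)
    return min(tax_itemizing, tax_standard)

# Hypothetical two-bracket schedule: 15 percent up to $30,000, 28 percent above.
SCHEDULE = [(30_000, 0.15), (float("inf"), 0.28)]
```

A unit with large deductible expenses is thus simulated as an itemizer, while the same unit under a proposal that raises the standard deduction may switch routes, which is precisely the response the models capture.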
One distinguishing feature of the tax models is their very heavy reliance on information from administrative records, specifically, from the SOI samples of tax returns. Each year, the Statistics of Income Division of the Internal Revenue Service (IRS) obtains a random, stratified sample of individual income tax returns, which it processes to produce published statistical series and analyses (see Coleman, 1988). (The division also provides tabulations and analyses of corporate tax returns, partnership returns, and estate and gift tax returns, as well as other specialized tabulations based on IRS records.) In odd years, the division obtains a much larger sample of individual returns and also keys in additional variables. It is these biennial detailed files that are most heavily used in tax modeling and tax policy research.
Tax models are also distinguished from income support models, at least in degree, if not in kind, by their very heavy reliance on imputations, statistical
matching, and complex static aging techniques. More extensive use of imputation and aging techniques is needed for tax modeling because the SOI data generally become available with a lag of 2-3 years and lack information, such as family characteristics, that is important for evaluating alternative tax proposals (see further discussion, below). The March CPS provides some of the missing information and is available on a more timely basis, but it is not suited to stand on its own as a tax policy analysis database. It lacks information on deductions, some components of income (such as capital gains), and other variables (such as net operating loss carryovers) that are critical to calculating tax liability. The March CPS also "topcodes" income amounts for high-income people (the top income category shown is $100,000 or more) and has other problems from a tax modeling perspective.
One can divide the family of U.S. tax models into at least three main categories on the basis of their principal purpose and data source. The first category includes the very detailed tax model used by the OTA and the JCT. The primary purpose of this model is to prepare revenue estimates for proposed legislative changes to the federal individual income tax. The model is also frequently used to examine the distributional effects of tax law changes. The OTA/JCT model is built on information from the detailed SOI files that have not been modified for public use, with additional variables imputed or matched from the March CPS and other sources.
The second category includes tax models that start with the public-use SOI files and impute or match additional information from household surveys such as the March CPS. The primary purpose of these models is to examine broad issues surrounding the distributional effects of government tax policies. The models maintained by the Brookings Institution and the National Bureau of Economic Research are in this category.
The third category includes tax simulation routines that are embedded in models of income support programs and that, consequently, start with a March CPS database to which information from the public-use SOI files is imputed. Models such as TRIM2 fall into this category. The tax routines of this class of models are also typically used to study broad distributional effects of tax policy changes, sometimes in conjunction with the transfer program routines.
Because OTA and JCT have a need for, and also have access to, the detailed SOI files, the OTA/JCT model is more elaborate than any of the other tax policy simulation models. The public-use SOI files differ from the detailed files in several respects (see Statistics of Income Division, no date): (1) names, addresses, and social security numbers are deleted from all records; (2) records sampled at 100 percent from the full IRS file of tax returns (returns with very high incomes or very high losses) are subsampled at a 33 percent rate; (3) state codes and other geographic indicators are deleted from records with adjusted gross income of $200,000 or more and, in addition, other codes (such as exemptions and alimony paid or received) are removed or modified for these records; (4) wage and salary income, state and local income tax deductions, and real estate tax deductions are blurred for high-income records; (5) state and local tax deductions and alimony paid or received are blurred for all other records; and (6) many detailed items reported by small sets of taxpayers are deleted (e.g., preferences under the minimum tax).
Developing a Tax Model Database: The Case of OTA
To illustrate the array of problems and the extensive procedures involved in developing a suitable database for tax modeling, particularly at the level of detail required for estimating federal revenues, we describe the process of generating the database for version 11 of the OTA model (see Cilke and Wyscarver, 1990). This version of the model implemented several major improvements over earlier versions, including new aging routines. Converting the model from a UNIVAC mainframe to a VAX minicomputer provided greater computational power and speed that made possible enhancements to the model's capabilities.
To develop version 11, OTA began with the 1985 SOI sample, containing about 121,000 individual income tax returns, each of which had more than 1,000 data fields. OTA deleted unnecessary fields, created necessary recodes, packed the data to minimize computer storage requirements, and verified that the taxpayer's liability calculated by using the OTA model agreed with the liability reported in the SOI file. For the 5 percent of returns for which the reported and simulated tax liabilities did not agree, OTA examined the records and edited one or more variables as needed.
OTA next imputed additional variables. For example, because not all taxpayers itemize deductions, but some proposed changes in tax law might induce some current nonitemizers to itemize, OTA imputes deductible expenses for nonitemizing taxpayers. (Such expenses include charitable contributions, real estate taxes, home mortgage and other categories of interest, medical expenditures, and others.) Imputations were performed in some cases from regression equations estimated from the Survey of Consumer Finances and, in other cases, from probabilities estimated for itemizers in the SOI file itself.
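A regression-based imputation of this kind can be sketched as follows (a deliberately simplified illustration with a single predictor; OTA's actual equations, estimated from the Survey of Consumer Finances, are more elaborate, and the field names are our own). Fitting on itemizers and adding a randomly drawn residual to each prediction preserves some of the variability of the observed deductions:

```python
import random

def fit_simple_regression(xs, ys):
    """Ordinary least squares for a single predictor (e.g., deductions on income)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def impute_deductions(itemizers, nonitemizers, rng=None):
    """Fit the equation on itemizers, then predict for each nonitemizer,
    adding a residual drawn at random from the fitted residuals so that
    the imputed values retain some of the observed variability."""
    rng = rng or random.Random(0)
    xs = [r["income"] for r in itemizers]
    ys = [r["deductions"] for r in itemizers]
    slope, intercept = fit_simple_regression(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    for r in nonitemizers:
        prediction = intercept + slope * r["income"] + rng.choice(residuals)
        r["deductions"] = max(prediction, 0.0)  # deductions cannot be negative
    return nonitemizers
```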
As another example, OTA imputed the earnings attributable to husbands and wives, for married couples filing jointly, based on tax return information provided in support of the two-earner deduction for working couples. Such information is needed to assess tax provisions that impose a so-called marriage penalty and to support other analyses, but it was not available for many working couples except for the 1982-1986 period when the two-earner deduction was in effect. OTA then exactly matched a file containing each taxpayer's month and year of birth (from social security records) with the SOI tax return sample.
The next step was to reduce the size of the SOI sample in order to improve processing efficiency. This procedure involved an optimization model designed
to select about 89,000 observations from the full file of 121,000 tax returns in such a manner as to minimize the total loss of information, as measured by an explicit loss function.
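Although we do not have OTA's loss function, the basic idea of shrinking the file while preserving weighted totals can be illustrated with a much simpler stand-in that subsamples within sampling strata and reinflates the surviving weights:

```python
import random

def subsample_preserving_totals(records, keep_rate, rng=None):
    """Keep a random subset within each sampling stratum and inflate the
    surviving weights so each stratum's weighted total is unchanged.
    (A stand-in for OTA's explicit loss-minimizing optimization.)"""
    rng = rng or random.Random(1)
    by_stratum = {}
    for r in records:
        by_stratum.setdefault(r["stratum"], []).append(r)
    kept = []
    for recs in by_stratum.values():
        n_keep = max(1, round(len(recs) * keep_rate))
        chosen = rng.sample(recs, n_keep)
        inflation = (sum(r["weight"] for r in recs)
                     / sum(r["weight"] for r in chosen))
        kept.extend({**r, "weight": r["weight"] * inflation} for r in chosen)
    return kept
```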
Yet (Cilke and Wyscarver, 1990:2-8),
In spite of vast improvements…to the individual tax model over the years, one shortcoming still remains: the data base relies almost entirely on tax return data. Analyzing proposals that could radically alter the tax base, the tax unit, and the tax rates requires information that is not tied to a particular tax law or limited to what is reported on tax returns filed under that tax regime.
Additional information that OTA needed includes: (1) sources of income and types of expenditures not subject to taxation under current law; (2) links between taxpayers and family or household units, so that examination of tax burdens can be based on logical economic units; (3) information on people whose incomes are too low to require filing a return under current law but who might, under some proposals such as a consumption tax scheme, become tax filers; and (4) information necessary to construct a comprehensive income measure, such as consumption plus change in net worth, that permits an analysis of the distributional effects of tax reform proposals for a broader concept than that of taxable income.
OTA's approach to obtaining much of the additional information needed for version 11 of its tax model was to carry out a statistical match of the March 1986 CPS and SOI records. The first step in this process was to apply tax filing unit provisions to each CPS person aged 16 and over in order to convert CPS households into one or more potential tax filing units or potential nonfiling units with low incomes. Corrections for underreporting of certain forms of income in the March CPS were also made by using data from the National Income and Product Accounts and other sources. The SOI file of taxpayers age 16 and older and the March CPS file of potential filers were then statistically matched on the basis of a common set of core variables, including marital status, age, number of children, housing tenure (owner or renter), and the presence or absence and amount of various sources of income. The CPS nonfilers were then appended to the matched file, and the latter file was sorted by CPS family number. Finally, taxpayers under age 16 from the SOI file were appended to families judged to have appropriate income and demographic characteristics. The resulting file size expanded to about 213,000 records, given the addition of nonfilers from the March CPS, taxpayers under age 16 from the SOI, and the need to match some CPS records to more than one SOI record and vice versa in order to preserve the properties of each of the two samples. In particular, the March CPS sample is very thin in the number of cases of high-income tax filing units in comparison with the SOI.
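The core of such a statistical match can be sketched as follows (an unconstrained nearest-neighbor match, much simpler than OTA's actual procedure, and with invented field names). Records are first grouped into cells on the categorical core variables, and each SOI record then takes the closest CPS donor in its cell:

```python
def statistical_match(soi_records, cps_records, cell_keys, distance_keys):
    """Group CPS donors into cells on categorical core variables; each SOI
    record takes the nearest donor in its cell by squared distance on the
    continuous variables. SOI values win any conflict in the merged record."""
    cells = {}
    for c in cps_records:
        cells.setdefault(tuple(c[k] for k in cell_keys), []).append(c)
    matched = []
    for s in soi_records:
        # Fall back to the whole donor pool if the cell is empty.
        candidates = cells.get(tuple(s[k] for k in cell_keys), cps_records)
        donor = min(candidates,
                    key=lambda c: sum((s[k] - c[k]) ** 2 for k in distance_keys))
        matched.append({**donor, **s})
    return matched
```

A constrained match, by contrast, would limit how often each donor can be used so that the weighted distributions of both samples are preserved; that refinement is one reason CPS records must sometimes be matched to more than one SOI record and vice versa.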
The last stage in the process of creating the OTA tax model database
involved several series of imputations. One series was undertaken to provide information needed to simulate the 1986 Tax Reform Act (TRA). Imputations required for this purpose included tax-exempt interest, health insurance expenses for self-employed people, meal and entertainment expenses, adjustments to pension income, contributions to 401(k) retirement plans, net operating losses carried over to future tax years, employer contributions to pension plans, and active and passive income by source. A variety of data sources were used to develop the TRA imputations. A new set of parameters to simulate the TRA provisions also had to be incorporated into the tax calculator portion of the model.
Another set of imputations was performed to generate information needed to simulate tax liabilities for catastrophic health insurance coverage under Medicare. These imputations illustrate the additional database enhancement that is frequently required to model new provisions of the tax code. Required imputations in this instance included Medicare eligibility for the tax unit head and spouse, employer contributions to Medigap insurance, reduced medical expense deductions because of provided catastrophic health insurance, pension income by source for the unit head and spouse, and social security income for the unit head and spouse. Repeal of the catastrophic health insurance program in 1989 obviated the need for such information at the present time, although Congress may in the future consider a similar type of plan.
Finally, a series of imputations was undertaken to provide information to construct the broad concept of family economic income used by OTA in analyzing the distributional effects of proposed tax law changes on income classes of the population. For example, untaxed income sources (such as AFDC and SSI), employer-provided benefits, and exclusions from taxable income (such as 401(k) contributions) had to be imputed and added to adjusted gross income for filers and added to estimated adjusted gross income for nonfilers.
The result of all of these operations was a greatly enhanced database for modeling tax policies as of 1985. However, because OTA must develop projections for the current year and a minimum of 5 years into the future, it has developed and regularly employs static aging procedures. The aging procedure for version 11 of the OTA model, which reflects a major effort to improve this part of the model, generates a new database for each year from 1986 through 1995 and is implemented in two stages (see Gillette, 1989).37 The procedure is performed each time a simulation is carried out and is under the control of the user, who can select one of three aging alternatives developed by OTA to represent different expected rates of economic growth.
The first stage in the aging process applies growth factors to each dollar amount in the database to reflect actual and projected per capita real growth
and inflation. The second stage adjusts the weights of each family head in the file to hit aggregate targets for 33 different variables, such as adjusted gross income by income class and type of return, for which the targets are chosen to be consistent with forecasts of national income, population, and inflation. The weight adjustment is carried out in a series of iterations designed to hit the targets while minimizing a loss function.
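The second stage resembles the raking (iterative proportional fitting) procedures familiar from survey weighting. The sketch below illustrates the idea with categorical margins and simple proportional scaling rather than OTA's explicit loss-minimizing adjustment over 33 targets:

```python
def rake_weights(records, margins, max_iter=50, tol=1e-9):
    """Iteratively scale weights, one margin category at a time, until the
    weighted counts hit every target. `margins` maps a dimension name
    (e.g., "income_class") to {category: target weighted count}."""
    for _ in range(max_iter):
        worst = 0.0
        for dim, targets in margins.items():
            for category, target in targets.items():
                group = [r for r in records if r[dim] == category]
                factor = target / sum(r["weight"] for r in group)
                worst = max(worst, abs(factor - 1.0))
                for r in group:
                    r["weight"] *= factor
        if worst < tol:  # all targets hit to within tolerance
            break
    return records
```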
Finally, OTA has included in the aging procedures an optional special adjustment to simulate the behavioral response of taxpayers to the increase in the top tax rate on long-term capital gains resulting from the 1986 Tax Reform Act. The simulation changes the level of realizations of long-term capital gains as a function of the difference in marginal tax rates before and after tax reform and the prereform level of realized gains. Other behavioral responses to increased tax rates on capital gains, such as changing holding periods, shifting mixes of capital assets, and converting ordinary income to (or from) capital gains, are generally estimated outside the model.
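A constant-elasticity version of such a realizations response can be written in one line (the functional form and the elasticity value below are hypothetical placeholders, not estimates from the OTA model):

```python
def adjust_realizations(gains_pre, rate_pre, rate_post, eta=0.5):
    """Scale prereform realizations of long-term gains by a constant-elasticity
    response to the change in the net-of-tax share; `eta` is a hypothetical
    elasticity, not an OTA estimate. A higher post-reform rate shrinks the
    net-of-tax share and thus reduces simulated realizations."""
    return gains_pre * ((1.0 - rate_post) / (1.0 - rate_pre)) ** eta
```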
Issues in Modeling Tax Policies
Clearly, the use of SOI tax return information in tax policy microsimulation models is beneficial and, indeed, essential for detailed simulation of the revenue consequences of proposed changes to the tax code. The SOI files provide documented sets of income amounts, deductions, and other tax-related variables for large samples of actual filing units and thereby portray all of the detail of the current tax code. Just as clearly, however, the SOI files are inadequate in many ways. They lack variables necessary to characterize fully the socioeconomic status of households and families for use in analyses of the distributional effects of tax policy changes. They also lack variables needed to simulate proposed tax reforms that broaden the base of taxable income, expand the types of allowable deductions and credits, change the definition of a tax filing unit, or in other ways significantly alter the tax code. Users of the SOI data alone for tax policy modeling are at the mercy of changes in the tax laws, which not only may require new imputations but may also add or delete useful information. For example, the information required to support two-earner deductions for working couples from 1982 through 1986 provided a good basis for determining the share of wages attributable to each spouse; however, this information is no longer available.38
To obtain the range of needed variables, tax policy modelers must confront a heavy task of extensive data creation using imputation and matching techniques. Imputations for missing variables are also required for simulations of current and proposed income support programs, but not to the degree that is usually required for detailed simulation of proposed changes to the tax code.
A factor that complicates the task of creating a tax policy simulation database is that tax return data differ in important ways from the household survey data in the March CPS and other files that are often used in matching and imputation. For example, there is a difference between tax filing units and households or families. Data on filing units are required to estimate tax liabilities, but data on households are most often required for distributional analyses. Recent changes in the tax law that require social security numbers for most dependents should make it easier to reconstruct family units from tax returns when dependents file separately and, thereby, improve the match with household survey records. However, the tax return data cannot, by definition, provide information on families that are not currently required to file tax returns. Survey data are needed to supply information about nonfilers, and one or more potential tax filing units must be constructed for these cases.
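Constructing potential filing units from a household roster might be sketched as follows (a bare-bones illustration with invented field names; the actual conversion applies the full filing-unit provisions of the code, including dependency tests and head-of-household rules):

```python
def build_filing_units(household):
    """Spouses (linked by spouse_id) form one joint unit; every other person
    aged 16 and over forms a single potential unit; children under 16 are
    attached as dependents of the first unit. A simplified sketch only."""
    adults = [p for p in household if p["age"] >= 16]
    children = [p for p in household if p["age"] < 16]
    units, assigned = [], set()
    for p in adults:
        if id(p) in assigned:
            continue
        spouse = next((q for q in adults if q is not p
                       and p.get("spouse_id") == q.get("person_id")), None)
        members = [p] + ([spouse] if spouse else [])
        assigned.update(id(m) for m in members)
        units.append({"filers": members, "dependents": []})
    if units:
        units[0]["dependents"] = children
    return units
```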
Another important difference that affects the quality of matches of household survey data with tax return files is the imbalance in the distribution of records by income class in the two samples. Household surveys such as the March CPS contain many more low-income and many fewer high-income records in comparison with the SOI data. Moreover, survey data for high-income people are typically top coded to protect confidentiality. These differences constrain the ability of the matching algorithm to find good matches.
Still another important difference between SOI and household survey data concerns underreporting (and misreporting) of income. Underreporting affects both types of data, but the opportunities to underreport (or misreport) differ for different types of income and, thus, differentially affect tax returns and household surveys. For example, W-2 forms must support reports of wages and salaries to the IRS, but this is not the case for household surveys.39 The growing use of 1099 forms by providers of interest, dividends, and other types of nonwage income presumably encourages more complete reporting to the IRS, while these kinds of income sources are of notoriously poor quality in surveys. However, reports of income from odd jobs and cash transactions may be better, or at least no worse, in surveys than on tax returns.
As noted above, OTA, as part of creating its tax model database, adjusts the March CPS income amounts to national targets. One could also use tax compliance data to correct the SOI figures for the entire population of filers. However, if one wants to model income tax collections, one should only implement corrections for the small number of people who are actually audited. In any case, the differences in mechanisms underlying income reporting in administrative and survey data bear examination to determine their effects on the quality of the resulting merged database.
One of the original goals of SIPP was to support improved modeling of tax as well as transfer policies. Each SIPP panel to date has annually included a module that obtains information about tax-related variables, including yearly income amounts from employment and assets, type of tax return filed, number and type of exemptions, type and amount of deductions and credits, and taxes paid. Clearly, such information, together with the improved reporting of income in SIPP, would be valuable for tax policy models that currently start with the March CPS and for models, such as OTA's, that match CPS with SOI data. However, small sample sizes and confidentiality restrictions have severely limited the utility of the SIPP tax information to date. Only a few variables from the tax module are included in files that are available for research and policy analysis use (no information on amounts is made available), and these files are obtainable only under special access arrangements.
One way to assess and improve the quality of tax modeling databases would be to conduct exact matches, based on social security numbers, of household survey records from the March CPS or SIPP with tax return data from the full IRS files. Confidentiality restrictions may well preclude access to such a matched file, but the Statistics of Income Division and the Census Bureau could explore the possibility of conducting exact matches for use solely in evaluating and enhancing the quality of the statistical matches and imputations that are currently done to construct tax policy analysis databases.
Confidentiality concerns generally are a major and growing problem for access to tax return data for research and analysis purposes. More and more data on the public-use versions of the SOI files are being blurred. For example, deductions for state taxes are blurred because some states disclose information such as the amount of property taxes paid. (No state codes are provided at all for returns with adjusted gross income of $200,000 or more.)
Even if access problems are solved, tax modeling still requires substantial data generation. For example, to support comprehensive modeling of the incidence of taxes, one needs to impute consumption so that one can model excise and sales taxes. One also needs to impute all data for nonfilers. These are big imputation jobs. Much more attention needs to be paid to imputation procedures and their quality, particularly for models that try to assess tax incidence broadly. More sensitivity analysis of imputation techniques is needed. OTA has commissioned some work on this topic (see, e.g., Kadane, 1978), but more is needed, for example, on the impact of imputations on joint distributions of variables.
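The concern about joint distributions can be made concrete with a small diagnostic: imputing a variable by independent draws from its own marginal distribution preserves that marginal but destroys its covariance with other variables. (The helper functions and data below are our own illustration, not an agency procedure.)

```python
import random

def correlation(xs, ys):
    """Pearson correlation, computed directly."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def marginal_only_impute(values, rng=None):
    """Draw each imputed value independently from the variable's own
    marginal distribution: the marginal survives, covariances do not."""
    rng = rng or random.Random(42)
    return [rng.choice(values) for _ in values]
```

Running this diagnostic on an imputed database, comparing correlations among imputed and observed variables against an external benchmark, is one simple form of the sensitivity analysis we advocate.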
The need for updating and projecting the information required for tax policy modeling further increases the difficulties of data generation and adds uncertainties, which could be substantial, to the estimates. Just as imputation procedures generally need much more extensive evaluation, so do the procedures used by tax models to age their databases.
Finally, the issue of behavioral response is serious for tax models because
behavioral effects of tax policy changes affect almost the entire population. One is not dealing with a subset of the population, such as welfare recipients, for which behavioral effects may be relatively unimportant. Changes in marginal tax rates undoubtedly affect behavior across the population and also have important second-round effects. To model behavioral responses in a satisfactory manner, one would need to link tax models with other models of wage rate and factor price determination. Data from panel surveys that follow people over time are also needed for development of good estimates of behavioral responses to such changes as the tax treatment of capital gains. Another important area of research concerns ways to narrow the range of parameter estimates and address other technical considerations in developing and implementing behavioral response functions in microsimulation models. An added issue to consider concerns presentation of estimates of first-round and second-round effects of tax policy changes. One can make a real case for using models for short-term estimates and then discussing possible longer-run behavioral responses in qualitative terms. This was the approach followed by the Treasury Department in the 1984-1985 tax reform debate leading to the 1986 Tax Reform Act.
Given that our review concentrated principally on models of government expenditure programs in the social welfare area, we do not offer explicit recommendations for tax policy models. However, we believe that there are at least three priority needs for attention. One important need is to address confidentiality and data access issues for tax return information and, specifically, to develop ways to take advantage of exact matches of survey and administrative records information to improve the quality of tax model databases. The second important need is to improve the capability for estimating the effects of behavioral responses to tax policy changes, whether these estimates are developed inside or outside the tax policy simulation models themselves.
The third, and perhaps the most important, need is to conduct sensitivity analyses and other validation studies of the extensive array of imputations and matches that are performed in creating databases for tax modeling and of the procedures used to age these databases. We are concerned that the imputation and aging procedures, regardless of how much care is taken in implementing them, may distort important covariances for individual filing units. Under the current methodology, each imputation is performed individually to meet aggregate control totals and other known information about the distribution of the item by income class. When several imputations are done in this fashion, however, the effect may be to cause variations among individual units within income classes that reflect the imputations more than the real world. The imputation procedures may not have adverse effects on the quality of comparisons involving vertical equity, that is, assessment of the effects of tax policy changes on different income classes; however, they may well adversely affect the quality of comparisons involving horizontal equity, that is,
assessment of the impact of tax policy changes on taxpayers in similar economic circumstances.
These needs are important not only for microsimulation models of tax policies, but also for models in other policy areas. We encourage a broad range of policy analysis agencies to work together to make the needed improvements in microsimulation models.