8
Microsimulation Modeling of Health Care, Retirement Income, and Tax Policies

Our discussion to this point has focused largely on the problems of modeling income support programs such as AFDC and food stamps. Microsimulation techniques are also appropriate and have been used extensively in other social welfare policy areas. Many of the issues that we raise with regard to data inputs, model design, and computing technology are common across policy areas, although each area presents some unique features and problems. In this chapter we briefly discuss special issues in microsimulation modeling of health care, retirement income, and tax policies. Because we did not review the models and data in these areas in as great depth as those in the income support area, we make few specific recommendations; instead, we raise issues that we believe are particularly important to address.

One general question that arises is the relative weight to give to investments in microsimulation models for different policy areas. Because we do not pretend to have any particular expertise in foreseeing the future mix of policy issues, we cannot offer unequivocal advice on this question. We note, however, that health care policy is an area of growing importance because of the escalating costs of providing health services and the evidence of glaring gaps in the health care system, such as the large population not covered by private or public health insurance. Moreover, as we indicate below, available data, research, and models for health care policy analysis exhibit many inadequacies relative to the information needs.

Yet we believe it would be unwise to concentrate investment resources on any one set of issues. Welfare policy provides a cautionary example in this regard. After the collapse of the Carter administration's push for the Better Jobs



Copyright © National Academy of Sciences. All rights reserved.




Improving Information for Social Policy Decisions: The Uses of Microsimulation Modeling, Volume I - Review and Recommendations

and Income Program in 1977-1978 and the subsequent focus of the Reagan administration on restricting welfare benefits, one might well have concluded that capabilities for modeling significant welfare reform initiatives would be of little importance. That conclusion would have been wrong, as evidenced in our review of the policy debate that led to the Family Support Act (FSA) of 1988 (see Chapter 2). Indeed, failure to invest in improvements to models and data handicapped the ability of policy analysts to use microsimulation techniques to develop estimates for many of the proposed welfare policy changes considered in the FSA debate.

We do not expect that the FSA represents the last word in welfare policy, either. Hence, we see a continued need to scrutinize income support models and data and to determine ways in which they can be improved. Similarly, we see the need to scrutinize models and data for retirement income and tax policies, as well as for health care policies.

Indeed, the problem is not so much to pick policy areas for investment (all of them are and will continue to be important) but to discern particular aspects of each broad area that are likely to assume priority in the policy debate. For example, the need in the FSA debate was for models that could link the AFDC program with new initiatives such as child support enforcement, job training, and transitional assistance programs, tasks for which the existing models and data were not well suited. Although there are many sources of information that can help agencies anticipate future policy proposals, there is no crystal ball that will furnish them with infallible forecasts for guiding their investments in policy analysis tools.
The difficulties of predicting the policy agenda underscore the importance of investments that are aimed at improving the overall capabilities of microsimulation models (and other policy analysis tools) for flexible, timely, and cost-effective responses to changing policy concerns. To achieve this goal, whether for income support, health care, or any other policy area, databases need to be broad in scope, models need to follow good design principles and practices, and agencies need to find ways to further fruitful interactions between policy research and modeling.

HEALTH CARE POLICIES

Some of the reasons that health care policy issues are of continuing and increasing concern to decision makers are evident from the following selected indicators:

Total public and private spending for health care in the United States, which currently amounts to more than $600 billion, increased from 7.3 percent of the gross national product in 1970 to 11.6 percent in 1989. Over this same period, the proportionate share of national health care costs assumed by the public sector increased from 33 to 40 percent; the costs of the federal Medicare program rose by 300 percent (in real terms) to $102 billion, and the costs of
federal-state public assistance programs for health care (chiefly, Medicaid) rose by 215 percent (in real terms) to $67 billion.1

While the consumer price index (CPI) increased overall by 200 percent from 1970 to 1988, the medical care component of the CPI increased by 300 percent over the same period (Bureau of the Census, 1990b:Table 762).

In the fourth quarter of 1988, an estimated 31.5 million people, or 13 percent of the total population, were not covered by health insurance of any kind, either private insurance, Medicare, or Medicaid (Nelson and Short, 1990:3).

Over the period 1984-1989, total diagnosed AIDS cases increased from 4,000 to 78,000; federal spending for AIDS rose from $0.06 to $1.3 billion; and state spending for AIDS rose from $0.01 to $0.2 billion (Bureau of the Census, 1990b:Table 188).

A recent microsimulation study projected that, over the next 30 years (1986-1990 to 2016-2020), the elderly (people age 65 and older) will increase from 31 to 50 million; those elderly receiving long-term nursing home or home care services will increase from 6.3 to 10.4 percent of the total; total public and private expenditures for nursing home services will increase by 197 percent, to $98 billion; and total expenditures for home health care services will increase by 154 percent, to $22 billion (Rivlin and Wiener, 1988:10-11) (all dollar amounts in constant 1987 dollars).

These indicators and others underscore the policy interest in the health care area.2 One pressing set of issues revolves around how to manage and contain what appear to be runaway costs for medical services.
These costs are driven by many factors, ranging from the development of expensive new medical technology and treatments, to demographic and socioeconomic changes in the population, to the demand incentives for medical care that result from public and private health insurance programs. Another equally pressing set of issues revolves around how to ensure that people who need health care services will have access to them at a reasonable cost.

1 All figures are from Bureau of the Census (1991:Table 136). Total national health care expenditures include costs of medical research and construction of medical facilities, in addition to direct health care costs. Percentage increases in real terms were determined by using the GNP implicit price deflator for government purchases of goods and services (calculated from Bureau of the Census, 1991:Table 767).

2 The discussion in this section benefited greatly from papers prepared for the panel by Chollet (1990), who reviewed some of the existing health care policy models, and Grannemann (1989), who reviewed issues in modeling behavioral responses to health care policy changes.

Before proceeding, we should make clear that our discussion of health-related policy modeling is limited to cost and coverage issues pertaining to the provision of health care services. Another topic that we did not take up but that deserves serious consideration is the use of policy analysis tools for modeling the relationships of health care interventions and other determinants of health status to health outcomes in the population and for estimating the cost-effectiveness
of alternative treatments and the feedback effects on overall costs of the health care system. It is clear that expenditures on health care may not always translate into improvements in health. (Thus, although the United States spends a higher proportion of its GNP on health care than other industrialized countries, it ranks lower than many countries in such health indicators as infant mortality and life expectancy; see Bureau of the Census, 1990b:Tables 1440, 1444.) Hence, it is important in assessing alternative health care policies to consider not only the costs in terms of payments for services, but also the costs and benefits in terms of the effects on health outcomes. This topic presents challenging modeling issues that we have not investigated, but it seems likely that microsimulation techniques, with their ability to model complex relationships and individual circumstances, could make useful contributions.

We note in this regard that demographers have recently developed complex stochastic models of disease processes and disability states, using hazard techniques with longitudinal data such as the Framingham Heart Study and the 1982-1984 National Long-Term Care Survey (see, e.g., Manton and Stallard, 1990; Manton, Woodbury, and Stallard, 1990, 1991). (The National Institute on Aging has supported much of this work.) They have used the resulting equations to analyze a number of policy-relevant issues. For example, they have analyzed the implications for the age structure of the population and total versus active life expectancy (i.e., years free of disability) of alternative assumptions about the elimination of particular diseases or risk factors (e.g., smoking) in the population.
They have also examined the savings in nursing home and home health care costs that might be achieved by eliminating such diseases as Alzheimer's. To date, researchers working in this field have largely applied their estimated parameters to cell-based models, such as life tables, to analyze alternative scenarios. It may well be that putting this type of risk-factor analysis in a microsimulation framework and, further, effecting a linkage with microsimulation models of health care financing and coverage issues could have payoffs for analysis of health policy issues.3

3 Wolfson (1989b, 1991), writing from a Canadian perspective, asserts that the health statistics system has given too much attention, relatively, to information about inputs to the health care system, including financial costs, and too little attention to "the supposed outputs of the system, namely the health status of the Canadian population." He argues for microsimulation models that would estimate the impacts of alternative health care policies and programs on health status as well as models of costs of and access to medical services. At present, Statistics Canada is developing a Population Health Model that includes microsimulation components for disease processes and health care interventions.

Microsimulation Modeling for Health Care Policy

Microsimulation has played a role in analysis of health care cost and coverage issues since the technique was first introduced to the political process. The RIM model was used in the late 1960s to estimate the costs and distributional effects
of alternative national health insurance programs (Orcutt et al., 1980:85). A Medicaid module was developed for the MATH model in the late 1970s (Pappas, 1980). A microlevel database of households from the 1976 Survey of Income and Education and the 1976 Survey of Institutionalized Persons formed the population component of the ASPE Health Financing Model that was used to estimate alternative national health insurance programs during the Carter administration (Office of the Assistant Secretary for Planning and Evaluation, 1981).

However, throughout the 1970s and early 1980s, microsimulation techniques were applied less frequently to health care policy analysis than they were to other social welfare policy areas. None of the microsimulation models developed for health care policy gained the kind of widespread use enjoyed by models such as MATH and TRIM2 for analysis of tax and transfer programs or models such as DYNASIM2 and PRISM for analysis of public and private pension systems.

Recently, the use of microsimulation techniques for health care policy analysis has shown signs of new life. A module in TRIM2 to simulate the costs and distributional effects of expanding Medicaid coverage was recently redesigned and updated (see Holahan and Zedlewski, 1989); in addition, work is under way, sponsored by the Department of Labor, to add capabilities to TRIM2 to model employer-provided health insurance benefits. A major expansion of the PRISM model was effected to simulate alternative financing programs for long-term care of the elderly (Kennell and Sheils, 1986; Kennell et al., 1988; Rivlin and Wiener, 1988). DYNASIM2 has also been used to look at long-term care issues.
The developers of PRISM recently built the Health Benefits Simulation Model, a comprehensive model for the household sector designed to simulate health insurance coverage, health services use, total health care spending, and sources of payment among the noninstitutionalized population (see Chollet, 1990).

CBO has developed a microsimulation model for simulating changes in Medicare benefits, based largely on Medicare administrative records, that was used to estimate the costs and distributional effects of alternative ways to insure against catastrophic health care costs under the Medicare program (Congressional Budget Office, 1988).4 CBO has also developed models of Medicare and Medicaid eligibility.

4 Catastrophic coverage for Medicare beneficiaries was enacted in 1988 but repealed the next year. Decision makers in Congress had underestimated the antipathy of relatively well-off elderly people toward paying income tax surcharges to finance the program. In addition, revised CBO estimates of costs to the government for reimbursing catastrophic prescription drug charges, produced from its Medicare microsimulation model by using newly available data on growth in prescription drug usage, were more than double the original estimates (see further discussion in text).

The Health Care Financing Administration (HCFA) has sponsored work by the Actuarial Research Corporation to apply static aging techniques to update the 1977 National Medical Care Expenditure Survey (NMCES) and to use the resulting database to simulate policy issues,
such as the effects of extending Medicaid coverage. (Wilensky, 1982, originally proposed basing a comprehensive microsimulation model for health care issues on the 1977 NMCES.)

All of these health care policy microsimulation modeling efforts have dealt with the household sector and primarily with issues of expanding health insurance coverage and the associated costs to the federal and state governments for reimbursing medical care charges. Microsimulation-based models have also been developed to examine issues that affect the supply side of the health care market. For example, CBO developed a model, based on Medicare administrative records of payments to physicians, augmented with county-level data, that has been used to estimate the effects of changing fee schedules for different types of medical specialties and geographic areas (Congressional Budget Office, 1990). (Legislation to alter fee schedules, which is to take effect in 1992, was recently passed.) CBO has also developed a model, based on data for each of the hospitals in the United States, that is designed to estimate the cost and distributional effects of changing various provisions of the Medicare prospective payment scheme used to reimburse hospital costs. For example, the CBO model could determine effects by geographic area, type of ownership of the hospital (private, nonprofit, etc.), and hospital size (number of beds). The Health Care Financing Administration has a similar prospective payment scheme model.

In looking back over the past 20 years, however, it is clear that microsimulation models have played a distinctly subordinate role in health care policy analysis.
Moreover, many of the microsimulation models that exist were developed on an ad hoc basis for special purposes and are neither well documented nor used outside the agency that developed them.5

5 In this regard, Chollet (1990:2) notes: "Other microsimulation efforts [in addition to CBO's in the health care area] have been mounted by the Prospective Payment Assessment Commission (ProPAC), the Physician Payment Reimbursement Commission (PPRC), the Health Care Financing Administration (HCFA), and by the Employee Benefit Research Institute (EBRI). Like CBO's microsimulation models, each of these efforts was undertaken for specific, in-house analyses; none [is] documented for external use."

Cell-based models, often with links to macroeconomic models, have played a much more prominent role in health care policy analysis. The Health Resources Administration supported development of a cell-based health care sector simulation model in the 1970s (Yett et al., 1980). The model, which never went beyond the prototype stage, represented an ambitious effort to relate demand and supply relationships in the health care market. It included submodels for projecting the population of consumers, classified by demographic and income categories; the supply of physicians, classified by age, specialty, and a few other characteristics; quantities and prices of physician services; quantities and prices of hospital services; and the supply of nonphysician health personnel.

The ASPE Health Financing Model was essentially a large cell-based
model. The microdatabase, developed by merging the Survey of Income and Education and the Survey of Institutionalized Persons, was projected forward in time by using static aging procedures and then aggregated into several thousand population cells, classified by family income; employment of adult members of the family; primary insurance coverage; family size; and age, sex, and disability status of the person. Data on health care utilization from the National Health Interview Survey were matched to the population data on a cell-by-cell basis. Finally, data on health care expenditures by type of medical service, source of payment, and population group were assembled from a variety of administrative sources and, in turn, matched with the population cells to generate the medical expenditure profile in the simulation year under current law. Simulations of alternative profiles altered the cell values to reflect a number of direct and indirect effects of policy changes, such as changes in the level of utilization of services in response to changes in cost-sharing and other patient payment provisions. Often, the population microdata records were retabulated to support simulation of alternative policies that affected different subgroups, but the core of the model operated on a cell basis.

The Macroeconomic-Demographic Model of Health Care Expenditures, developed for the National Institute on Aging by Lewin/ICF, Inc., is a large cell-based model for projecting health care costs over the long term (Anderson, 1990; Cartwright, 1989).6 The model includes a macroeconomic growth model with two goods (investment and consumer goods) and two factor inputs (labor and capital services).
Interacting with the macro model are large cell-based models of population growth, the labor market, pension benefits, family formation, consumer expenditures, housing demand, health care expenditures, and health care benefits. As an example of the model's size, health care expenditures are estimated for 3,136 family groups, classified by family size, sex of head, age of head, race of head, geographic region, urban or rural residence, and whether or not covered by private health insurance. The equations used to estimate expenditure shares, labor supply, and other parameters in the model were developed by using microdata, but the model itself operates on a cell basis.

6 The President's Commission on Pension Policy and the National Institute on Aging initiated development of this model in 1979. At that time, the model focused on the retirement income system and the interactions of social security and private pensions. After the 1983 Social Security Act Amendments, resources were concentrated on adding components to the model to simulate the impacts of population aging on the health care system.

Issues in Modeling Health Care Policy Alternatives

The limited application of microsimulation modeling in the health care policy area and, indeed, the failure of any particular model(s), regardless of type, to gain widespread use for health care policy analysis result from the complexity
of the issues involved and the magnitude of the modeling task. Moreover, the complexities and broad scope of health care policy issues have led to fragmented efforts to develop needed databases and research knowledge, which, in turn, have handicapped model development efforts.

Why is health care policy so daunting? For one thing, many actors are involved in health care, including patients and prospective patients, in both the household and the institutionalized population; informal caregivers, such as relatives; doctors, hospitals, nursing homes, and other service providers; state and federal insurers and regulators; and private for-profit and nonprofit insurers of several types, ranging from traditional insurers that reimburse on a fee-for-service basis to health maintenance organizations and other prepaid medical service plans. The interconnections of the various actors exhibit a bewildering variety that makes it difficult to assemble relevant information or even determine an appropriate unit of analysis. For example, an elderly patient, during the course of one illness, may be treated by several different specialists in one or more hospitals or other service centers and may obtain reimbursement from both Medicare and a private Medigap policy and also pay some costs directly. For some health care policy questions, it may be important to have sufficient information to analyze the spell of illness (or another broad measure) instead of working with a narrower unit of analysis such as doctor visits or hospital days. Yet, in our example, none of the service providers or insurers is likely to have complete information about the patient or about the procedures or costs involved in treating the illness (e.g., Medicare reimburses doctors and hospitals separately).
Mirroring the complexity of the health care sector, there are many federal agencies involved in health care data collection, research, and policy analysis, often with overlapping mandates and interests that do not always make it easy to coordinate or to progress toward a common set of goals. Agencies of the Department of Health and Human Services (HHS) with important roles in this regard include ASPE, the Health Care Financing Administration, the Agency for Health Care Policy and Research, the National Institute on Aging, and the National Center for Health Statistics, among others.

Research Knowledge

It is certainly possible to model a particular class of actors and consider policy questions that directly affect that class: for example, simulating the extension of Medicaid coverage to a broader population or the effects of higher coinsurance rates on Medicare beneficiaries. However, unlike the case with a program
such as AFDC, it is clearly perilous to ignore first-round behavioral responses to health care policy changes. For example, research has shown that people alter their demand for medical services in response to changes in coverage, coinsurance rates, and other provisions of health insurance plans that affect the price of services.7 Moreover, demand responses can involve large numbers of people and thus have large effects on program costs. In contrast, although features of income support programs can certainly affect aspects of behavior such as labor supply, research has demonstrated relatively small effects for eligible groups, and these groups represent relatively small proportions of the total population.8

It appears likely to be perilous as well to ignore second-round effects in modeling health care policy changes. For example, changes in insurance coverage may ultimately lead to changes in hospital pricing policies, such as increasing prices for services that tend to be used more heavily by insured patients (see Grannemann, 1989:8-9). As another example, changes in public health insurance benefits are likely to affect the market for private health insurance and thereby influence total and public-sector health costs in unforeseen ways.9 As still another example, it is widely acknowledged that physicians have a large influence over levels and costs of medical services because of such factors as the limited information available to patients on prices and services. Hence, it is important to consider physician behavior in evaluating alternative health care financing and payment policies.

7 For example, the National Health Insurance Experiment conducted by the Rand Corporation found differences in health care expenditures of up to 45 percent between groups with different coinsurance rates. The National Long-Term Care Channeling Demonstration showed greatly increased use of community-based long-term care services among impaired elderly people who received coverage for this type of care; both studies are cited in Grannemann (1989:4).

8 In an analysis of effects on employment, using data for female-headed families from the 1984 SIPP panel, Moffitt and Wolfe (1990) found weak negative effects of the AFDC guarantee, stronger negative effects of both the food stamp guarantee and an index of the expected benefits from Medicaid, and very strong positive effects of an index of the expected benefits from private health insurance.

9 Cartwright (1989:11) notes that a reduction in Medicare benefits is likely to reduce Medicare expenditures by a proportionately greater amount because of the impact on the demand for Medicaid and private Medigap insurance plans. Cartwright does not speculate about the combined effects on total health care costs.

The research knowledge that could support modeling the complex health care sector has many inadequacies. There are many gaps and deficiencies in understanding the behavior of consumers and providers of health care, including a lack of evidence on which to base estimates of physician response to changes in payment schedules. There are some studies showing that the decisions of physicians to participate in Medicaid are responsive to the level of Medicaid reimbursement rates. Other studies suggest that state restrictions on Medicaid payments may reduce the number of visits to physicians in private offices but increase visits to other, more expensive, ambulatory care providers, such
as hospital emergency rooms. However, the evidence for these conclusions is very limited (see Grannemann, 1989:15). In general, there is insufficient evidence to choose among models of physician behavior, for example, that physicians tend to act according to the "target-income hypothesis," whereby they alter prices or quantity of services in order to achieve a target level of income; or that physicians in metropolitan areas are in a position to raise prices more freely compared with physicians in rural areas, because of differences in consumer access to information (Grannemann, 1989:13). Yet in simulating the effects of the most recent changes in Medicare physician payment schedules, it would have been very desirable to have good estimates of the extent to which physicians would shift their activities in favor of more generously reimbursed procedures; of how their responses, in turn, would affect usage and costs under Medicare; and of what the impact would be, in the long run, on physician specialty choices.10

Similarly, there are competing models of hospital behavior, including models based on profit maximization, utility maximization, or physician control (see Grannemann, 1989:7-11), but the available evidence has not established the superiority of any particular model of hospital behavior.

In yet another area, very little is known about the potentially great impact that current research on the effectiveness of alternative medical treatments could have on delivery of health care services. Such research may well lead public and private insurers to deny payment for services that are deemed ineffective or not worth their cost. Such actions would presumably lead to responses on the part of providers and patients that, in turn, would affect usage and costs of medical services (Grannemann, 1989:36).
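
The first-round response that CBO assumed for the physician fee schedule estimates (see footnote 10: half of a potential income loss recovered through higher volume, and 35 percent of a potential gain given back through lower volume) can be illustrated with a short sketch. The function name, the per-physician dollar figures, and the treatment of the offset as a simple share of the static revenue change are illustrative assumptions; this is not CBO's model code.

```python
# Illustrative sketch of a target-income style volume offset.
# The 0.50 and 0.35 shares are the CBO assumptions cited in the text;
# the function, its name, and the sample revenues are hypothetical.

LOSS_OFFSET = 0.50  # share of a potential loss recovered via higher volume
GAIN_OFFSET = 0.35  # share of a potential gain given up via lower volume


def net_revenue_change(old_revenue: float, new_fee_revenue: float) -> float:
    """Return the change in Medicare receipts after the volume response.

    old_revenue: receipts under the old fees at current volume.
    new_fee_revenue: receipts under the new fees at the same volume.
    """
    static_change = new_fee_revenue - old_revenue
    if static_change < 0:
        # Physician facing a loss raises volume to recover part of it.
        return static_change * (1.0 - LOSS_OFFSET)
    # Physician facing a gain lowers volume and gives part of it back.
    return static_change * (1.0 - GAIN_OFFSET)


if __name__ == "__main__":
    # Hypothetical surgeon: the new fees alone would cut receipts by 20,000.
    print(net_revenue_change(200_000.0, 180_000.0))
    # Hypothetical internist: the new fees alone would add 10,000.
    print(net_revenue_change(100_000.0, 110_000.0))
```

Applied record by record to physician payment data, a rule of this kind turns a static fee change into a behaviorally adjusted cost estimate, but it captures only the immediate volume response and none of the longer-run effects discussed above.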
Databases

There is no lack of information sources pertinent to health care, including many surveys and administrative records.11 However, there is no database, such as the March Income Supplement to the Current Population Survey (CPS) or the Survey of Income and Program Participation (SIPP), that provides, on a regular basis, the majority of the variables needed for evaluating alternative health care policies (although the CPS and SIPP, particularly the latter, do provide relevant information). This is true even for models confined to particular sectors and types of issues, such as insurance coverage of the population. Hence, although microsimulation models typically face the task of generating a suitable database by combining variables from multiple sources through techniques such as imputation or statistical matching, models for health care (including cell-based models) face an unusually daunting database construction task.12

10   The new payment system, instead of reimbursing doctors' "customary, prevailing, and reasonable" charges, assigns specific amounts to specific procedures based on assumptions about their relative value. The system chosen is expected to raise receipts for internists and general practitioners and lower those for surgeons (Congressional Budget Office, 1990). The CBO analysts who produced estimates for the reimbursement changes did not attempt to include long-range responses in their microsimulation model. They did include a first-round behavioral response, with the assumption that physicians would strive to meet a target income. Specifically, they assumed that 50 percent of a physician's potential income loss due to the new Medicare fee schedule would be offset by an increase in volume of services, and 35 percent of a physician's potential gains would be offset by a decline in volume (Congressional Budget Office, 1990:Appendixes; see also Chollet, 1990).
11   See Gilford (1988: Appendix C) for descriptions of some of the major health data sets; see Panel on Statistics for an Aging Population (1986) for an inventory of data sets related to the health of the elderly.

There are several examples of the database problem. The current TRIM2 Medicaid module is designed to answer fairly narrow questions about the subgroups of the noninstitutionalized population that will benefit from extending coverage of the Medicaid program to more low-income Americans and the government budgetary implications of such an extension.13 Yet it requires the use of several data sources. The principal sources include the March CPS (supplemented with other variables, such as deductible expense imputations based on Consumer Expenditure Survey data) and the Medicaid "tape-to-tape" data from state administrative records.14 The March CPS serves as the primary database for simulating Medicaid eligibility on the basis of AFDC and SSI, while the Medicaid data provide a basis for estimating enrollment (participation) rates and medical care utilization and expenses.

The ICF/Brookings Long-Term Care Financing Model (LTCFM), which is built on top of the PRISM model, is designed to answer questions about costs and coverage of alternative ways of financing long-term nursing home and home health care for the elderly.
The LTCFM submodel uses the basic PRISM model, with some modifications, to project the numbers and characteristics of the elderly with regard to family structure, employment, income, assets, and

12   Tax policy modeling also requires extensive matching and imputation to develop a suitable simulation database (see below). However, tax policy analysts do have samples of administrative records from tax returns that regularly provide many of the needed variables.
13   Modeling the institutionalized would require a database on this population (which the CPS does not provide) and somewhat more extensive simulation of the "medically needy" eligibility provisions of Medicaid than the current TRIM2 module contains. Under these provisions, people who would not ordinarily be eligible on the basis of their income and assets, but who incur high medical care costs that result in their "spending down" their resources, can become eligible for Medicaid. Most medically needy eligible people are the elderly and disabled who need but cannot pay for nursing home care. The cost implications for federal and state governments are not the same as the net cost to society. Currently, the expense of covering low-income people who are not enrolled in Medicaid is being borne by patients (through out-of-pocket payments), private insurers, and providers (uncompensated care). Moreover, extended Medicaid coverage might well increase use of services and thereby increase total health care costs, as well as redistribute them among payment sources.
14   The Medicaid tape-to-tape data are commonly formatted files created by the Health Care Financing Administration from the administrative records of five participating states that have computerized Medicaid management information systems. There are separate files for Medicaid enrollees, claims, and providers.

and distributional estimates is by means of microsimulation models applied to detailed microlevel databases. Most recently, the debate that engulfed the Congress and the President in considering the fiscal 1991 federal budget (the debate over which kinds of taxes to raise or lower and by how much, and which income groups ought to bear the tax burden) required microsimulation modelers on the staff of the Joint Committee on Taxation to work virtually around the clock for days on end (Russakoff, 1990).

The tax policy microsimulation models that are in use today have a long history, beginning in work conducted in the early 1960s by researchers at the Brookings Institution. Joseph Pechman and his colleagues at Brookings, with the encouragement of analysts in the Treasury Department, developed microlevel databases using tax return information. Subsequently, they developed even more elaborate databases for tax policy modeling called MERGE files. The first MERGE file contained statistically matched records from the Statistics of Income (SOI) public-use sample of individual tax returns and the 1967 Survey of Economic Opportunity. Subsequent MERGE files matched the March CPS with tax return records (see Pechman, 1965, 1985; see also Minarik, 1980).

The Treasury Department in the mid-1960s brought tax modeling capability in-house, and the Office of Tax Analysis (OTA) still maintains a comprehensive tax policy microsimulation model that is based on a statistical match of the detailed SOI sample with the March CPS. The full OTA model, including both SOI and March CPS data, is updated every 2 or 3 years; OTA analysts also use more recent SOI data for tabulations and ad hoc simulations. OTA is currently running version 11 of the model, which includes data from the 1985 SOI and the 1986 March CPS (Cilke and Wyscarver, 1990).
The next full OTA model, which will use the 1989 SOI and the March 1990 CPS, is scheduled for completion in 1992.

The Joint Committee on Taxation (JCT), which is responsible for preparing revenue estimates of all congressionally proposed tax law changes, initially relied on analyses conducted on its behalf by OTA. Subsequently, the JCT staff ran the OTA model on the Treasury Department's computer. In the mid-1970s, the JCT brought the OTA model in-house and, since then, has modified some of the aging and imputation routines.

Many other public and private agencies maintain tax policy microsimulation models, including more than half the states; a number of private research organizations concerned with the effects of federal and state taxation policies, such as the Brookings Institution and the National Bureau of Economic Research; several private economic consulting firms; and the governments of most Western nations. And, in addition to OTA and JCT, other U.S. agencies with tax models include: the Congressional Budget Office, which has conducted several studies of the effects of tax policies on the distribution of family and personal income using microsimulation techniques (see, e.g., Kasten and Sammartino, 1990); ASPE, which invested substantial resources in the early and mid-1980s to enhance the federal and state income tax modules in TRIM2 (see Weinberg, 1987, for an analysis of the distributional effects of the 1986 Tax Reform Act based on running the TRIM2 model) and also recently made extensive use of the TRIM2 federal tax module to simulate proposed changes to the child care tax credit; and the Census Bureau, which uses its model to estimate the distribution of after-tax income rather than for policy analysis purposes (see Bureau of the Census, 1988).

Although U.S. tax policy models share features with income support models, they also have several distinguishing characteristics. Currently used tax policy models fall into the class of cross-sectional microsimulation models that simulate policy effects on a population database at a given time and use static aging techniques to project the database forward in time.35 In general, the models are very elaborate tax liability calculators that incorporate very few explicit behavioral responses. Most models assume that taxpayers will choose whether or not to itemize deductions on the basis of which alternative reduces their tax liability, but other types of behavioral response, to the extent that they are considered, are usually handled outside the model.36

One distinguishing feature of the tax models is their very heavy reliance on information from administrative records, specifically, from the SOI samples of tax returns.
Each year, the Statistics of Income Division of the Internal Revenue Service (IRS) obtains a stratified random sample of individual income tax returns, which it processes to produce published statistical series and analyses (see Coleman, 1988). (The division also provides tabulations and analyses of corporate tax returns, partnership returns, and estate and gift tax returns, as well as other specialized tabulations based on IRS records.) In odd-numbered years, the division obtains a much larger sample of individual returns and also keys in additional variables. It is these biennial detailed files that are most heavily used in tax modeling and tax policy research.

Tax models are also distinguished from income support models, at least in degree, if not in kind, by their very heavy reliance on imputations, statistical matching, and complex static aging techniques. More extensive use of imputation and aging techniques is needed for tax modeling because the SOI data generally become available with a lag of 2-3 years and lack information, such as family characteristics, that is important for evaluating alternative tax proposals (see further discussion, below). The March CPS provides some of the missing information and is available on a more timely basis, but it is not suited to stand on its own as a tax policy analysis database. It lacks information on deductions, some components of income (such as capital gains), and other variables (such as net operating loss carryovers) that are critical to calculating tax liability. The March CPS also "topcodes" income amounts for high-income people (the top income category shown is $100,000 or more) and has other problems from a tax modeling perspective.

35   In the future, the Statistics of Income samples that provide input data for the tax models (see the description of the SOI in the text) will include an embedded panel: that is, tax returns of some individuals will be sampled each year on a continuing basis. The Office of Tax Analysis hopes to develop a dynamic model to use these panel data.
36   From time to time, important behavioral responses, such as changes in handling assets in response to changes in capital gains tax rates and changes in charitable deductions in response to changes in marginal tax rates, have been simulated within rather than outside of tax policy models.

One can divide the family of U.S. tax models into at least three main categories on the basis of their principal purpose and data source. The first category includes the very detailed tax model used by the OTA and the JCT. The primary purpose of this model is to prepare revenue estimates for proposed legislative changes to the federal individual income tax. The model is also frequently used to examine the distributional effects of tax law changes. The OTA/JCT model is built on information from the detailed SOI files that have not been modified for public use, with additional variables imputed or matched from the March CPS and other sources. The second category includes tax models that start with the public-use SOI files and impute or match additional information from household surveys such as the March CPS. The primary purpose of these models is to examine broad issues surrounding the distributional effects of government tax policies.
The models maintained by the Brookings Institution and the National Bureau of Economic Research are in this category. The third category includes tax simulation routines that are embedded in models of income support programs and that, consequently, start with a March CPS database to which information from the public-use SOI files is imputed. Models such as TRIM2 fall into this category. The tax routines of this class of models are also typically used to study broad distributional effects of tax policy changes, sometimes in conjunction with the transfer program routines.

Because OTA and JCT have a need for, and also have access to, the detailed SOI files, the OTA/JCT model is more elaborate than any of the other tax policy simulation models. With regard to the differences between the detailed and public-use SOI files (see Statistics of Income Division, no date), the public-use files are processed so that names, addresses, and social security numbers are deleted from all records; records sampled at 100 percent from the full IRS file of tax returns (which are returns with very high incomes or very high losses) are subsampled at a 33 percent rate in the public-use SOI files; state codes and other geographic indicators are deleted from records with adjusted gross income of $200,000 or more and, in addition, other codes (such as exemptions and alimony paid or received) are removed or modified for these records; wage and salary income, state and local income tax deductions, and real estate tax deductions are blurred for high-income records; state and local tax deductions and alimony paid or received are blurred for all other records; and many detailed items reported by small sets of taxpayers are deleted (e.g., preferences under the minimum tax).

Developing a Tax Model Database: The Case of OTA

To illustrate the array of problems and the extensive procedures involved in developing a suitable database for tax modeling, particularly at the level of detail required for estimating federal revenues, we describe the process of generating the database for version 11 of the OTA model (see Cilke and Wyscarver, 1990). This version of the model implemented several major improvements over earlier versions, including new aging routines. Converting the model from a UNIVAC mainframe to a VAX minicomputer provided greater computational power and speed that made possible enhancements to the model's capabilities.

To develop version 11, OTA began with the 1985 SOI sample, containing about 121,000 individual income tax returns, each of which had more than 1,000 data fields. OTA deleted unnecessary fields, created necessary recodes, packed the data to minimize computer storage requirements, and verified that the taxpayer's liability calculated by using the OTA model agreed with the liability reported in the SOI file. For the 5 percent of returns for which the reported and simulated tax liabilities did not agree, OTA examined the records and edited one or more variables as needed. OTA next imputed additional variables.
For example, because not all taxpayers itemize deductions, but some proposed changes in tax law might induce some current nonitemizers to itemize, OTA imputed deductible expenses for nonitemizing taxpayers. (Such expenses include charitable contributions, real estate taxes, home mortgage and other categories of interest, medical expenditures, and others.) Imputations were performed in some cases from regression equations estimated from the Survey of Consumer Finances and, in other cases, from probabilities estimated for itemizers in the SOI file itself. As another example, OTA imputed the earnings attributable to husbands and wives, for married couples filing jointly, based on tax return information provided in support of the two-earner deduction for working couples. Such information is needed to assess tax provisions that impose a so-called marriage penalty and for other analyses, but it was not available for many working couples except for the 1982-1986 period when the two-earner deduction was in effect. OTA then exactly matched a file containing each taxpayer's month and year of birth (from social security records) with the SOI tax return sample.

The next step was to reduce the size of the SOI sample in order to improve processing efficiency. This procedure involved an optimization model designed to select about 89,000 observations from the full file of 121,000 tax returns in such a manner as to minimize the total loss of information when measured by an explicit loss function.

Yet, as Cilke and Wyscarver (1990:2-8) observe:

"In spite of vast improvements…to the individual tax model over the years, one shortcoming still remains: the data base relies almost entirely on tax return data. Analyzing proposals that could radically alter the tax base, the tax unit, and the tax rates requires information that is not tied to a particular tax law or limited to what is reported on tax returns filed under that tax regime."

Additional information that OTA needed includes: (1) sources of income and types of expenditures not subject to taxation under current law; (2) links between taxpayers and family or household units, so that examination of tax burdens can be based on logical economic units; (3) information on people whose incomes are too low to require filing a return under current law but who might, under some proposals such as a consumption tax scheme, become tax filers; and (4) information necessary to construct a comprehensive income measure, such as consumption plus change in net worth, that permits an analysis of the distributional effects of tax reform proposals for a broader concept than that of taxable income.

OTA's approach to obtaining much of the additional information needed for version 11 of its tax model was to carry out a statistical match of the March 1986 CPS and SOI records. The first step in this process was to apply tax filing unit provisions to each CPS person aged 16 and over in order to convert CPS households into one or more potential tax filing units or potential nonfiling units with low incomes.
Corrections for underreporting of certain forms of income in the March CPS were also made by using data from the National Income and Product Accounts and other sources. The SOI file of taxpayers age 16 and older and the March CPS file of potential filers were then statistically matched on the basis of a common set of core variables, including marital status, age, number of children, housing tenure (owner or renter), and the presence or absence and amount of various sources of income. The CPS nonfilers were then appended to the matched file, and the latter file was sorted by CPS family number. Finally, taxpayers under age 16 from the SOI file were appended to families judged to have appropriate income and demographic characteristics. The resulting file size expanded to about 213,000 records, given the addition of nonfilers from the March CPS, taxpayers under age 16 from the SOI, and the need to match some CPS records to more than one SOI record and vice versa in order to preserve the properties of each of the two samples. In particular, the March CPS sample is very thin in the number of cases of high-income tax filing units in comparison with the SOI.

The last stage in the process of creating the OTA tax model database involved several series of imputations. One series was undertaken to provide information needed to simulate the 1986 Tax Reform Act (TRA). Imputations required for this purpose included tax-exempt interest, health insurance expenses for self-employed people, meal and entertainment expenses, adjustments to pension income, contributions to 401(k) retirement plans, net operating losses carried over to future tax years, employer contributions to pension plans, and active and passive income by source. A variety of data sources were used to develop the TRA imputations. A new set of parameters to simulate the TRA provisions also had to be incorporated into the tax calculator portion of the model.

Another set of imputations was performed to generate information needed to simulate tax liabilities for catastrophic health insurance coverage under Medicare. These imputations illustrate the additional database enhancement that is frequently required to model new provisions of the tax code. Required imputations in this instance included Medicare eligibility for the tax unit head and spouse, employer contributions to Medigap insurance, reduced medical expense deductions because of provided catastrophic health insurance, pension income by source for the unit head and spouse, and social security income for the unit head and spouse. Repeal of the catastrophic health insurance program in 1989 obviated the need for such information at the present time, although Congress may in the future consider a similar type of plan.

Finally, a series of imputations was undertaken to provide information to construct the broad concept of family economic income used by OTA in analyzing the distributional effects of proposed tax law changes on income classes of the population.
For example, untaxed income sources (such as AFDC and SSI), employer-provided benefits, and exclusions from taxable income (such as 401(k) contributions) had to be imputed and added to adjusted gross income for filers and added to estimated adjusted gross income for nonfilers.

The result of all of these operations was a greatly enhanced database for modeling tax policies as of 1985. However, because OTA must develop projections for the current year and a minimum of 5 years into the future, it has developed and regularly employs static aging procedures. The aging procedure for version 11 of the OTA model, which reflects a major effort to improve this part of the model, generates a new database for each year from 1986 through 1995 and is implemented in two stages (see Gillette, 1989).37 The procedure is performed each time a simulation is carried out and is under the control of the user, who can select one of three aging alternatives developed by OTA to represent different expected rates of economic growth.

The first stage in the aging process applies growth factors to each dollar amount in the database to reflect actual and projected per capita real growth and inflation. The second stage adjusts the weights of each family head in the file to hit aggregate targets for 33 different variables, such as adjusted gross income by income class and type of return, for which the targets are chosen to be consistent with forecasts of national income, population, and inflation. The weight adjustment is carried out in a series of iterations designed to hit the targets while minimizing a loss function.

37   Prior versions of the model aged the database only to the current year; projections to future years were developed through an out-of-model extrapolation technique.

Finally, OTA has included in the aging procedures an optional special adjustment to simulate the behavioral response of taxpayers to the increase in the top tax rate on long-term capital gains resulting from the 1986 Tax Reform Act. The simulation changes the level of realizations of long-term capital gains as a function of the difference in marginal tax rates before and after tax reform and the prereform level of realized gains. Other behavioral responses to increased tax rates on capital gains, such as changing holding periods, shifting mixes of capital assets, and converting ordinary income to (or from) capital gains, are generally estimated outside the model.

Issues in Modeling Tax Policies

Clearly, the use of SOI tax return information in tax policy microsimulation models is beneficial and, indeed, essential for detailed simulation of the revenue consequences of proposed changes to the tax code. The SOI files provide documented sets of income amounts, deductions, and other tax-related variables for large samples of actual filing units and thereby portray all of the detail of the current tax code. Just as clearly, however, the SOI files are inadequate in many ways. They lack variables necessary to characterize fully the socioeconomic status of households and families for use in analyses of the distributional effects of tax policy changes.
They also lack variables needed to simulate proposed tax reforms that broaden the base of taxable income, expand the types of allowable deductions and credits, change the definition of a tax filing unit, or in other ways significantly alter the tax code. Users of the SOI data alone for tax policy modeling are at the mercy of changes in the tax laws that not only may require new imputations but also add, or delete, useful information. For example, the information required to support two-earner deductions for working couples from 1982 through 1986 provided a good basis for determining the share of wages attributable to each spouse; however, this information is no longer available.38 To obtain the range of needed variables, tax policy modelers must confront a heavy task of extensive data creation using imputation and matching techniques. Imputations for missing variables are also required for simulations of current and proposed income support programs, but not to the degree that is usually required for detailed simulation of proposed changes to the tax code.

38   Other administrative records, such as social security earnings records, could provide information on spousal wages through an exact matching procedure, but such information is not available on public-use tapes.
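The two-stage static aging procedure described earlier for version 11 of the OTA model (growth factors applied to dollar amounts, followed by weight adjustments to hit aggregate targets) can be illustrated with a small sketch. The weight-adjustment step below uses simple iterative proportional rescaling as a stand-in; the text says only that OTA iterates to hit its targets while minimizing a loss function, so the algorithm, the field names (`weight`, `agi`, `rtype`, `cls`), the growth factor, and all figures are assumptions made for illustration.

```python
def apply_growth(records, factor):
    """Stage 1: scale each dollar amount by a growth factor."""
    return [{**r, "agi": r["agi"] * factor} for r in records]


def weighted_agi(records, key, value):
    """Weighted AGI total for records whose `key` field equals `value`."""
    return sum(r["weight"] * r["agi"] for r in records if r[key] == value)


def adjust_weights(records, targets, n_iter=200):
    """Stage 2: rescale weights until weighted AGI totals match the targets.

    targets is a list of (key, value, target_total) tuples.
    """
    aged = [dict(r) for r in records]  # copy so the input is not modified
    for _ in range(n_iter):
        for key, value, target in targets:
            current = weighted_agi(aged, key, value)
            if current > 0:
                ratio = target / current
                for r in aged:
                    if r[key] == value:
                        r["weight"] *= ratio
    return aged


# Hypothetical four-record base-year file.
base_year = [
    {"weight": 100.0, "agi": 20_000.0, "rtype": "single", "cls": "low"},
    {"weight": 100.0, "agi": 30_000.0, "rtype": "joint", "cls": "low"},
    {"weight": 50.0, "agi": 80_000.0, "rtype": "single", "cls": "high"},
    {"weight": 40.0, "agi": 120_000.0, "rtype": "joint", "cls": "high"},
]
# Hypothetical, mutually consistent targets (each pair sums to 17.5 million).
targets = [
    ("rtype", "single", 7_000_000.0),
    ("rtype", "joint", 10_500_000.0),
    ("cls", "low", 6_000_000.0),
    ("cls", "high", 11_500_000.0),
]
aged = adjust_weights(apply_growth(base_year, 1.10), targets)
for key, value, target in targets:
    print(key, value, round(weighted_agi(aged, key, value)), target)
```

With only two overlapping sets of targets the rescaling settles quickly; hitting 33 targets at once while keeping weights close to their starting values is a harder problem, which is presumably why an explicit loss function is used in practice.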

A factor that complicates the task of creating a tax policy simulation database is that tax return data differ in important ways from the household survey data in the March CPS and other files that are often used in matching and imputation. For example, there is a difference between tax filing units and households or families. Data on filing units are required to estimate tax liabilities, but data on households are most often required for distributional analyses. Recent changes in the tax law that require social security numbers for most dependents should make it easier to reconstruct family units from tax returns when dependents file separately and, thereby, improve the match with household survey records. However, the tax return data cannot, by definition, provide information on families that are not currently required to file tax returns. Survey data are needed to supply information about nonfilers, and one or more potential tax filing units must be constructed for these cases.

Another important difference that affects the quality of matches of household survey data with tax return files is the imbalance in the distribution of records by income class in the two samples. Household surveys such as the March CPS contain many more low-income and many fewer high-income records in comparison with the SOI data. Moreover, survey data for high-income people are typically topcoded to protect confidentiality. These differences constrain the ability of the matching algorithm to find good matches.

Still another important difference between SOI and household survey data concerns underreporting (and misreporting) of income. Underreporting affects both types of data, but the opportunities to underreport (or misreport) differ for different types of income and, thus, differentially affect tax returns and household surveys.
For example, W-2 forms must support reports of wages and salaries to the IRS, but this is not the case for household surveys.39 The growing use of 1099 forms by providers of interest, dividends, and other types of nonwage income presumably encourages more complete reporting to the IRS, while reports of these kinds of income are of notoriously poor quality in surveys. However, reports of income from odd jobs and cash transactions may be better, or at least no worse, in surveys than on tax returns.

As noted above, OTA, as part of creating its tax model database, adjusts the March CPS income amounts to national targets. One could also use tax compliance data to correct the SOI figures for the entire population of filers. However, if one wants to model income tax collections, one should implement corrections only for the small number of people who are actually audited. In any case, the differences in mechanisms underlying income reporting in administrative and survey data bear examination to determine their effects on the quality of the resulting merged database.

39   Although wages and salaries are presumably reported more accurately in tax returns than in surveys, definitional problems plague comparisons for this as well as other types of income. Thus, contributions to so-called 401(k) pension plans are legitimately excluded from taxable wage and salary income and, hence, from W-2 forms; they may or may not be reported in surveys.
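To make the statistical matching discussed in this section more concrete, the following minimal sketch pairs each tax return record with the closest survey record that shares its marital status and housing tenure, using a distance over age, number of children, and wages. It is an unconstrained nearest-neighbor match written only for illustration: the OTA procedure also matches some records more than once in order to preserve the properties of both samples, which this sketch does not attempt. All field names, distance scalings, and records are hypothetical.

```python
def distance(a, b):
    """Hypothetical distance: age in decades, children, wages in $10,000s."""
    return (abs(a["age"] - b["age"]) / 10
            + abs(a["children"] - b["children"])
            + abs(a["wages"] - b["wages"]) / 10_000)


def statistical_match(soi_records, cps_records):
    """Attach survey-only fields from the nearest CPS donor to each SOI record."""
    merged = []
    for s in soi_records:
        # Restrict donors to the same cell of categorical core variables.
        cell = [c for c in cps_records
                if c["marital"] == s["marital"] and c["owner"] == s["owner"]]
        # Fall back to the whole survey file if the cell is empty.
        donor = min(cell or cps_records, key=lambda c: distance(s, c))
        merged.append({**s, "cps_family": donor["family"],
                       "transfers": donor["transfers"]})
    return merged


soi = [
    {"id": "S1", "marital": "joint", "owner": True,
     "age": 45, "children": 2, "wages": 60_000.0},
    {"id": "S2", "marital": "single", "owner": False,
     "age": 28, "children": 0, "wages": 25_000.0},
]
cps = [
    {"family": "C1", "marital": "joint", "owner": True,
     "age": 47, "children": 2, "wages": 58_000.0, "transfers": 0.0},
    {"family": "C2", "marital": "joint", "owner": True,
     "age": 30, "children": 1, "wages": 40_000.0, "transfers": 0.0},
    {"family": "C3", "marital": "single", "owner": False,
     "age": 27, "children": 0, "wages": 24_000.0, "transfers": 500.0},
]
for record in statistical_match(soi, cps):
    print(record["id"], "->", record["cps_family"])
```

The sketch also hints at why the differences noted above matter: if the survey file is thin or topcoded at high incomes, the nearest available donor for a high-income return may be far away on the wage dimension, and the matched survey variables will be correspondingly less reliable.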

One of the original goals of SIPP was to support improved modeling of tax as well as transfer policies. Each SIPP panel to date has annually included a module that obtains information about tax-related variables, including yearly income amounts from employment and assets, type of tax return filed, number and type of exemptions, type and amount of deductions and credits, and taxes paid. Clearly, such information, together with the improved reporting of income in SIPP, would be valuable for tax policy models that currently start with the March CPS and for models, such as OTA's, that match CPS with SOI data. However, small sample sizes and confidentiality restrictions have severely limited the utility of the SIPP tax information to date. Only a few variables from the tax module are included in files that are available for research and policy analysis use (no information on amounts is made available), and these files are obtainable only under special access arrangements.

One way to assess and improve the quality of tax modeling databases would be to conduct exact matches, based on social security numbers, of household survey records from the March CPS or SIPP with tax return data from the full IRS files. Confidentiality restrictions may well preclude access to such a matched file, but the Statistics of Income Division and the Census Bureau could explore the possibility of conducting exact matches for use solely in evaluating and enhancing the quality of the statistical matches and imputations that are currently done to construct tax policy analysis databases.

Confidentiality concerns generally are a major and growing problem for access to tax return data for research and analysis purposes. More and more data on the public-use versions of the SOI files are being blurred.
For example, deductions for state taxes are blurred because some states disclose information such as the amount of property taxes paid. (No state codes are provided at all for returns with adjusted gross income of $200,000 or more.)

Even if access problems are solved, tax modeling still requires substantial data generation. For example, to support comprehensive modeling of the incidence of taxes, one needs to impute consumption so that one can model excise and sales taxes. One also needs to impute all data for nonfilers. These are big imputation jobs. Much more attention needs to be paid to imputation procedures and their quality, particularly for models that try to assess tax incidence broadly. More sensitivity analysis of imputation techniques is needed. OTA has commissioned some work on this topic (see, e.g., Kadane, 1978), but more is needed, for example, on the impact of imputations on joint distributions of variables.

The need for updating and projecting the information required for tax policy modeling further increases the difficulties of data generation and adds uncertainties, which could be substantial, to the estimates. Just as imputation procedures generally need much more extensive evaluation, so do the procedures used by tax models to age their databases.

Finally, the issue of behavioral response is serious for tax models because
behavioral effects of tax policy changes affect almost the entire population. One is not dealing with a subset of the population, such as welfare recipients, for which behavioral effects may be relatively unimportant. Changes in marginal tax rates undoubtedly affect behavior across the population and also have important second-round effects. To model behavioral responses in a satisfactory manner, one would need to link tax models with other models of wage rate and factor price determination. Data from panel surveys that follow people over time are also needed for development of good estimates of behavioral responses to such changes as the tax treatment of capital gains. Another important area of research concerns ways to narrow the range of parameter estimates and address other technical considerations in developing and implementing behavioral response functions in microsimulation models.

An added issue to consider concerns presentation of estimates of first-round and second-round effects of tax policy changes. One can make a real case for using models for short-term estimates and then discussing possible longer-run behavioral responses in qualitative terms. This was the approach followed by the Treasury Department in the 1984-1985 tax reform debate leading to the 1986 Tax Reform Act.

Given that our review concentrated principally on models of government expenditure programs in the social welfare area, we do not offer explicit recommendations for tax policy models. However, we believe that there are at least three priority needs for attention. One important need is to address confidentiality and data access issues for tax return information and, specifically, to develop ways to take advantage of exact matches of survey and administrative records information to improve the quality of tax model databases.
The second important need is to improve the capability for estimating the effects of behavioral responses to tax policy changes, whether these estimates are developed inside or outside the tax policy simulation models themselves. The third, and perhaps the most important, need is to conduct sensitivity analyses and other validation studies of the extensive array of imputations and matches that are performed in creating databases for tax modeling and of the procedures used to age these databases.

We are concerned that the imputation and aging procedures, regardless of how much care is taken in implementing them, may distort important covariances for individual filing units. Under the current methodology, each imputation is performed individually to meet aggregate control totals and other known information about the distribution of the item by income class. When several imputations are done in this fashion, however, the effect may be to cause variations among individual units within income classes that reflect the imputations more than the real world. The imputation procedures may not have adverse effects on the quality of comparisons involving vertical equity, that is, assessment of the effects of tax policy changes on different income classes; however, they may well adversely affect the quality of comparisons involving horizontal equity, that is,
assessment of the impact of tax policy changes on taxpayers in similar economic circumstances.

These needs are important not only for microsimulation models of tax policies, but also for models in other policy areas. We encourage a broad range of policy analysis agencies to work together to make the needed improvements in microsimulation models.
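The concern raised above about imputations that are each performed individually can be made concrete with a small simulation. The sketch below is purely illustrative and rests on assumptions that are not drawn from any actual tax model: it invents one income class with two hypothetical deduction items that are correlated in the "true" data, and it stands in for item-by-item imputation by reassigning each item's values among the units of the class independently. Each item's distribution within the class, and hence its class total, is reproduced exactly, yet the relationship between the two items at the level of the individual filing unit largely disappears.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_units = 2000          # hypothetical filing units in one income class

# Hypothetical "true" data: two deduction items that move together.
mortgage_interest = [rng.uniform(0, 10_000) for _ in range(n_units)]
property_tax = [0.3 * m + rng.gauss(0, 500) for m in mortgage_interest]

# Stand-in for item-by-item imputation: each item is reassigned among the
# units of the class independently of the other item. Marginal distributions
# and class totals are preserved; the pairing across items is not.
imputed_mortgage = rng.sample(mortgage_interest, n_units)
imputed_property_tax = rng.sample(property_tax, n_units)

true_corr = pearson(mortgage_interest, property_tax)
imputed_corr = pearson(imputed_mortgage, imputed_property_tax)

print("correlation in true data:   ", round(true_corr, 3))
print("correlation after imputation:", round(imputed_corr, 3))
```

Comparisons across income classes depend only on the class-level distributions, which this procedure leaves intact. Comparisons among units within a class depend on the joint distribution, which is why horizontal equity analyses are the more exposed to this kind of distortion.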