Suggested Citation:"Appendix: Microsimulation Models, Databases, and Modeling Terms." National Research Council. 1991. Improving Information for Social Policy Decisions -- The Uses of Microsimulation Modeling: Volume I, Review and Recommendations. Washington, DC: The National Academies Press. doi: 10.17226/1835.

Appendix
Microsimulation Models, Databases, and Modeling Terms

This appendix details the characteristics of selected social welfare policy microsimulation models, including DYNASIM2, HITSM, MATH, MRPIS, PRISM, the long-term health care PRISM submodel, the model of the Office of Tax Analysis, and TRIM2; the major databases that the models use; and the technical terms used in microsimulation modeling. Entries include references for additional information. Terms are defined with reference to their application in social welfare policy modeling.

MODELS

DYNASIM2 (Dynamic Simulation of Income Model 2)

Type Public-use dynamic model for retirement income and other social processes and policies.

Supplier The Urban Institute; original DYNASIM developed in early 1970s, and DYNASIM2 developed in early 1980s.

Major Users Congressional Budget Office; U.S. Department of Labor; in the U.S. Department of Health and Human Services, ASPE, SSA, Administration on Aging, National Institute of Child Health and Human Development.

Programs Simulated SSI, social security, employer pensions, individual retirement accounts, social security payroll tax, federal income tax.

Main Database Exact match of March 1973 CPS with social security earnings records for 1937-1972 (quarters of coverage) and 1951-1972 (taxable earnings).

Major Database Enhancements Cross-sectional imputation of home ownership and income from financial assets for older families, based on 1984 Survey of Consumer Finance, and of health status and institutionalization for older people, based on 1984 Supplement on Aging and 1984 Long-Term Care Survey.

Projection Strategy Uses dynamic aging to develop longitudinal histories for all sample individuals to any future year. Events simulated include birth, death, first marriage, divorce, remarriage, work disability, education, migration (by region and size of place), wage rate, labor force participation, hours of work, unemployment, job change, industry movement, social security coverage, pension coverage. Uses many data sources to estimate transition probabilities, including vital statistics, decennial census, BLS Surveys of Defined Benefit Plans, May CPS (selected years), Panel Study of Income Dynamics.
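Dynamic aging of this kind can be sketched as a year-by-year Monte Carlo loop over each sample person. The sketch below is only an illustration under stated assumptions: the probabilities are hypothetical placeholders rather than DYNASIM2 estimates, the names `age_one_year` and `simulate_history` are invented for this example, and only death, marriage, and divorce are covered, whereas the model simulates many more events.

```python
import random

# Hypothetical annual transition probabilities (illustrative values only).
# A real model would estimate these from vital statistics, census data,
# and panel surveys, usually as functions of many personal characteristics.
DEATH_PROB = {"under_65": 0.005, "65_plus": 0.05}
MARRIAGE_PROB = 0.08  # applied to unmarried adults
DIVORCE_PROB = 0.02   # applied to married adults


def age_one_year(person, rng):
    """Advance one person by one simulated year, modifying the dict in place."""
    if not person["alive"]:
        return
    person["age"] += 1
    band = "65_plus" if person["age"] >= 65 else "under_65"
    # Each event occurs when a uniform random draw falls below its probability.
    if rng.random() < DEATH_PROB[band]:
        person["alive"] = False
        return
    if person["married"]:
        if rng.random() < DIVORCE_PROB:
            person["married"] = False
    elif person["age"] >= 18 and rng.random() < MARRIAGE_PROB:
        person["married"] = True


def simulate_history(person, years, seed=0):
    """Return one snapshot per simulated year (a longitudinal history)."""
    rng = random.Random(seed)
    history = []
    for _ in range(years):
        age_one_year(person, rng)
        history.append(dict(person))
    return history
```

Calling `simulate_history({"age": 30, "married": False, "alive": True}, 10)` yields ten yearly snapshots for one person; a full model would repeat this for every sample record and then calibrate the aggregate results to external projections.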

Behavioral Responses Simulated Basic participation decision (for SSI) and decision to retire and accept public or private retirement benefit, but no feedback effects of simulated future program changes (e.g., on hours of work or savings behavior).

Calibration of Yearly Histories Generally uses most recent alternative II-B assumptions from the OASDI trustees report to calibrate demographic and economic aggregates and BLS projections to calibrate employment aggregates.

Computer Implementation Hardware is IBM mainframe or VAX minicomputer; software is FORTRAN.

References Johnson, Wertheimer, and Zedlewski (1983); Johnson and Zedlewski (1982); Orcutt et al. (1980); Ross (in Volume II); Zedlewski (1990).

HITSM (Household Income and Tax Simulation Model)

Type Proprietary static income-support and tax program model.

Supplier Lewin/ICF, Inc. (formerly ICF, Inc.); developed in mid-1980s.

Major Users U.S. Department of Health and Human Services.

Programs Simulated AFDC, SSI, food stamps, energy assistance, unemployment insurance, in-kind benefits from Medicare, Medicaid, housing assistance, and school lunch programs, federal income tax, social security payroll tax, state income tax, sales tax.

Main Database CPS March income supplement.

Major Database Enhancements Statistical match of the 1987 March CPS, the 1984 SOI Individual Income Tax Return Sample, and the 1984 Consumer Expenditure Survey (CES); allocation of yearly income and employment to weeks and months (see Citro and Ross, in Volume II); adjustment for hours worked and earnings to match BLS and Census Bureau control totals; adjustment for underreporting of unearned income (except for AFDC and SSI, which are simulated by the model) by a hot deck to select additional recipients followed by scaling of income amounts to match control totals; imputation of value of employer-provided health insurance.
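The underreporting adjustment described above (a hot deck to add recipients, then scaling to control totals) might look roughly like the following. This is a simplified sketch, not the HITSM procedure: it assumes a single imputation cell with donors drawn at random, whereas production hot decks typically match donors and recipients on characteristics. The function name, record fields, and targets are all hypothetical.

```python
import random


def hot_deck_and_scale(records, target_recipients, target_total, seed=0):
    """Adjust one income source for underreporting, modifying records in place.

    records: list of dicts with 'weight' and 'amount' (0 means none reported).
    target_recipients: control total for the weighted number of recipients.
    target_total: control total for the weighted dollar amount.
    Returns the scaling factor applied in the second step.
    """
    rng = random.Random(seed)
    donors = [r for r in records if r["amount"] > 0]
    if not donors:
        raise ValueError("hot deck needs at least one reporting record")
    nonreporters = [r for r in records if r["amount"] == 0]
    rng.shuffle(nonreporters)

    # Step 1: hot deck. Give randomly chosen nonreporters the amount of a
    # randomly chosen donor until the weighted recipient count reaches target.
    recipients = sum(r["weight"] for r in donors)
    for r in nonreporters:
        if recipients >= target_recipients:
            break
        r["amount"] = rng.choice(donors)["amount"]
        r["imputed"] = True
        recipients += r["weight"]

    # Step 2: scale every amount so the weighted total matches the control.
    total = sum(r["weight"] * r["amount"] for r in records)
    factor = target_total / total
    for r in records:
        r["amount"] *= factor
    return factor
```

Separating the two steps keeps the recipient count and the dollar total independently controllable, which is the reason the text lists them as distinct operations.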

Projection Strategy Static aging of demographic characteristics, employment, and income amounts is integrated into the model and always carried out prior to program simulations.

Behavioral Responses Simulated Basic program participation decision (see Citro and Ross, in Volume II).

Calibration of Baseline Simulations AFDC, SSI, and food stamp participants are controlled to administrative data on national characteristics and to state caseloads for SSI and, optionally, for AFDC.

Computer Implementation Hardware is mainframe IBM; software is FORTRAN.

References ICF, Inc. (1987); Lewin/ICF, Inc. (1988).

MATH (Micro Analysis of Transfers to Households)

Type Public-use static income-support program model.

Supplier Mathematica Policy Research, Inc.; developed in mid-1970s, based on an early version of TRIM, and redesigned for processing efficiency in late 1970s and again in mid-1980s.

Major Users Food and Nutrition Service, U.S. Department of Agriculture; applications also performed for Congressional Research Service, U.S. Department of Labor, U.S. Department of Health and Human Services.

Programs Simulated AFDC, SSI, general assistance, food stamps, federal income tax, social security payroll tax; the food stamp module is the best developed (Medicaid and state tax modules were developed in the 1970s but are no longer used).

Main Database CPS March income supplement.

Major Database Enhancements Allocation of yearly income and employment to months based on the 1979 ISDP (see Citro and Ross, in Volume II); imputation of child care expenses based on the 1985 SIPP panel, shelter expenses based on the 1983 AHS, out-of-pocket medical expenses based on the 1980-1981 CES, asset balances based on the 1985 SIPP panel, federal income tax deductions. There is an optional procedure to adjust for income underreporting.

Projection Strategy Usual practice is to age a recent March CPS file forward about 4 years, using static procedures to adjust demographic, economic, and employment characteristics; process is repeated every 2 or 3 years.

Behavioral Responses Simulated Basic participation decision in AFDC, SSI, general assistance, and food stamp program and participation response to change in food stamp benefits (see Citro and Ross, in Volume II); modules to simulate labor supply response to income benefits and take-up rates for public service jobs programs were developed in the late 1970s but are no longer used.

Calibration of Baseline Simulations Food stamp participants are controlled to administrative data on national characteristics. AFDC and SSI participants are controlled on the basis of regional characteristics, with optional state controls.

Computer Implementation Hardware is mainframe IBM; software is FORTRAN, Assembly (can output an analysis file to a personal computer and create tables using any statistical package).

References Beebout (1980); Doyle and Trippe (1989); Doyle et al. (1990).

MRPIS (Multi-Regional Policy Impact Simulation)

Type Public-use hybrid second-round effects model for income support and tax programs and other public- and private-sector economic changes.

Supplier Social Welfare Research Institute, Boston College; developed in early and mid-1980s.

Major Users U.S. Department of Health and Human Services; state of Massachusetts; private firms and nonprofit associations.

Programs Simulated AFDC, unemployment insurance, federal income tax, social security payroll tax, state income tax.

Main Databases For the household sector, March CPS; for the product market sector, CES data calibrated with NIPA consumption data; for the industrial sector, the 1977 multiregional input-output accounts; for the labor market sector, matched CPS files.

Structure The model consists of four interrelated sectors, household, product market, industrial, and labor market; it does not include a capital market, investment behavior, or a detailed financial sector. It is a short- or intermediate-term partial equilibrium model that simulates alternative economic states, but does not simulate the time path from the baseline to any alternative. It assumes elastic supplies of commodities, labor, and capital, but fixed prices and wage rates.

The household sector is microsimulation based. It simulates tax and transfer programs and then calculates the changes for gross and disposable family income. The change values are aggregated into 20 gross family income categories by region—50 states and the District of Columbia.

The product market sector is cell based. It uses marginal propensities to consume to apportion income changes by income class and region into current savings and consumption and then uses marginal budget shares to allocate the changes in consumption to 73 categories.
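For a single income class and region cell, the two-step apportionment can be sketched as below. The function name, the marginal propensity to consume, and the three budget categories are hypothetical stand-ins (the model itself uses 73 consumption categories), so this is an illustration of the arithmetic rather than the MRPIS code.

```python
def allocate_income_change(delta_income, mpc, budget_shares):
    """Split an income change into savings and consumption by category.

    delta_income: change in disposable income for one income/region cell.
    mpc: marginal propensity to consume for that cell (between 0 and 1).
    budget_shares: marginal budget shares by category; must sum to 1.
    """
    if not 0.0 <= mpc <= 1.0:
        raise ValueError("mpc must lie between 0 and 1")
    if abs(sum(budget_shares.values()) - 1.0) > 1e-9:
        raise ValueError("marginal budget shares must sum to 1")
    # First step: apportion the change between consumption and savings.
    consumption = delta_income * mpc
    savings = delta_income - consumption
    # Second step: spread the consumption change across categories.
    by_category = {name: consumption * share
                   for name, share in budget_shares.items()}
    return savings, by_category
```

With an income change of 1,000, an assumed propensity to consume of 0.8, and assumed shares of 0.5, 0.3, and 0.2, the cell saves about 200 and spends about 400, 240, and 160 in the three categories.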

The industrial sector is input-output based. It translates changes in demand from the household sector and from a simulation of government demand into total change in output—direct and indirect—needed to satisfy that demand for 124 industrial categories and 51 regions.
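The direct and indirect output requirement in an input-output framework is conventionally the solution x of x = d + Ax, where d is the change in final demand and A is the matrix of technical coefficients. The sketch below solves this by repeated rounds of requirements; the source does not say how MRPIS solves its system, so the iterative method, the function name, and the two-industry coefficients in the usage note are assumptions for illustration only.

```python
def total_output_change(A, demand_change, tol=1e-12, max_rounds=10_000):
    """Solve x = d + A x by accumulating successive rounds of requirements.

    A[i][j]: input required from industry i per unit of output of industry j.
    demand_change: change in final demand by industry (the direct effect).
    Returns the total (direct plus indirect) change in output by industry.
    """
    n = len(demand_change)
    x = list(demand_change)
    for _ in range(max_rounds):
        # Output needed = final demand + inputs needed to produce current x.
        new_x = [demand_change[i] + sum(A[i][j] * x[j] for j in range(n))
                 for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    raise RuntimeError("iteration did not converge; check the coefficients")
```

For example, with `A = [[0.2, 0.1], [0.3, 0.4]]` and a demand change of `[10.0, 0.0]`, the total output change is approximately 13.33 in the first industry and 6.67 in the second, so the indirect requirements exceed the original demand change in the second industry entirely.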

The labor market sector is cell based. It translates changes in industry output into labor demand and then allocates hours of labor demand by industry and occupation—11 and 8 categories, respectively—and region to the individuals in the household sector, using transition probabilities for the propensity of members of various demographic groups to change hours worked or industry or occupation. Changes in hours worked are then translated into changes in wage income that produce the basic input for subsequent (multiplier) iterations of the model.

Computer Implementation Hardware is mainframe IBM; software is FORTRAN.

References Havens et al. (1985); Havens and Clayton-Matthews (1989); Social Welfare Research Institute (no date).

PRISM (Pension and Retirement Income Simulation Model)

Type Public-use dynamic retirement income program model.

Supplier Lewin/ICF, Inc. (formerly ICF, Inc.); developed as proprietary model in 1980 for the President's Commission on Pension Reform.

Major Users U.S. Department of Labor; U.S. Department of Health and Human Services; Social Security Advisory Council; Congressional Budget Office.

Programs Simulated SSI, social security, employer pensions, individual retirement accounts, federal income tax, social security payroll tax, state income tax.

Main Database Exact match of March 1978 CPS with social security earnings records for 1937-1977 (quarters of coverage) and 1951-1977 (taxable earnings), augmented with pension and other data from exact matches to the March and May 1979 CPS; second major database contains detailed information on a sample of public and private retirement plan sponsors.

Major Database Enhancements Imputation, based on regression equations and statistical matching, for missing pension, labor force, or earnings variables on 8,000 of 28,000 adult records in the CPS-SSA database.

Projection Strategy Uses dynamic aging to develop longitudinal histories to the year 2025, for adults. Events simulated include death, birth of child, marriage, divorce, disability, nursing home use, annual hours of work, hourly wage, job change, industry, pension coverage, pension plan assignment, pension acceptance, social security acceptance, IRA adoption and contributions. Uses many data sources to estimate transition probabilities, including vital statistics, matched March CPS files, and January, March, and May CPS (selected years).

Behavioral Responses Simulated Basic participation decision for SSI and decision to retire—partial as well as full retirement—and accept public or private retirement benefit, but no feedback effects of simulated future program changes (e.g., on hours of work or savings behavior).

Calibration of Yearly Histories Uses alternative II-B assumptions from the OASDI trustees report to calibrate demographic aggregates, average wage, unemployment, interest rates, inflation rates, and wage growth; uses BLS projections to control employment levels by industry and labor force participation; also uses output from the ICF Macroeconomic-Demographic Model (or other macroeconomic models) to control labor force participation and earnings.

Computer Implementation Hardware is mainframe IBM; software is FORTRAN.

References Kennell and Sheils (1986, 1990); Ross (in Volume II).

Long-Term Care Financing Model (Submodel of PRISM)

Type Public-use dynamic model of long-term care utilization and financing for the elderly.

Supplier Lewin/ICF, Inc. (formerly ICF, Inc.), and the Brookings Institution; submodel was added to PRISM in 1986; originally model was proprietary.

Major Users U.S. Department of Health and Human Services; state of Hawaii; foundations.

Programs Simulated Alternative public and private insurance and other programs for financing long-term care for the elderly.

Main Database PRISM database (see description of PRISM).

Major Database Enhancements Statistical match with the 1984 SIPP panel, adjusted for inflation, to assign asset records to all elderly people.

Projection Strategy Uses dynamic aging to develop longitudinal histories to the year 2025 for adults age 65 and over. The PRISM component handles aging of demographic, employment, income, and assets variables; the submodel adds aging of disability status, use of nursing home and home health care services, and public and private sources of financing long-term care. Data sources for transition probabilities for the submodel include the 1984 SIPP panel, the 1982-1984 National Long-Term Care Survey, the 1985 National Nursing Home Survey, and Medicare and Medicaid administrative data from HCFA.

Behavioral Responses Simulated See description of PRISM; the submodel simulates behavioral responses to changes in health care financing programs (such as increased utilization in response to more generous benefits) through user-specified parameters.

Calibration of Yearly Histories See description of PRISM.

Computer Implementation Hardware is mainframe IBM or VAX minicomputer; software is FORTRAN.

References Kennell and Sheils (1986); Kennell et al. (1988); Rivlin and Wiener (1988).

Treasury Individual Income Tax Simulation Model (OTA Model)

Type In-house static tax policy model.

Supplier Office of Tax Analysis, U.S. Department of the Treasury; developed in early 1970s, based on work at the Brookings Institution extending back to the early 1960s.

Major Users Office of Tax Analysis; Joint Committee on Taxation, U.S. Congress.

Programs Simulated Federal income tax, social security payroll tax.

Main Database Statistics of Income samples of individual income tax returns.

Major Database Enhancements Imputations of deductible expenses for nonitemizing taxpayers and earnings attributable to husbands and wives; statistical match with the CES to obtain information on consumption; statistical match with the CPS March income supplement to obtain sources of income not currently subject to taxation, links between taxpayers and family or household units, and information on low-income people not required to file a return under current law; imputations to simulate the 1986 Tax Reform Act and the 1989 Omnibus Budget Reconciliation Act, simulate catastrophic health insurance coverage under Medicare, and construct a broad concept of family economic income.

Projection Strategy Includes a two-stage static routine to update and project the database for a total of 10 years, with three alternatives representing different expected rates of economic growth. The first stage applies growth factors on each dollar amount to reflect actual and projected per capita real growth and inflation; the second stage adjusts weights of each family head to hit aggregate targets for 33 different variables, such as adjusted gross income class and type of return.
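The two-stage routine can be illustrated with a much-reduced sketch: a single growth factor in the first stage and a single target variable (type of return) in the second. The actual model uses separate factors for each dollar amount and adjusts weights to 33 targets simultaneously, which requires a more elaborate reweighting algorithm; the function name, field names, and numbers here are hypothetical.

```python
def static_age(returns, growth_factor, targets_by_type):
    """Two-stage static aging sketch; returns new records, input unchanged.

    returns: list of dicts with 'return_type', 'weight', and 'agi'.
    growth_factor: combined real-growth and inflation factor for stage one.
    targets_by_type: projected weighted count of returns by return type.
    """
    # Stage 1: apply the growth factor to each dollar amount.
    aged = [{**r, "agi": r["agi"] * growth_factor} for r in returns]

    # Stage 2: ratio-adjust weights so weighted counts hit each target.
    current = {}
    for r in aged:
        current[r["return_type"]] = current.get(r["return_type"], 0.0) + r["weight"]
    for r in aged:
        r["weight"] *= targets_by_type[r["return_type"]] / current[r["return_type"]]
    return aged
```

Keeping the stages separate mirrors the description above: the first changes what each record reports, and the second changes how many units in the population each record represents.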

Behavioral Responses Simulated The decision to itemize is simulated. There is also an optional special adjustment in the aging procedures to simulate a behavioral response of taxpayers to the increase in the top tax rate on long-term capital gains resulting from the 1986 Tax Reform Act. Other behavioral responses are generally estimated outside the model.

Calibration of Baseline Simulations Not needed, as the database is a sample of actual tax returns filed. The model's simulation of tax liability is checked for agreement with reported liability, and edits are performed for the small percentage of returns with discrepancies.

Computer Implementation Hardware is VAX minicomputer; software is FORTRAN.

References Cilke and Wyscarver (1990); Gillette (1989).

TRIM2 (Transfer Income Model 2)

Type Public-use static income-support and tax program model.

Supplier The Urban Institute; original TRIM developed in early 1970s, and TRIM2 developed in the late 1970s.

Major Users Office of the Assistant Secretary for Planning and Evaluation, U.S. Department of Health and Human Services; Congressional Budget Office; applications also performed for U.S. Department of Labor.

Programs Simulated AFDC, SSI, food stamps, school nutrition programs, Medicare, Medicaid (for the noninstitutionalized population), employer-sponsored health insurance, federal income tax, social security payroll tax, state income tax. The AFDC, SSI, and tax modules are the best developed.

Main Database CPS March income supplement.

Major Database Enhancements Allocation of yearly income and employment to months (see Citro and Ross, in Volume II); imputation of child care and work-related deductions for AFDC from CES data; imputation of deductible expenses for the food stamp program (dependent care, shelter, and medical) based on the 1976 and 1978 Surveys of Characteristics of Food Stamp Households; imputation of deductions and other variables for simulating income taxes; estimation of Medicaid costs based on HCFA tape-to-tape and other data; match of various surveys of employment-based health insurance plans to employees on the March CPS.

Projection Strategy Contains static aging routines; however, usual practice is to simulate current programs and alternatives on the latest available database and apply percentage differences to independently developed projections for current program law.
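The percentage-difference practice amounts to simple arithmetic, sketched here with hypothetical figures and an invented function name: the model supplies only the relative effect of the alternative, and the level comes from the independent projection.

```python
def apply_percentage_difference(simulated_current, simulated_alternative,
                                projected_current):
    """Project an alternative policy's cost from simulated relative change.

    simulated_current: model estimate for current law on the latest database.
    simulated_alternative: model estimate for the alternative, same database.
    projected_current: independently developed projection for current law.
    """
    pct_difference = (simulated_alternative - simulated_current) / simulated_current
    return projected_current * (1.0 + pct_difference)
```

If the model simulates 20.0 under current law and 22.0 under the alternative (a 10 percent increase), and the independent current-law projection is 25.0, the projected alternative is about 27.5.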

Behavioral Responses Simulated Basic program participation decision (see Citro and Ross, in Volume II) and also the decision to itemize taxes; a module to simulate labor supply response to income benefits is under development.

Calibration of Baseline Simulations AFDC participants are controlled to administrative data on state caseloads and a few national characteristics; SSI and food stamp participants are controlled to administrative data on national characteristics and some state-level targets, including number of units; federal income tax deductions and capital gains are calibrated to IRS totals by adjusted gross income class; Medicaid caseloads are calibrated to state-level enrollment and cost data.

Computer Implementation Hardware is mainframe IBM; software is FORTRAN, Assembly (creates output files for standard packages such as SAS).

References Webb et al. (1982, 1986); Webb, Michel, and Bergsman (1990); see also Cohen et al. (in Volume II).

DATABASES

Current Population Survey (CPS) March Income Supplement The CPS is a continuing monthly cross-sectional survey of a large sample of U.S. households. The survey's primary purpose is to collect data on labor force status for people aged 15 and older to permit determining the monthly unemployment rate for the nation and large states; it also provides annual average unemployment rates for all states. In most months, the survey includes supplemental questions on other topics; for over 40 years, the March CPS has included an extensive supplement of questions on income and employment status during the previous calendar year. The Bureau of Labor Statistics is the major sponsor of the CPS; the Census Bureau sponsors the March income and some other supplements, and other agencies occasionally provide funding for special supplements.

The current CPS sample size is about 60,000 households, and the March supplement includes another 2,500 households with at least one adult of Hispanic origin as of the previous November interview, plus a small number of households of Armed Forces members. Each household (more precisely, each address) is in the sample for 4 months, out for 8 months, and in for another 4 months. Information is obtained for all residents found at the interview; out-movers are not followed. Data collected in the regular interview include demographic characteristics; labor force participation, hours worked, reason for part-time work, reason for temporary absence from job, industry and occupation in the prior week; job search behavior in the previous four weeks if not working and when last worked; usual hours and usual earnings, union membership, reasons left last job, reasons not looking for work (for selected rotation groups). Data collected in the March supplement include labor force participation and job history in the prior calendar year; annual income for the prior year by detailed source, including earnings, self-employment, public and private transfers, and assets; participation in noncash benefit programs, including energy assistance, food stamps, public housing, school lunch; and health insurance coverage. See Chapter 5 and Citro (in Volume II) for uses of March CPS data for income support program and other policy modeling; see also Bureau of the Census (1990a).
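The 4-months-in, 8-months-out, 4-months-in rotation described above can be made concrete with a short sketch. Months are counted from 0 at the address's first interview; the function names are hypothetical.

```python
def months_in_sample(first_month):
    """List the eight month indexes in which an address is interviewed."""
    first_spell = [first_month + k for k in range(4)]        # in for 4 months
    second_spell = [first_month + 12 + k for k in range(4)]  # back after 8 out
    return first_spell + second_spell


def in_sample(first_month, month):
    """True if an address first interviewed in first_month is in sample."""
    return month in months_in_sample(first_month)
```

`months_in_sample(0)` returns `[0, 1, 2, 3, 12, 13, 14, 15]`: the second spell covers the same calendar months one year later, which is what allows records for the same address to be matched across consecutive March files.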

Current Population Survey-Social Security Administration (CPS-SSA) Exact-Match Files CPS-SSA exact-match files, prepared by the Census Bureau and SSA, are matches of records in the March CPS with social security administrative records of quarters of coverage and taxable earnings under the OASDI program. The records are linked for the same individuals on the basis of social security number and other information. Two such files, the 1973 and 1978 CPS-SSA exact matches, have been used for policy analysis and research. (The 1973 file also included a match to IRS tax return data.) No CPS-SSA exact matches have been carried out since 1978, largely because of concerns about confidentiality. An exact match of the 1984 SIPP panel and SSA records was recently completed for research use by employees of SSA who were sworn in as Census Bureau agents. See Chapter 8 and Ross (in Volume II) for uses of CPS-SSA exact-match files for retirement income modeling; see also Subcommittee on Matching Techniques (1980).

Integrated Quality Control System (IQCS) The IQCS includes samples of administrative case records for the AFDC, food stamp, and Medicaid programs that are drawn each month by the states for use in evaluating the accuracy of the determination of eligibility and benefits for these programs. Sample sizes vary by state; total average monthly sample sizes are about 6,000 cases for AFDC and 7,500 cases for food stamps. Data abstracted from the records include case information (e.g., most recent opening, number of case members, gross countable income, net countable income); demographic characteristics for each person (e.g., relationship to head of household, age, sex, race, employment status); total household income by household member and type and amount of income. See Chapters 5 and 9, Citro (in Volume II), and Citro and Ross (in Volume II) for uses of IQCS data for calibrating and evaluating income-support program models; see also Family Support Administration (1988); Food and Nutrition Service (1988).

1977 National Medical Care Expenditures Survey (NMCES), 1980 National Medical Care Utilization and Expenditure Survey (NMCUES), 1987 National Medical Expenditure Survey (NMES) The NMCES, sponsored by the National Center for Health Services Research with the National Center for Health Statistics, consisted of six rounds of data collection covering an 18-month period in 1977 and part of 1978 for a sample of 14,000 households. Surveys were also conducted of physicians and health care facilities providing care to members of the household sample during 1977 and of employers and insurance companies responsible for their insurance coverage. Data collected included expenditures and sources of payment for all major forms of medical care, demographic and socioeconomic characteristics of respondents, insurance coverage of respondents, information from medical providers about respondents, and access to medical care.

The NMCUES, sponsored by the National Center for Health Statistics with the Health Care Financing Administration, consisted of five rounds of data collection over a 15-month period for a national sample of 6,000 households and samples of 1,000 Medicaid cases each in New York, California, Texas, and Michigan. Medicare and Medicaid records were checked for the state samples to verify eligibility and obtain claims information. Data collected included health insurance coverage, episodes of illness, number of bed days, restricted activity days, hospital admissions, physician and dental visits, other medical care encounters, prescription purchases, access to medical care services, limitations of activities, income, demographic, and socioeconomic characteristics. For each contact with the medical care system, data were obtained on health conditions, characteristics of the provider, services provided, charges, sources, and amounts of payments.

The NMES, sponsored by the Agency for Health Care Policy and Research with the Health Care Financing Administration, consisted of five rounds of data collection between February 1987 and July 1988 for a sample of 14,000 households, including oversamples of blacks, Hispanics, people aged 65 and older, low-income people, and people with functional limitations. Surveys were also conducted of physicians and health care facilities providing care to members of the household sample during 1987 and of employers and insurance companies responsible for their insurance coverage. The NMES also included an institutional survey of 13,000 residents of nursing and personal care homes, psychiatric hospitals, and facilities for the mentally retarded. The data collected included utilization, expenditures, and sources of payment for all major forms of medical care, demographic and socioeconomic characteristics of respondents, insurance coverage of respondents, information from medical providers about respondents, and access to medical care.

See Chapter 8 for uses of NMCES-NMCUES-NMES data for health care policy modeling.

Statistics of Income (SOI) Individual Income Tax Returns The SOI individual income tax files are samples of tax returns and supporting schedules abstracted each year by the Statistics of Income Division of the IRS from approximately 100 million returns. Sample sizes are about 80,000 returns in even years and 120,000 returns in odd years. The sample is based on such criteria as principal business activity, presence or absence of a schedule, state from which filed, size of adjusted gross income (or loss) or largest specific income (or loss) items, and total assets or size of business and farm receipts. Recently, the design was altered to include an embedded longitudinal sample, that is, to draw a portion of the returns for the same taxpayers from year to year. Data abstracted pertain to taxpayers' income, exemptions, deductions, credits, and taxes owed (because tax laws change, items vary from tax year to tax year). See Chapter 8 for use of the SOI data for tax policy modeling; see also Coleman (1988); Minarik (1980); Statistics of Income Division (no date).

Survey of Income and Program Participation (SIPP) SIPP is an ongoing panel survey of adults aged 15 and older in the civilian, noninstitutionalized population, sponsored by the Bureau of the Census. The survey follows all adults in originally interviewed households and includes children and other adults who reside with original sample members. The first panel began in fall 1983 and completed nine interviews (waves) at 4-month intervals with an initial sample of about 20,000 households. Subsequent panels began in February of each year with initial household sample sizes of about 13,500 (1985); 12,000 (1986-1989); 21,500 (1990); 14,000 (1991). These panels were completed or are planned for eight waves (1985); seven waves (1986, 1987); six waves (1988); three waves (1989); eight waves (1990, 1991).

The data collected for each interview include demographic characteristics; monthly information on labor force participation, job characteristics, and earnings; monthly information on detailed sources and amounts of income from public and private transfer payments, noncash benefits (such as food stamps, Medicaid, Medicare, and health insurance coverage); and information for the 4-month period on income from assets. Data are also collected in topical modules, which are asked once or twice in one or more panels, that cover a wide range of subjects, including: annual income and income taxes; child care and child support; educational financing and enrollment; eligibility for selected programs; employee benefits (1984 panel only); health and disability; housing costs and finance; individual retirement accounts; personal history (fertility, marital status, migration, welfare recipiency, and other topics); and wealth (property, retirement expectations and pension plan coverage, assets and liabilities). In

addition, each panel includes a topical module with variable content designed to respond to the needs of policy analysis agencies. See Chapter 5 and Citro (in Volume II) for uses of SIPP data for income support program and other policy modeling; see also Allin and Doyle (1990); Committee on National Statistics (1989); Jabine, King, and Petroni (1990); Vaughan (1988).

SIPP was preceded by the Income Survey Development Program (ISDP), sponsored by ASPE and SSA. The ISDP conducted research on the design of a new income survey and sponsored several data collection efforts, including the 1979 ISDP Research Panel. The 1979 ISDP obtained data similar to SIPP for an initial sample of about 9,500 households (including oversamples of low-income and high-income households), who were interviewed six times at 3-month intervals; see David (1983).

MODELING TERMS

Aging This is the process of updating a database to represent current conditions or of projecting a database for 1 or more years to represent expected future conditions. See Chapter 6; Caldwell (1989); Cohen et al. (in Volume II); Ross (in Volume II).

Dynamic aging generally involves generating a database year by year through applying transition probabilities to the individual records in a cross-sectional database and recording the results of each year's simulation on the records. The result is an enhanced database that contains longitudinal histories: that is, values for each individual for each year of the simulation period. For any one year, the database can provide a cross-sectional representation of the population. For people in the sample each year, the process involves updating their age by one and changing many other variables according to outside, econometrically estimated transition probabilities and dynamic micro equations. The process includes creating new records for people who are simulated to be born and setting variables to null values for people who are simulated to die. For example, DYNASIM2 dynamically ages the following characteristics of the records in the database: birth, death, first marriage, remarriage, divorce, work disability, education, migration, wage rate, labor force participation, hours of work, unemployment, job change, industry movement, and pension coverage. Dynamic models typically calibrate their simulated longitudinal histories using aggregate population and economic growth assumptions from outside sources such as the Social Security Actuary's trust fund model. Dynamic aging is usually used to generate histories over 30 years or more for modeling retirement income programs and other policy issues with long time horizons; however, the method can be used to generate histories for any period, short or long.
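
The year-by-year procedure described above can be sketched as follows. This is only an illustration, not the DYNASIM2 implementation: the function name, the record fields (`age`, `sex`, `alive`, `history`), and the two probability functions are hypothetical, and only births and deaths are simulated. An `alive` flag stands in for setting variables to null values at death.

```python
import random

def dynamic_age(records, years, death_prob, birth_prob, seed=0):
    """Age a cross-sectional sample year by year (illustrative sketch).

    death_prob and birth_prob are caller-supplied functions that return a
    transition probability for one record; real models estimate these
    econometrically from outside data.
    """
    rng = random.Random(seed)
    for year in range(years):
        newborns = []
        for person in records:
            if not person["alive"]:
                continue
            person["age"] += 1  # update age by one
            # Monte Carlo draw against each transition probability.
            if rng.random() < death_prob(person):
                person["alive"] = False
            elif person["sex"] == "F" and rng.random() < birth_prob(person):
                newborns.append(
                    {"age": 0, "sex": rng.choice("FM"), "alive": True, "history": []}
                )
            # Record this year's result so the file holds a longitudinal history.
            person["history"].append((year, person["age"], person["alive"]))
        # New records for simulated births join the sample for later years.
        records.extend(newborns)
    return records
```

Selecting the history entries for one year from every record gives the cross-sectional view for that year.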

Static aging generally involves adjusting the weights and selected variables in a cross-sectional database to represent the population in a future year

according to outside projections. For example, the aging routines of the MATH and TRIM2 models reweight the records to match projections of the population by age, race, sex, and household composition (typically using the projections produced by the Census Bureau or the Social Security Actuary from cell-based models); adjust the income variables to match projections of inflation and real income growth (typically using projections from macroeconomic models); and adjust the labor force variables to match expected unemployment rates by age, race, and sex. The MATH and TRIM2 unemployment adjustment algorithm resembles dynamic aging techniques in that employed people are selected to experience unemployment (or vice versa), with other variables adjusted accordingly, on the basis of transition probabilities estimated using panel data. Static aging is typically carried out for a short period, 2-5 years; however, the method can be used to generate a cross-sectional database for any year, no matter how far into the future, provided the needed population and economic projections are available from outside sources.
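
A minimal sketch of the reweighting and income adjustment steps follows. It is not the MATH or TRIM2 routine: the field names (`cell`, `weight`, `income`) and the single income growth factor are assumptions made for illustration, and the unemployment adjustment is omitted.

```python
def static_age(records, projected_totals, income_growth):
    """Reweight records to projected cell totals and scale incomes (sketch).

    projected_totals maps each demographic cell to its projected population;
    income_growth is one combined inflation and real-growth factor.
    """
    # Current weighted population in each cell.
    current = {}
    for r in records:
        current[r["cell"]] = current.get(r["cell"], 0.0) + r["weight"]

    aged = []
    for r in records:
        ratio = projected_totals[r["cell"]] / current[r["cell"]]
        aged.append(
            {**r, "weight": r["weight"] * ratio, "income": r["income"] * income_growth}
        )
    return aged
```

After the call, the weights in each cell sum to the outside projection for that cell, while the records themselves are unchanged apart from income.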

Behavioral Response This term refers to a change in behavior of an individual decision unit, such as a family or hospital, in response to a policy change that, in turn, has feedback effects on program costs and recipients. For example, altering the level of cash or in-kind benefits (e.g., Medicaid) in the AFDC program may affect the work decisions of welfare recipients that, in turn, may increase or reduce AFDC costs and caseloads.

The immediate responses of individual economic units directly affected by a program change are termed first-round behavioral effects. There can also be second-round behavioral effects of a policy change: that is, effects that alter the nature of factor or product markets or the level and distribution of consumption, production, and employment in the economy or in a sector affected by the policy change. For example, a change in a transfer program that alters labor supply may change the wage rate in the labor market and therefore further change labor supply. In addition, in this example, in the short run, prior to an equilibrating change in the wage rate, the unemployment rate may be affected and displacement or replacement effects may occur. See Chapter 6; Burtless (1989); Grannemann (1989); Shoven and Whalley (1984); Strauss (1989); see also description of the MRPIS model, above.

Calibration Calibration is the process of adjusting simulation outputs to approximate control totals from outside sources. For example, yearly simulations from dynamic models such as DYNASIM2 and PRISM are calibrated to accord with the demographic and economic assumptions incorporated in the projections of the Social Security Actuary (see Ross, in Volume II). Also, baseline simulations of income support programs from models such as MATH and TRIM2 are calibrated so that the simulated participants approximate selected totals and characteristics of recipients from program administrative data, including the IQCS. Calibration methods vary. For example, MATH compares tabulations

of eligible food stamp units from a baseline run with tabulations of recipients from administrative data and selects the needed number of participants in each category up to the maximum number of eligible units; if a category has too few eligible units, excess participants will be selected from another category. TRIM2 uses a probit equation to select AFDC participants from among eligible units on the basis of such characteristics as expected benefit level plus several parameters that are adjusted over the course of several runs so that simulated participants approximate caseloads by state and several characteristics of the national caseload. See Citro and Ross (in Volume II).
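
The category-by-category selection described for MATH can be illustrated roughly as below. This is a sketch under stated assumptions, not the MATH algorithm: the function and argument names are hypothetical, units are taken in list order rather than by any ranking or random draw, and the rule for choosing which other category absorbs the excess is simply dictionary order.

```python
def select_participants(eligible_by_cat, target_by_cat):
    """Pick participants per category up to the number of eligible units.

    eligible_by_cat maps a category to a list of eligible unit ids;
    target_by_cat maps a category to the administrative recipient count.
    """
    selected = {}
    shortfall = 0
    for cat, target in target_by_cat.items():
        units = eligible_by_cat.get(cat, [])
        n = min(target, len(units))
        selected[cat] = units[:n]
        shortfall += target - n  # participants this category could not supply

    # Reassign the shortfall to categories that still have spare eligibles.
    for cat, units in eligible_by_cat.items():
        if shortfall == 0:
            break
        chosen = selected.setdefault(cat, [])
        extra = units[len(chosen):][:shortfall]
        chosen.extend(extra)
        shortfall -= len(extra)
    return selected
```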

Cell-Based Model This type of model produces estimates of the effects of a program change for subgroups or cells that make up the population of interest. In a welfare or tax policy model, the cells might be socioeconomic classes, such as poor families headed by single mothers aged 25 to 34. In a model of the supply of health care, the cells might be classes of health care providers, such as hospitals in large central cities. The models use aggregate data corresponding to the cells and apply appropriate parameters, for example, the average tax rate by income class.

Cell-based models range from very limited models with a handful of cells to large systems with thousands of cells. In every case, in contrast to microsimulation models, they make the critical assumption that all elements within a subgroup behave in the same way. Examples of largely cell-based social welfare policy models are the ASPE Health Financing Model (Office of the Assistant Secretary for Planning and Evaluation, 1981) and the Lewin/ICF, Inc. Macroeconomic-Demographic Model (Anderson, 1990; Cartwright, 1989). The Social Security Actuary uses cell-based models to project the people expected to be eligible for social security benefits and those expected to pay social security taxes and to estimate the balances between expenditures and receipts in the social security trust fund. The Census Bureau uses a cell-based model to project the future size and composition of the population, taking into account expected rates of fertility, mortality, and net immigration. See Grummer-Strawn and Espenshade (in Volume II) for a review of studies evaluating the quality of the population projections produced by the Actuary and the Census Bureau, which are used in aging microsimulation model databases.
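
As a small illustration of the cell-based approach, the sketch below applies one average tax rate to the aggregate income of each cell. The cell names, field names, and rates are hypothetical; the point is only that every element in a cell is treated identically.

```python
def cell_based_revenue(cells, rates):
    """Estimate tax revenue per cell from aggregate data (sketch).

    cells maps a cell name to its number of units and mean income;
    rates maps the same cell name to an average tax rate.
    """
    return {
        name: c["units"] * c["mean_income"] * rates[name]
        for name, c in cells.items()
    }
```

A microsimulation model would instead compute the tax for each sampled unit and then sum the weighted results.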

Computable General Equilibrium (CGE) Model This type of model simulates second-round behavioral effects of proposed policy changes (see behavioral response), specifically, the effects of a policy change on prices and quantities in the various markets of an economy, taking into account feedback effects between supply and demand. For example, an increase in transfer benefits may increase product demand in the economy, which in turn has an effect on employment and wages. Or, a decrease in marginal tax rates in the personal income tax may increase work effort, thereby leading to increased labor supply, lower wage rates, and higher employment in the labor market. CGE models are calibrated

by a process of setting parameter values for supply and demand elasticities, drawn from the econometric literature or picked to fit the available data on market prices and quantities. Although CGE models simulate the equilibration of a full set of interconnected markets in an economy, which permits full long-run adjustment of prices to changes in supply and demand, they rarely provide guidance about the time horizon for full adjustment or the dynamic path of the adjustment process. CGE models as generally implemented are macro rather than micro in nature; however, it is possible to develop disaggregated CGE models or to link them with microsimulation models by iterating back and forth between them until market equilibrium is reached. See Berkovec and Fullerton (1989); Shoven and Whalley (1984); Slemrod (1985); Whalley (1988).
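
The calibration idea can be shown with a deliberately tiny example. The sketch below is a single-market toy with constant-elasticity supply and demand curves, not a CGE model; the functional forms and all names are assumptions chosen for illustration. Scale constants are picked so that the curves reproduce one observed price and quantity, and the market is then re-solved after a change.

```python
def calibrate(p0, q0, e_d, e_s):
    """Choose scale constants so both curves pass through (p0, q0).

    Assumed forms: demand Q = a * P**(-e_d), supply Q = b * P**e_s,
    with e_d and e_s taken as given (for example, from the literature).
    """
    a = q0 * p0 ** e_d
    b = q0 / p0 ** e_s
    return a, b

def equilibrium(a, b, e_d, e_s):
    """Solve a * P**(-e_d) = b * P**e_s for price, then quantity."""
    price = (a / b) ** (1.0 / (e_d + e_s))
    return price, b * price ** e_s
```

Scaling `a` upward (a demand increase) and calling `equilibrium` again gives the new price and quantity; a full CGE model repeats this kind of solution across many interconnected markets at once.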

Distributional Analysis This is the term used for tabulations produced by microsimulation models showing the simulation results disaggregated by subgroups of the population, such as households by income class or geographic area. Often, microsimulation models produce tabulations of gainers and losers for alternative policy proposals: that is, which population groups would gain and which would lose by a policy change as compared with the current program. For example, the model might produce tabulations of the number of, say, AFDC recipients under current law and one or more proposed alternatives by age of head (or other characteristic), showing, for each age category, the change in the number, plus or minus, for each alternative in comparison with the current program. Similarly, the model might produce tabulations of the change in average benefit (or tax) amount, up or down, for each category. See, for example, Beebout (1980:Table 2.8).
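
A gainers-and-losers tabulation of this kind might be produced as in the sketch below. The record fields (`baseline`, `alternative`, `weight`) and the function name are hypothetical.

```python
from collections import defaultdict

def gainers_losers(records, key):
    """Weighted counts of gainers, losers, and unchanged units by category.

    Each record carries its benefit (or tax) under the baseline and under
    the alternative, plus a sample weight; key names the grouping field.
    """
    table = defaultdict(lambda: {"gainers": 0.0, "losers": 0.0, "unchanged": 0.0})
    for r in records:
        diff = r["alternative"] - r["baseline"]
        if diff > 0:
            bucket = "gainers"
        elif diff < 0:
            bucket = "losers"
        else:
            bucket = "unchanged"
        table[r[key]][bucket] += r["weight"]
    return dict(table)
```

For a tax rather than a benefit, the sign convention would be reversed, since a lower amount is then a gain to the unit.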

Dynamic Model This term refers to microsimulation models that generate a database of longitudinal histories for a population sample through means of applying transition probabilities to individual records and then use these histories to simulate alternative policies. Such models are able to follow the effects of demographic and economic processes and previous and proposed policy changes (e.g., raising the retirement age for social security). They draw heavily on behavioral research for their many transition probabilities, although current models include relatively few feedback effects of behavioral changes in response to simulated changes in government programs and policies. Dynamic models typically incorporate a hybrid of static and dynamic aging techniques, using dynamic techniques for most but not all variables; DYNASIM2, for example, includes static routines to supply values for assets, disability, and SSI variables to each record. See Kennell and Sheils (1990); Zedlewski (1990); see also aging.

Filing Unit This term is given to the unit of analysis in microsimulation models of income support and tax programs; that is, the unit entitled to apply for benefits or obligated to file a tax return. Filing units differ within and among

programs: for example, tax filing units may include a married couple and their dependent children, or a single person living alone or with others; AFDC filing units may include a single parent and her or his dependent children or a two-parent family in which the head is unemployed or disabled. Filing units often differ from families and households as defined in surveys: for example, an AFDC mother and her children may reside with other relatives who are not part of the filing unit. See U.S. House of Representatives (1990) for detailed information on filing unit definitions and other aspects of the eligibility rules for federal social insurance and public assistance programs.

Hot Deck See imputation.

Imputation Imputation is the process of assigning values to variables in a database record that are missing because of item nonresponse (nonresponse to a survey question or nontranscription of an item in an administrative records system) or because the variable was never included in the survey or administrative records system. See Chapter 5; Citro (in Volume II); and Madow, Olkin, and Rubin (1983).

Imputation procedures for item nonresponse range from very simple to very complex. A simple procedure is to impute the mean value for all people who responded to a particular item to all records that are missing the item. Slightly more complex variants are to impute a mean modified by a stochastic error term, or to impute means, with or without error terms, to categories of nonreporters. The Census Bureau uses very complex item nonresponse imputation methods for household surveys such as the CPS and SIPP, including the hot-deck method and what it refers to as statistical matching. Hot-deck methods assign a nearest neighbor value: that is, the data records are sorted by geographic area and processed sequentially, and reported values are used to update ("hot deck") matrices of characteristics. A record with a missing item has the most recently updated value assigned from the appropriate matrix (e.g., a matrix of earnings for people with specified demographic and occupational characteristics). The Census Bureau's statistical match procedure for item nonresponse (usually for whole groups of items) involves indexing the records by various characteristics that are available for both respondents and nonrespondents and searching for the respondent "donor" who best matches the nonrespondent "host."
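
The sequential hot-deck idea can be sketched as follows. This is a simplified illustration, not the Census Bureau procedure: the function name, the `cell_of` classifier, and the starting values in `seed_values` are assumptions, and the records are assumed to arrive already sorted by geographic area.

```python
def hot_deck(records, item, cell_of, seed_values):
    """Fill missing values of `item` with the most recent reported value.

    cell_of maps a record to its matrix cell (for example, a tuple of
    demographic and occupational characteristics); seed_values supplies a
    starting value per cell in case the first record in a cell is missing.
    """
    deck = dict(seed_values)
    out = []
    for r in records:
        cell = cell_of(r)
        if r[item] is None:
            # Missing item: assign the most recently updated value.
            out.append({**r, item: deck[cell], "imputed": True})
        else:
            # Reported item: update the deck for later records.
            deck[cell] = r[item]
            out.append({**r, "imputed": False})
    return out
```

Because the file is processed in sorted order, the donor value usually comes from a nearby record with the same characteristics.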

Imputation procedures for items not collected also vary from simple to complex. A simple procedure is to impute a mean amount for a missing variable based on tabulations from another data source. A more complex procedure is to use another data source to estimate regression equations that include independent variables common to both sources. These equations are then run with the estimated coefficients and the values of the independent variables in the records requiring imputation in order to estimate values of the dependent variable to impute to the records with missing data.
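
The regression approach can be illustrated with a single independent variable. The sketch below fits an ordinary least squares line on the donor source and applies it to the host records; the function names and field names are hypothetical, and a real application would use several common variables and often add a stochastic error term.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def impute_from_regression(host, intercept, slope, x_name, y_name):
    """Attach a predicted value of y_name to each host record."""
    return [{**r, y_name: intercept + slope * r[x_name]} for r in host]
```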

Macroeconomic Model This type of model is used to produce forecasts of the state of the economy, including such aggregates as GNP, inflation, and unemployment. These forecasts play a role in policy formation in both the public and private sectors. Macroeconomic models are also used to simulate the effects of proposed government fiscal and budgetary policy changes on economic aggregates. Macroeconomic models, which have been developed by private firms such as DRI, academic researchers, and government economists, are composed of systems of simultaneous equations estimated with aggregate time-series data. Analysts supply expected values for exogenous factors, including policy variables such as tax rates and government spending. The models then use these values as input to their systems of equations to determine the impact on economic aggregates such as GNP and personal consumption expenditures.

Macroeconomic models vary in size and complexity. They have been developed for personal computers as well as for large mainframes, and they can include relatively few to thousands of equations based on relatively few to many thousands of time series. They can also produce a small number of outputs for the national economy or many outputs, disaggregated by such attributes as industry and region. Their distinguishing characteristic is that they model the behavior of aggregates, such as all consumers in the nation or in a region or state, not individual decision units. See Klein (1986); Kraemer et al. (1987); McNees (1989, 1990).

Matching Matching is the process of appending entire records (or subsets of variables) from one or more donor files to a host file to obtain values for items not collected for the host file, a procedure that is generally used when large numbers of variables are involved. See Chapters 5 and 8; Cohen (Chapter 2 in Volume II); and Subcommittee on Matching Techniques (1980).

Exact matching uses a unique identifier common to the data sets being matched, such as social security number (SSN). Other common information, such as age and sex, is also typically used to validate the quality of a match.
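
An exact match with a validation step might look like the sketch below. The field names (`ssn`, `age`, `sex`) and the decision to set aside matches that fail validation are assumptions for illustration.

```python
def exact_match(host, donor, key="ssn", check=("age", "sex")):
    """Join donor records to host records on a unique identifier.

    Returns the merged records plus the identifiers whose common fields
    disagreed, which may indicate a reporting or keying error.
    """
    index = {d[key]: d for d in donor}
    matched, rejected = [], []
    for h in host:
        d = index.get(h[key])
        if d is None:
            continue  # no donor record carries this identifier
        if all(h[c] == d[c] for c in check):
            matched.append({**d, **h})  # host values win on shared fields
        else:
            rejected.append(h[key])
    return matched, rejected
```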

Statistical matching is carried out on two or more data sets when they share variables in common—such as age, sex, and income—but lack a common unique identifier or come from nonoverlapping samples. In some cases, statistical matches have been performed when it was theoretically possible but not feasible—for confidentiality or other reasons—to carry out exact matches. Statistical matching is a complex procedure that classifies records in two files by variables that they share in common, then uses an algorithm to select the best match from the donor file for each host record and extracts variables from the donor file to attach to the host file records. Typically, the validity of a statistical match rests on the assumption of conditional independence, namely, that all of the information about the relationship between the variables that are

unique to the donor file(s) and the variables that are unique to the host file is contained in the common set of variables.
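
A bare-bones statistical match is sketched below: records are classified by shared categorical variables, and within a class the donor closest on one shared numeric variable is chosen. The function and argument names are hypothetical, and production procedures use more elaborate distance functions and often constrain how many times a donor may be used.

```python
def statistical_match(host, donor, class_vars, distance_var, donor_only):
    """Attach donor-only variables from the best-matching donor record.

    class_vars must agree exactly; distance_var is a shared numeric
    variable; donor_only lists the variables to copy to the host.
    """
    matched = []
    for h in host:
        pool = [d for d in donor if all(d[v] == h[v] for v in class_vars)]
        if not pool:
            matched.append(dict(h))  # no donor in this class; leave as is
            continue
        best = min(pool, key=lambda d: abs(d[distance_var] - h[distance_var]))
        matched.append({**h, **{v: best[v] for v in donor_only}})
    return matched
```

Under the conditional independence assumption, the copied variables are related to the host-only variables solely through the variables used for classification and distance.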

Microsimulation Model This type of social welfare policy model operates at the individual decision unit level, which can be an individual or family for an income-support program model, a hospital for a medical care payment model, or a corporation for a corporate tax model. Microsimulation models essentially conduct program experiments (simulations) on large samples of microdata for individual decision units (see Beebout, 1986). In very general terms, the first step, which serves the same function as the control group for an experiment, is to prepare a baseline database representing the current situation, that is, the situation in the absence of a program change. The second step is to simulate the program change and its impact. The third step is to summarize the differences between the baseline and alternative program databases. Microsimulation models typically include routines to generate the database, routines to mimic the rules of government programs (accounting functions), and routines to produce tabulations of the simulation results (or output files for tabulations by another software package such as SAS). They may also include routines to simulate behavioral responses to proposed program changes. In simulating any type of behavior—whether demographic or economic behavior (such as marriage or job change) in aging a database using dynamic techniques, basic program participation behavior, or additional behavioral responses to program changes—microsimulation models are characterized by the use of probabilistic (Monte Carlo or stochastic) rather than deterministic techniques. For example, in implementing a program participation decision, the model draws a probability for each decision unit at random, compares that probability to an estimated participation probability for the particular type of unit, and, if the former probability is less (more) than the latter, designates the unit (not) to participate. See dynamic model and static model.
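
The participation decision described above can be sketched as a Monte Carlo draw per decision unit. The function name and the `participation_prob` argument are hypothetical; in practice the probability would come from an estimated participation equation.

```python
import random

def simulate_participation(units, participation_prob, seed=0):
    """Designate each unit to participate or not by a random draw.

    A unit participates when its uniform random draw is less than the
    estimated participation probability for that type of unit.
    """
    rng = random.Random(seed)
    return [
        {**u, "participates": rng.random() < participation_prob(u)}
        for u in units
    ]
```

Fixing the seed keeps the same draws across baseline and alternative runs, so that differences in results reflect the program change rather than random variation.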

Second-Round Effects See behavioral response.

Static Model This term refers to microsimulation models that operate on a database representing a cross-section of the population at a given time. Such models typically simulate the direct effects of policy changes, assuming full implementation of the program changes without any feedback effects due to behavioral responses; however, they can also simulate behavioral responses to program changes. Static models can also be used to generate estimates for future years, for which they use static aging techniques to generate a cross-sectional database representing the baseline program in the future year, subsequently using the aged database to conduct simulations of program alternatives. See Beebout (1980, 1986); Webb, Michel, and Bergsman (1990); see also aging.

Weighting This is the process of assigning weights (factors) to observations in a sample survey so that the weighted count of all observations will approximate

the total population. In order to take account of features of the sample design and to attempt to minimize bias and variance in the weighted estimates, the weights for household surveys such as the CPS or SIPP typically represent the product of several factors, including: a factor for the probability of selection (this factor is the inverse of the sampling fraction); adjustment factors for household nonresponse; adjustment factors to reduce the variance among primary sampling units; and adjustment factors so that the weighted counts approximate estimates of the total civilian, noninstitutionalized population by age, race, Hispanic origin, and sex. The last set of adjustment factors is developed from the previous decennial census updated by vital records. See, for example, Bureau of the Census (1989a); Jabine, King, and Petroni (1990).
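
The product-of-factors structure can be written out as a short sketch. The function name and the example factor values are hypothetical; actual survey weights involve additional steps and are computed by the survey organization.

```python
def final_weight(sampling_fraction, adjustments):
    """Multiply the base weight by each adjustment factor in turn.

    The base weight is the inverse of the sampling fraction; adjustments
    holds factors such as household nonresponse, variance reduction among
    primary sampling units, and alignment to population controls.
    """
    weight = 1.0 / sampling_fraction
    for factor in adjustments:
        weight *= factor
    return weight
```

For example, a household sampled at 1 in 1,000 with adjustment factors of 1.1, 1.0, and 0.95 would represent roughly 1,045 households in weighted tabulations.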
