Suggested Citation:"Merge File of the Office of Tax Analysis." National Research Council. 1991. Improving Information for Social Policy Decisions -- The Uses of Microsimulation Modeling: Volume II, Technical Papers. Washington, DC: The National Academies Press. doi: 10.17226/1853.
Page 69

STATISTICAL MATCHING AND MICROSIMULATION MODELS

with the results of the initial match. The dissatisfaction was primarily with estimates of numbers of recipients and aggregate amounts for several income types in the AF. The presence of several income types was given a larger role in the rematch in order to improve these estimates. Finally, Radner (1983:137) notes: “Because the EM sample…contains few high-income records…it was decided to add more AF records at $30,000 AGI (adjusted gross income) and above.” For the EM records, this match replaced the previous match.

This statistical match was followed by several instances of controlling to various totals from a variety of sources using a variety of techniques, including the addition of recipients for certain income types, such as transfer income. Other changes were the additional inflating of amounts and audit corrections. The usual CPS weights were not used; instead, the weights were adjusted for the census undercount and for consistency with administrative control totals.

The model that used the statistical match for input was not strictly a microsimulation model, since no program changes were simulated with this data set, just changes to income estimates for various subgroups. However, the adjustments used are typical of those used on microsimulation data sets and so are relevant to consider.

Merge File of the Office of Tax Analysis

The federal individual income tax form lacks certain types of information, including sources of income and types of expenditures not subject to tax under current law, links between taxpayers and families, and information on nonfilers. These limitations motivated the creation of the merge file in the Office of Tax Analysis of the Internal Revenue Service (IRS) (see Bristol [1988] and Cilke, Nelson, and Wyscarver [1988]; creation of the most recent merge file is described in detail in Chapter 8 of Volume I).
The merge file represents a statistical match of about 60,000 sample households from the March CPS with about 90,000 tax returns obtained from the Statistics of Income (SOI) sample of IRS records. Prior to the match, several variables are imputed to the SOI records, such as itemized deductions for nonitemizers and the share of earnings attributable to husbands and wives. The CPS data are also modified in several ways, including the correction of certain types of income for nonreporting and underreporting.

Before the statistical match is performed, the two files are “aligned” so that they represent the same universe. This process, a complicated form of reweighting, is not very successful because the CPS file has many low-income units and the SOI file is relatively rich in high-income units. For example (Bristol, 1988:118):

In the 1981 merge, 300 “returns” in the highest income class of the CPS
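
The general idea of aligning two files and then matching them can be illustrated with a minimal sketch. It is not the Office of Tax Analysis procedure, which the text describes only as a complicated form of reweighting. The record layout, the choice of income class as the alignment cell, the single matching variable, and all of the numbers are hypothetical assumptions made for illustration. The match shown is unconstrained, so a record in the second file may be used more than once.

```python
from dataclasses import dataclass


@dataclass
class Record:
    rec_id: str
    income_class: int  # hypothetical alignment cell
    income: float      # hypothetical matching variable present in both files
    weight: float      # sample weight, assumed positive


def class_totals(records):
    """Sum the weights within each income class."""
    totals = {}
    for r in records:
        totals[r.income_class] = totals.get(r.income_class, 0.0) + r.weight
    return totals


def align_weights(base, other):
    """Ratio-adjust the weights in `other` so that its weighted count in each
    income class equals that of `base`. Classes missing from `base` are left
    unchanged because no target is available for them."""
    base_totals = class_totals(base)
    other_totals = class_totals(other)
    aligned = []
    for r in other:
        target = base_totals.get(r.income_class)
        factor = 1.0 if target is None else target / other_totals[r.income_class]
        aligned.append(Record(r.rec_id, r.income_class, r.income, r.weight * factor))
    return aligned


def nearest_match(file_a, file_b):
    """For each record in file_a, pick the file_b record in the same income
    class whose income is closest (unconstrained nearest-neighbor match)."""
    matches = {}
    for a in file_a:
        pool = [b for b in file_b if b.income_class == a.income_class]
        if not pool:
            matches[a.rec_id] = None  # no candidate in this class
            continue
        best = min(pool, key=lambda b: abs(b.income - a.income))
        matches[a.rec_id] = best.rec_id
    return matches


# Hypothetical toy files standing in for a household survey and a tax-return sample.
survey = [
    Record("c1", 0, 12000.0, 1000.0),
    Record("c2", 0, 18000.0, 1500.0),
    Record("c3", 1, 45000.0, 500.0),
]
returns = [
    Record("s1", 0, 11000.0, 400.0),
    Record("s2", 0, 19500.0, 600.0),
    Record("s3", 1, 52000.0, 100.0),
    Record("s4", 1, 44000.0, 150.0),
]

aligned_returns = align_weights(survey, returns)
matches = nearest_match(survey, aligned_returns)
print(class_totals(aligned_returns))
print(matches)
```

In this toy case every income class has records in both files, so the ratio adjustment reproduces the first file's class totals exactly. The difficulty described in the text arises when one file has few or no records in a class where the other is rich. The adjustment factors then become extreme or undefined, which simple cell-based reweighting of this kind cannot repair.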
