National Academies Press: OpenBook
Suggested Citation:"The EM-AF Statistical Match." National Research Council. 1991. Improving Information for Social Policy Decisions -- The Uses of Microsimulation Modeling: Volume II, Technical Papers. Washington, DC: The National Academies Press. doi: 10.17226/1853.


STATISTICAL MATCHING AND MICROSIMULATION MODELS

for the common variables leads to reduced distortion in the joint distribution of (X, Z) on files created by matching. A related idea, proposed in Singh (1988), develops categorical variables X*, Y*, and Z*, related to X, Y, and Z, for which the conditional independence assumption is assumed to hold, and which are used to define equivalence classes for matching; for details, see Singh (1988).

EXAMPLES OF STATISTICAL MATCHES IN MICROSIMULATION MODELS

In this section I describe some applications of statistical matching, including the reasons for the match and the particular matching techniques used.

The EM-AF Statistical Match

It is well known that estimates of the distribution of family money income from household surveys contain serious bias. This bias can be reduced through the use of information from federal individual income tax returns. Radner (1983) describes a statistical match that begins with the March 1973 CPS-Internal Revenue Service-Social Security Administration exact match file (EM). This file was considered to have three limitations: (1) serious response errors in the CPS, (2) few high-income observations, and (3) not enough detail by income type. To address these limitations in the EM, it was statistically matched to the augmentation file (AF), a subsample of the 1972 Statistics of Income (SOI) sample of federal individual income tax returns that had been exact matched with Social Security Administration records containing earnings and demographic data.

The EM-AF statistical match can be separated into three fairly distinct steps. First, there was an initial match, using 22 matching variables that included adjusted gross income, interest, dividends, and social security taxable earnings, sex, race, age, number of exemptions, and the use of various schedules.
Certain of the characteristics were used to define cells within which distances between records were computed and outside of which no matches were permitted. These cells included an acceptable age range. The distance measure consisted of a sum of weighted discrepancies between the values for the 22 variables for the two files. The AF record that was closest to the EM record was chosen for the statistical match unless the minimal distance was greater than a specified maximum, in which case some cells were collapsed and the age range was eliminated.

Next, Radner (1983:137) describes:

About 6,900 EM records that were considered to have an inconsistent initial match were rematched with the AF because we were not fully satisfied
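
The initial match described above (cells that restrict which records may be paired, a weighted sum of discrepancies as the distance, and a fallback when the minimal distance exceeds a maximum) can be illustrated with a minimal sketch. It is not Radner's implementation. The `Record` type, the two numeric variables, the weights, the 5-year age range, the threshold, and the choice to keep only sex as a cell variable after collapsing are all hypothetical, and absolute differences are assumed as the discrepancy; the source states only that some cells were collapsed and the age range was eliminated.

```python
from dataclasses import dataclass


@dataclass
class Record:
    rec_id: str
    sex: str
    race: str
    age: int
    values: dict  # numeric matching variables (hypothetical subset of the 22)


def distance(em, af, weights):
    """Sum of weighted absolute discrepancies over the matching variables."""
    return sum(w * abs(em.values[k] - af.values[k]) for k, w in weights.items())


def same_cell(em, af, age_range, collapsed):
    """Cell membership test. The collapsing rule here is an assumption:
    after collapsing, only sex must agree and the age range is ignored."""
    if em.sex != af.sex:
        return False
    if collapsed:
        return True
    return em.race == af.race and abs(em.age - af.age) <= age_range


def nearest(em, af_file, weights, age_range, collapsed):
    """Closest AF record within the EM record's cell, or (None, inf)."""
    candidates = [af for af in af_file if same_cell(em, af, age_range, collapsed)]
    if not candidates:
        return None, float("inf")
    best = min(candidates, key=lambda af: distance(em, af, weights))
    return best, distance(em, best, weights)


def match(em, af_file, weights, age_range=5, max_distance=1000.0):
    """Match within cells; if the minimal distance exceeds the maximum,
    retry with collapsed cells and no age restriction."""
    best, d = nearest(em, af_file, weights, age_range, collapsed=False)
    if d > max_distance:
        best, d = nearest(em, af_file, weights, age_range, collapsed=True)
    return best, d


# Hypothetical data: one EM record and three AF candidates.
weights = {"agi": 1.0, "interest": 2.0}
em = Record("em1", "F", "W", 40, {"agi": 10000, "interest": 200})
af_file = [
    Record("af1", "F", "W", 42, {"agi": 10500, "interest": 150}),
    Record("af2", "F", "W", 60, {"agi": 10000, "interest": 200}),  # outside age range
    Record("af3", "M", "W", 40, {"agi": 10000, "interest": 200}),  # different sex
]
best, d = match(em, af_file, weights)
print(best.rec_id, d)
```

With the default threshold, only `af1` lies in the EM record's cell, so it is chosen. Lowering `max_distance` below that record's distance triggers the fallback, which makes `af2` eligible once the age range is dropped.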

This volume, second in the series, provides essential background material for policy analysts, researchers, statisticians, and others interested in the application of microsimulation techniques to develop estimates of the costs and population impacts of proposed changes in government policies ranging from welfare to retirement income to health care to taxes.

The material spans data inputs to models, design and computer implementation of models, validation of model outputs, and model documentation.
