Suggested Citation:"More Data Collection." National Research Council. 1991. Improving Information for Social Policy Decisions -- The Uses of Microsimulation Modeling: Volume II, Technical Papers. Washington, DC: The National Academies Press. doi: 10.17226/1853.

iterative proportional fitting may provide an alternative to statistical matching. Suppose that in the recent past a survey collected information on all needed variables, but more recent data collection efforts have updated only the marginal information about certain variables and not the information about their joint distribution. Iterative proportional fitting could then use the more recent marginal information to update the older information on the joint distribution. The procedure successively modifies the frequency counts in the relevant k-way contingency table, one dimension at a time, bringing the table's marginal totals into agreement with the newer marginal totals until a modified table with the updated marginals is obtained. Iterative proportional fitting therefore retains some of the joint distributional structure present in the original contingency table. (For a good reference on iterative proportional fitting, see Bishop, Fienberg, and Holland, 1975.)

There are at least two advantages of iterative proportional fitting in comparison with statistical matching of individual files from the newer surveys: the statistical match generally will require more computation; and the statistical match, as typically accomplished, will ignore the information about the joint distribution present in the older comprehensive survey. Paass (1988) presents a new algorithm that has advantages over iterative proportional fitting when the table has a large number of dimensions.

More Data Collection

In order to avoid the need to assume that Y and Z are conditionally independent given X, in some situations it may be possible to collect data on a small subset of individuals (a subset that is in some sense representative of the entire data set) and then directly estimate the amount of conditional dependence. Such estimates of conditional dependence could then be used to direct the statistical matching process.
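As a rough illustration of estimating conditional dependence from a small sample in which X, Y, and Z are all observed, the sketch below computes a residual covariance of Y and Z after removing a linear effect of a single scalar X. The linear model, the scalar X, the divisor n - 2, and the function name `conditional_covariance` are all assumptions made for this illustration; the text does not prescribe a particular estimator.

```python
def conditional_covariance(x, y, z):
    """Estimate Cov(Y, Z | X) from a small sample with all three variables.

    Assumption (not from the source): X is a single scalar variable and its
    effect on Y and Z is linear, so the conditional covariance is estimated
    by the covariance of least-squares residuals.
    """
    n = len(x)
    if not (n == len(y) == len(z)):
        raise ValueError("x, y, and z must have the same length")
    if n < 3:
        raise ValueError("at least three observations are needed")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    mean_z = sum(z) / n

    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must vary across the sample")

    # Least-squares slopes of Y on X and of Z on X.
    slope_y = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    slope_z = sum((a - mean_x) * (c - mean_z) for a, c in zip(x, z)) / sxx

    # Residuals: the part of Y and Z not explained linearly by X.
    resid_y = [b - mean_y - slope_y * (a - mean_x) for a, b in zip(x, y)]
    resid_z = [c - mean_z - slope_z * (a - mean_x) for a, c in zip(x, z)]

    # Divide by n - 2 because an intercept and a slope were fitted.
    return sum(p * q for p, q in zip(resid_y, resid_z)) / (n - 2)
```

A value near zero is consistent with the conditional independence assumption (under the linear model assumed here), while a clearly nonzero value suggests dependence between Y and Z that matching on X alone would not reproduce. With a sample of only a few hundred records, such an estimate will itself be quite variable.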
Suppose one collected data in a special survey of 500 individuals, a training data set, enabling the rough estimation of V(Y,Z). Then one would add the following (additional) constraints to the statistical match:

[equation not preserved in the source text]

where the left-hand term is computed from the small study and the right-hand term is a function of the two large samples. The computation of V(Y,Z|X) involves w_ij, the weight given to matching the ith record from file A to the jth record from file B. Clearly, this last constraint is considerably nonlinear in w_ij, which would greatly increase the computational complexity of the algorithm, both constrained and previously unconstrained. While this procedure has many advantages, including the ability to retain many of the benefits of the statistical match with respect to increased disclosure avoidance and reduced respondent burden, the variability of the estimate of


This volume, second in the series, provides essential background material for policy analysts, researchers, statisticians, and others interested in the application of microsimulation techniques to develop estimates of the costs and population impacts of proposed changes in government policies ranging from welfare to retirement income to health care to taxes.

The material spans data inputs to models, design and computer implementation of models, validation of model outputs, and model documentation.
