STATISTICAL MATCHING AND MICROSIMULATION MODELS 74

record with the larger weight and it was included as another potential record for matching. (For the precise algorithm used, see Springs and Beebout [1976].)

STATISTICAL MATCHING: ADVANTAGES AND PROBLEMS

The Advantages of Statistical Matching

The greatest advantage of statistical matching in comparison with other techniques (mentioned below) is probably the great flexibility it provides to data users. Just as imputation provides data users with a rectangular data file that can be input directly into most statistical software packages, statistical matching creates a file on which a variety of analyses, often unanticipated, can be performed. Thus, where one would otherwise use iterative proportional fitting for some purposes, covariance matrices for others, and so on, it seems easier simply to create a statistically matched file, especially when the analyses cannot be anticipated. If the conditional independence assumption is warranted, or is roughly valid, the creation of a statistically matched file is very convenient for most data users and should provide reasonable results. Statistical matching also allows considerable reduction in respondent burden and reduces the opportunity for data disclosure.

Problems Associated With Statistical Matching

Conditional Independence

As pointed out by Sims (1972), statistical matching assumes that Y and Z, given X, are independent. Records from the two files are matched or not matched on the basis of the values of X(A) and X(B). Therefore, the matched file contains no information about the relationship between Y and Z beyond what is explained by the relationships between X and Y and between X and Z. That is, the approach assumes that if one were to regress a Yi on X(A) and Z, and then regress Yi on X(A), the multiple correlations in the two regressions would be identical.
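The regression statement above can be illustrated with a small simulation. The following is a minimal sketch, not a procedure taken from the text: the coefficients, sample size, error correlation of 0.6, and the helper names `r_squared`, `partial_corr`, `simulate`, and `summarize` are all assumptions chosen for illustration, and X, Y, and Z are each a single normally distributed variable. When the errors in Y and Z are independent given X, adding Z to the regression of Y on X should leave the fit essentially unchanged and the partial correlation should be near zero; when the errors are correlated, both quantities should move away from those values.

```python
import numpy as np


def r_squared(y, predictors):
    """R^2 from an ordinary least squares fit of y on predictors plus an intercept."""
    design = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - resid.var() / y.var()


def partial_corr(y, z, x):
    """Correlation between y and z after linearly removing x from each."""
    design = np.column_stack([np.ones(len(x)), x])
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    res_z = z - design @ np.linalg.lstsq(design, z, rcond=None)[0]
    return np.corrcoef(res_y, res_z)[0, 1]


def simulate(error_corr, n=20_000, seed=0):
    """Draw X, then Y and Z as linear functions of X plus errors.

    error_corr = 0 means Y and Z are conditionally independent given X
    (illustrative coefficients 1.5 and -0.8 are assumptions).
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    cov = [[1.0, error_corr], [error_corr, 1.0]]
    errors = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    y = 1.5 * x + errors[:, 0]
    z = -0.8 * x + errors[:, 1]
    return x, y, z


def summarize(error_corr):
    """Compare the fit of Y on X with the fit of Y on X and Z."""
    x, y, z = simulate(error_corr)
    return {
        "r2_y_on_x": r_squared(y, x),
        "r2_y_on_x_and_z": r_squared(y, np.column_stack([x, z])),
        "partial_corr": partial_corr(y, z, x),
    }


independent = summarize(0.0)  # conditional independence holds
dependent = summarize(0.6)    # conditional independence is violated

print("independent errors:", independent)
print("correlated errors: ", dependent)
```

In the second case a statistically matched file would carry no trace of the extra association between Y and Z, which is the concern raised in this section.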
Technically speaking, the procedure assumes that Y conditioned on X and Z conditioned on X are independent, or that the partial correlation between a Yi given X(A) and a Zj given X(B) is equal to 0 (which are equivalent notions if one assumes multivariate normality). It is important at this point to consider the mathematical definition of conditional independence. The partial correlation between Yi and Zj conditioned on X is equal to