STATISTICAL MATCHING AND MICROSIMULATION MODELS 80

parameters, the weight that a triplet is assigned must equal the reciprocal of the probability of that triplet occurring. The triplet will appear with probability $w_{Ai}^{-1}$ from file A and with probability $w_{Bj}^{-1}$ from file B. Thus, the weight that this record should get is the inverse of its probability of occurring, which is $1/(w_{Ai}^{-1} + w_{Bj}^{-1})$. Using these weights assures that every estimate of the form will be an unbiased estimate. The weights $w_{ABj}$ do not necessarily add to n. Having the weights add to n may seem a desirable property, and in that case we define

The most important feature of Rubin's approach is multiple imputation. Multiple imputation is used to assess the variability of the inference or estimation with respect to the imputation process. The variability can be thought of as having two sources: variability due to the choice of imputation model, and variability due to imputation given the imputation model.

Variability due to imputation is addressed by identifying, for each record, the k data points whose fitted values are nearest to that record's fitted value as potential imputations, rather than simply taking the single closest one. Then, to create a number of imputed files, one randomly chooses one of the k to match to each record. The variability due to imputation is then measured by using each concatenated file in turn for the analysis and comparing the results.

Variability with respect to the imputation model used, here discussed as some sort of regression model, can also be weakly assessed through a type of sensitivity analysis. An essential example of this is the assumption that the partial correlation between Y and Z given X is equal to 0. One could begin by performing several imputations with the assumption that $\rho_{YZ \cdot X}$ equals 0. In addition, one could assume that $\rho_{YZ \cdot X}$ is equal to, say, 0.5. Then, rather than regress Y on X and Z on X to determine the nearest-to-the-fitted values, one could regress Y on X and Z, and Z on X and Y, since now the entire covariance matrix of Y, X, Z is specified.
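The k-nearest-to-the-fitted-values step described above can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not Rubin's implementation: the data are synthetic, the imputation model is an ordinary least-squares regression of Y on X (the case in which $\rho_{YZ \cdot X}$ is taken to be 0), and the function names (`fit_linear`, `predict`, `impute_from_k_nearest`), the choice k = 3, and the five imputations are all hypothetical choices made for the example.

```python
import numpy as np


def fit_linear(x, y):
    """Least-squares regression of y on x with an intercept; returns coefficients."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, x):
    """Fitted values from the coefficients returned by fit_linear."""
    design = np.column_stack([np.ones(len(x)), x])
    return design @ coef


def impute_from_k_nearest(x_donor, y_donor, x_recipient, k, n_imputations, rng):
    """Create several imputed copies of Y for the recipient file.

    For each recipient record, the k donor records whose fitted values are
    closest to the recipient's fitted value are the candidate matches, and
    each imputation draws one of those k candidates at random.
    """
    coef = fit_linear(x_donor, y_donor)
    fitted_donor = predict(coef, x_donor)
    fitted_recipient = predict(coef, x_recipient)

    # Absolute distance between every recipient and every donor fitted value.
    dist = np.abs(fitted_recipient[:, None] - fitted_donor[None, :])
    # Column indices of the k nearest donors for each recipient (unordered).
    nearest = np.argpartition(dist, k - 1, axis=1)[:, :k]

    rows = np.arange(len(x_recipient))
    imputations = []
    for _ in range(n_imputations):
        pick = rng.integers(0, k, size=len(x_recipient))
        imputations.append(y_donor[nearest[rows, pick]])
    return np.array(imputations), nearest


# Synthetic stand-ins: file A carries (X, Y); file B carries X but lacks Y.
rng = np.random.default_rng(0)
x_a = rng.normal(size=60)
y_a = 2.0 + 1.5 * x_a + rng.normal(scale=0.5, size=60)
x_b = rng.normal(size=40)

imputed, nearest = impute_from_k_nearest(x_a, y_a, x_b, k=3, n_imputations=5, rng=rng)

# One estimate per imputed file; their spread reflects variability due to imputation.
estimates = imputed.mean(axis=1)
between_imputation_variance = estimates.var(ddof=1)
print(estimates, between_imputation_variance)
```

Each row of `imputed` plays the role of one imputed file. Repeating the analysis on every row and comparing the results is the sense in which the variability due to imputation is measured. Rerunning the sketch with a regression that reflects a different assumed partial correlation would give the second set of imputations used in the model-sensitivity comparison.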
Then several imputations could again be performed with this new assumption. The variance due to model selection could then be assessed by comparing the results to those obtained when the assumption that $\rho_{YZ \cdot X}$ equals 0 is made.

Rough Sensitivity Analysis

There is a very close relative to Rubin's procedure that has the advantage of some computational simplicity. This procedure could be used to shed some light on the sensitivity of the analysis to the failure of the conditional independence assumption. The discussion focuses on the case of unconstrained statistical