A
Statistical Details

MIXTURE MODELS

Unobserved population heterogeneity, or extra-population heterogeneity, can be modeled naturally with mixture models. The simplest and most natural occurrence of the mixture model arises when one samples from a superpopulation $g$ that is a mixture of a finite number, say $m$, of populations $(g_1, \ldots, g_m)$, called the components of the population. Suppose a sample from the superpopulation $g$ is recorded as data $(Y_i, J_i)$ for $i = 1, \ldots, n$, where $Y_i = y_i$ is a measurement on the $i$th sampled unit and $J_i = j_i$ is the index of the component to which the unit belongs. If the unit is sampled from the $j$th component, an appropriate probability model for the sampling distribution is given by

$$P(Y_i = y_i \mid J_i = j) = f_j(y_i; \theta_j).$$

The function $f_j(y_i; \theta_j)$ is a density function, called the component density, for the $i$th observation $y_i$. The parameter $\theta_j \in \Theta_j \subset \mathbb{R}^{d_j}$, called the component parameter, describes the specific attributes of the $j$th component population, $g_j$.
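The two-stage sampling scheme described above can be illustrated with a short simulation. The following is only a minimal sketch under stated assumptions: the component densities are taken to be normal, so each $\theta_j$ is a (mean, standard deviation) pair, and the proportions with which components are sampled are supplied by hand. The function name `sample_mixture`, the weights, and the parameter values are hypothetical and are not taken from the text.

```python
import random

def sample_mixture(n, weights, components, seed=0):
    """Draw n pairs (y_i, j_i) from a finite mixture.

    weights    -- sampling proportion of each component (assumed known here)
    components -- list of (mean, sd) pairs; normal densities are an assumption
    """
    rng = random.Random(seed)
    labels = list(range(1, len(components) + 1))  # component indices 1..m
    data = []
    for _ in range(n):
        # Stage 1: draw the component label J_i = j.
        j = rng.choices(labels, weights=weights, k=1)[0]
        # Stage 2: draw Y_i from the jth component density f_j(.; theta_j).
        mean, sd = components[j - 1]
        y = rng.gauss(mean, sd)
        data.append((y, j))
    return data

# Hypothetical two-component superpopulation.
weights = [0.3, 0.7]
components = [(0.0, 1.0), (5.0, 0.5)]
data = sample_mixture(1000, weights, components)
```

Each element of `data` is a pair $(y_i, j_i)$. Discarding the second entry of every pair mimics the situation in which the component labels are not observed.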

We treat the $J_i$ as missing data and define the latent random variable $\Phi = (\Phi_1, \ldots, \Phi_n)$ to be the values of the parameters $\{\theta_1, \ldots, \theta_m\}$ corresponding to the sampled components $J_1, \ldots, J_n$; i.e., if the $i$th observation came from the $j$th component, then define $\Phi_i = \theta_j$. Then the $\Phi_i$ are a random sample from the discrete probability measure $Q$ that assigns a positive probability $\pi_j$ to the support point $\theta_j$ for $j = 1, \ldots, m$. That is, the latent



Copyright © National Academy of Sciences. All rights reserved.