"6 Opportunities for Methodologic Advances in Data Analysis." Environmental Epidemiology, Volume 2: Use of the Gray Literature and Other Data in Environmental Epidemiology. Washington, DC: The National Academies Press, 1997.
that have shown similar short-term reductions in lung function in response to ozone exposure in real-world situations. These data have come from studies of children in summer camps (Spektor et al., 1988; Lioy et al., 1985) and from a study of schoolchildren (Kinney et al., 1989). A reanalysis of the data from schools and the study of campers by Spektor et al. (1988) showed highly significant heterogeneity of response to ozone (Brunekreef et al., 1991). Short-term exposure to total suspended particulates (TSP) was also associated with short-term reductions in lung function (Dockery et al., 1982; Dassen et al., 1986), but, in contrast to ozone, without evidence of heterogeneity of response (Brunekreef et al., 1991). The analytic method determined whether the variation in regression coefficients across subjects was greater than would be expected by chance, given the standard errors of the subject-specific coefficients.
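One common way to carry out a check of this kind is a precision-weighted chi-square homogeneity statistic. It is offered here only as an illustration and is not necessarily the procedure used by Brunekreef et al. (1991). The sketch below assumes that subject-specific coefficients and their standard errors are already in hand; the function name and all numbers are hypothetical.

```python
def heterogeneity_q(coefs, std_errs):
    """Return (pooled coefficient, Q statistic, degrees of freedom).

    Q sums the squared deviations of each subject's coefficient from the
    precision-weighted mean, each scaled by that subject's sampling variance.
    """
    if len(coefs) != len(std_errs) or len(coefs) < 2:
        raise ValueError("need at least two coefficients with matching standard errors")
    weights = [1.0 / se ** 2 for se in std_errs]  # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, coefs)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, coefs))
    return pooled, q, len(coefs) - 1


# Hypothetical subject-specific slopes (change in lung function per unit
# of pollutant) and their standard errors; not data from any cited study.
coefs = [-0.5, -0.1, -0.9, -0.3]
std_errs = [0.1, 0.1, 0.1, 0.1]

pooled, q, df = heterogeneity_q(coefs, std_errs)
print(f"pooled = {pooled:.3f}, Q = {q:.1f}, df = {df}")
```

If all subjects shared one true coefficient, Q would be approximately chi-square distributed with k - 1 degrees of freedom for k subjects. In this made-up example Q is about 35 on 3 degrees of freedom, far above the 5% critical value of about 7.81, so the spread among coefficients would be judged greater than chance.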
Similar results were noted in a study of respiratory symptoms. A panel study of asthmatics (Whittemore and Korn, 1980) found an association of exposure to TSP and ozone with increased respiratory symptoms, but no evidence of greater-than-random variability among the subject-specific TSP regression coefficients. In contrast, the ozone coefficients showed clear signs of heterogeneity of response.
Heterogeneity not only indicates sensitive subgroups; it also affects estimation of the standard errors of regression coefficients and, if ignored, leads to improper hypothesis tests for pollution variables. Korn and Whittemore (1979) proposed a two-stage method to address this issue. In the above-cited panel study of symptoms in asthmatics, they assumed that each subject's sequence of daily binary responses (symptom present or absent) followed a logistic model. However, instead of sharing a common regression coefficient for air pollution, each subject was assumed to have a possibly unique regression coefficient. Although conceptually attractive, this approach requires sufficient data for asymptotic normality assumptions to hold. In fact, for consistency and asymptotic normality of the estimates, both the number of subjects and the number of days need to be large. Improvements to the Korn-Whittemore approach were given by Anderson and Aitkin (1985).
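The two-stage idea can be sketched roughly as follows. This is a simplified illustration, not the estimator of Korn and Whittemore (1979). Stage 1 fits a separate logistic regression to each subject's daily responses. Stage 2, an assumption made here for brevity, summarizes the subject-specific slopes by their unweighted mean and uses the empirical spread among subjects for the standard error. The data are simulated, and every name and parameter value is hypothetical.

```python
import math
import random
import statistics


def fit_logistic(x, y, iters=25):
    """Newton-Raphson fit of logit P(y=1) = a + b*x; returns (a, b, se_b)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p          # score for the intercept
            g1 += (yi - p) * xi   # score for the slope
            h00 += w              # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    # Slope variance from the inverse information at the last evaluated
    # point, which is effectively the converged estimate.
    return a, b, math.sqrt(h00 / det)


random.seed(0)
n_subjects, n_days = 30, 200  # both need to be reasonably large

# Stage 1: one logistic fit per subject (simulated panel data).
slopes, slope_ses = [], []
for _ in range(n_subjects):
    true_b = random.gauss(1.0, 0.5)  # subject-specific sensitivity (assumed)
    x = [random.random() for _ in range(n_days)]  # scaled daily exposure
    y = [1 if random.random() < 1.0 / (1.0 + math.exp(-(-1.0 + true_b * xi))) else 0
         for xi in x]
    _, b_hat, se_b = fit_logistic(x, y)
    slopes.append(b_hat)
    slope_ses.append(se_b)

# Stage 2: treat the fitted slopes as draws from a population of subjects.
mean_b = statistics.mean(slopes)
se_mean = statistics.stdev(slopes) / math.sqrt(n_subjects)
print(f"mean slope = {mean_b:.2f}, standard error = {se_mean:.2f}")
```

Because the stage 2 standard error is built from the observed variation among subjects, it reflects both sampling error within subjects and true heterogeneity between them. With few days per subject the stage 1 fits become unstable, which is the data requirement noted above.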
Population heterogeneity is an important statistical issue, even when there is no heterogeneity in response to the pollutant of interest. For logistic and Poisson models, a fixed relation between the mean and the variance of the distribution is generally assumed. There may be factors that alter the variation in the outcome and produce overdispersion or underdispersion, that is, more or less variance than the model implies. Either of these tendencies will result in incorrect standard-error estimates for regression coefficients, altering the probabilities of both type I and type II errors. McCullagh and Nelder (1983) discuss methods for estimating the overdispersion parameter in generalized linear models, which include the logistic and Poisson regression settings.
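One widely used estimate of the dispersion parameter in a generalized linear model is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below applies it to a Poisson fit, where the variance is assumed equal to the mean. The counts are invented for illustration, and the fitted means come from an intercept-only model, so each equals the sample mean.

```python
import math


def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square / residual df for a Poisson model (variance = mean)."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)


# Hypothetical daily counts; the intercept-only Poisson fit gives a
# fitted mean of 24 / 6 = 4 for every day.
y = [0, 4, 1, 9, 2, 8]
mu = [sum(y) / len(y)] * len(y)

phi = pearson_dispersion(y, mu, n_params=1)
se_inflation = math.sqrt(phi)
print(f"dispersion = {phi:.2f}, standard-error inflation = {se_inflation:.2f}")
```

A value near 1 is consistent with the assumed mean-variance relation. Here the estimate is 3.5, so standard errors taken directly from the Poisson fit would be too small by a factor of about 1.87 (the square root of 3.5); multiplying them by that factor is the usual quasi-likelihood correction.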