6 Opportunities for Methodologic Advances in Data Analysis
Pages 130-153


From page 130...
... These methodologic issues have motivated the development of new study designs and analytic methods. Valid characterization of the effects of environmental agents and assessment of dose-response relationships often require the application of multivariate statistical models to control for confounding and to evaluate interdependence of effects.
From page 131...
... In estimating the relative risk of disease associated with a particular environmental agent, the researcher may need to contend with multiple continuous and discrete variables, including the exposure of interest. For example, such demanding data are encountered frequently in studying effects of environmental agents on respiratory health.
From page 132...
... and linear-regression models generally assume that the outcome varies linearly with functions of risk factors, that the individual observations are statistically independent, and that random differences from the model all have the same distribution, although models are available that relax each of these assumptions. For example, if the outcome seems to be approximately lognormally distributed, the investigator may assume that the natural logarithm of the outcome varies (approximately)
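The log-transformation idea in the excerpt above can be illustrated with a minimal sketch. It assumes the truncated sentence refers to fitting an ordinary least-squares line to the natural logarithm of the outcome; the function name `fit_log_linear` and the exposure and outcome values are hypothetical and do not come from the chapter.

```python
import math

def fit_log_linear(x, y):
    """Ordinary least squares of ln(y) on a single predictor x.

    Returns (intercept, slope) on the log scale. Assumes every y is positive.
    """
    log_y = [math.log(v) for v in y]
    n = len(x)
    mean_x = sum(x) / n
    mean_log_y = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_log_y) for xi, li in zip(x, log_y))
    slope = sxy / sxx
    intercept = mean_log_y - slope * mean_x
    return intercept, slope

# Hypothetical, noise-free data generated as outcome = exp(1.0 + 0.5 * exposure).
exposure = [0.0, 1.0, 2.0, 3.0, 4.0]
outcome = [math.exp(1.0 + 0.5 * x) for x in exposure]

intercept, slope = fit_log_linear(exposure, outcome)
# exp(slope) is the multiplicative change in the outcome per unit of exposure.
ratio_per_unit = math.exp(slope)
```

On the log scale the slope is additive; back-transforming with `exp` gives a ratio, which is often the quantity of interest when the outcome is roughly lognormal.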
From page 133...
... Confidence bounds and significance tests for the effects of environmental exposures may be biased by incorrect application of these or other models. ANALYSIS OF CORRELATED DATA Measurements related in time and/or space, such as repeated measurements of the same population at successive times or measurements of persons from nearby geographic areas, are likely to be correlated, and their error terms may not be independent.
From page 134...
... LONGITUDINAL DATA ANALYSIS AND SERIAL CORRELATION Longitudinal studies of the association between temporal variations in pollution and health outcomes have been useful in studying the health effects of outdoor air pollution. This design may also be informative in other areas of environmental epidemiology.
From page 135...
... The presence of a lagged dependent variable in a model with serial correlation in the errors is unattractive because the correlation between the predictor variable and the error term means that usual least-squares regression estimates are biased and inconsistent. In these circumstances, the lagged dependent variable can be "instrumented." Instrumentation is the process of fitting a predictive model to a variable, using all possible predictors (except the hypothesis variable)
From page 136...
... A follow-up study in nonsymptomatic children has found similar results (Pope and Dockery, 1992). Poisson models for mortality counts, hospital admissions, or emergency-room visits may also exhibit serial correlation, as may daily diaries of binary outcomes, such as the presence or absence of coughing or wheeze; these outcomes can be modeled with logistic or probit regressions.
From page 137...
... The period of measurement distinguishes between this correlation structure and the serial correlation described previously. If lung function were measured daily, there would undoubtedly be serial correlation in such data, in that each day's measurement is correlated with those of the preceding and following days.
From page 138...
... When some added random variability is associated with unknown or unmodeled factors, a hierarchic formulation may be used to create a more flexible model. The hierarchy assigns additional levels of random variability to the unknown parameters (such as underlying mortality rates or response probabilities)
From page 139...
... and are known as "parametric empirical Bayes." Kass and Steffey (1989) refer to such a structure as a conditionally independent hierarchic model.
From page 140...
... McCullagh and Nelder (1983) discuss methods for estimating the overdispersion parameter in generalized linear models that include the logistic and Poisson regression settings.
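One widely used moment estimator of the overdispersion parameter divides the Pearson chi-square statistic by the residual degrees of freedom. The sketch below applies it to a Poisson model. It is an illustration under stated assumptions, not a reproduction of the chapter's method: the function name `pearson_dispersion`, the counts, and the intercept-only model used to produce fitted means are all hypothetical.

```python
def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    For a Poisson model the variance function is V(mu) = mu, so each
    squared residual is scaled by its fitted mean.
    """
    resid_df = len(counts) - n_params
    if resid_df <= 0:
        raise ValueError("need more observations than fitted parameters")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi2 / resid_df

# Hypothetical daily event counts.
daily_counts = [2, 9, 4, 12, 3, 10]

# Intercept-only Poisson model: the fitted mean is the sample mean
# (one estimated parameter).
mean_count = sum(daily_counts) / len(daily_counts)
fitted = [mean_count] * len(daily_counts)

phi = pearson_dispersion(daily_counts, fitted, n_params=1)
```

A value of `phi` near 1 is consistent with Poisson variation; values well above 1 suggest overdispersion, in which case model-based standard errors are commonly inflated by the square root of `phi`.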
From page 141...
... For example, elevations of the marker by other risk factors might increase the risk of coughing, but inflammatory responses to coughing might increase the level of the marker. Bootstrapping A statistical-computational innovation that has application in epidemiology is the use of resampling (Efron, 1982).
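The resampling idea mentioned above can be sketched as a percentile bootstrap confidence interval. This is a minimal illustration of one bootstrap variant, not necessarily the one the chapter goes on to describe; the function name `bootstrap_percentile_ci`, its parameters, and the measurements are hypothetical.

```python
import random
import statistics

def bootstrap_percentile_ci(data, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(data).

    Draws n_resamples samples of len(data) with replacement, evaluates the
    statistic on each, and reads off empirical quantiles of the replicates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return replicates[lo_idx], replicates[hi_idx]

# Hypothetical lung-function measurements (litres).
measurements = [2.8, 3.1, 2.9, 3.4, 3.0, 2.7, 3.3, 3.2, 2.6, 3.5]

lower, upper = bootstrap_percentile_ci(measurements, statistics.mean)
sample_mean = statistics.mean(measurements)
```

The same function accepts any statistic (for example `statistics.median`), which is the main appeal of resampling when no convenient analytic standard error exists. It assumes independent observations; correlated data call for resampling whole clusters or blocks instead.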
From page 142...
... Particular success has been achieved for various forms of regression models (Faraway, 1990; Huet et al., 1990), including generalized linear models (Mapleson, 1986; Rothe, 1989; Moulton and Zeger, 1991).
From page 143...
... represents an alternative approach to nonparametric regression. This approach is equally valid for logistic and Poisson models and, indeed, for the entire family of generalized linear models.
From page 144...
... Generalized additive models can be used for hypothesis-testing and model selection. Alternatively, standard regression techniques can be used for model selection, and then the generalized additive model can be applied to the significant variables.
From page 145...
... This section describes 2 methods that are helpful in improving exposure estimates. KRIGING Exposure data in environmental-epidemiologic studies are often sparse, irregularly sampled, and not based on measurements coincident with subject locations.
From page 146...
... More formally, kriging is linear estimation that minimizes the mean square prediction error subject to unbiasedness conditions. Before using kriging, one typically conducts a structural analysis of the data to identify and remove trends and outliers and to estimate quantitatively the spatial structure or spatial covariance of the observed data (i.e., how the correlations between observations depend on the distance between them)
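The definition above (a linear predictor with minimum mean square error under an unbiasedness constraint) corresponds, in the simplest case, to ordinary kriging: the weights solve a linear system built from the spatial covariance, with a Lagrange multiplier forcing them to sum to 1. The sketch below assumes an exponential covariance with known parameters and no nugget; in practice these would come from the structural analysis. All names (`exp_cov`, `solve`, `ordinary_kriging`), the monitor locations, and the values are hypothetical.

```python
import math

def exp_cov(h, sill=1.0, corr_range=2.0):
    """Assumed exponential covariance as a function of distance h."""
    return sill * math.exp(-h / corr_range)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(sites, values, target, cov=exp_cov):
    """Return (prediction, weights) at target from observed sites."""
    n = len(sites)
    # Covariances among sites, bordered by ones for the constraint sum(w) = 1.
    a = [[cov(math.dist(sites[i], sites[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [cov(math.dist(s, target)) for s in sites] + [1.0]
    solution = solve(a, b)
    weights = solution[:n]  # last entry is the Lagrange multiplier
    prediction = sum(w * z for w, z in zip(weights, values))
    return prediction, weights

# Hypothetical monitor coordinates (km) and pollutant concentrations.
monitors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
concentrations = [10.0, 12.0, 11.0, 15.0]

pred, w = ordinary_kriging(monitors, concentrations, target=(0.5, 0.5))
pred_at_monitor, _ = ordinary_kriging(monitors, concentrations, target=monitors[0])
```

Because the sketch has no nugget term, the predictor reproduces the observed value at a monitored location. Adding a nugget to the diagonal of the site covariance would smooth rather than interpolate.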
From page 147...
... The regression coefficients from these approaches cannot be applied directly to exposure data from, e.g., central monitoring sites, to forecast effects. Thus, while these methods generally increase the power to detect an effect in the epidemiology study, they may complicate risk assessment.
From page 148...
... An understanding of the consequences of measurement error is particularly relevant to the quantitative estimation of the risk of disease associated with environmental agents for the purpose of policy development. Quantitative risk assessments may use exposure-response relationships derived from epidemiologic studies;
From page 149...
... Improvements in statistical analyses of both multiple exposures and multiple diseases or outcomes will enhance the role of environmental epidemiology in addressing small relative risks. Studies of environmental agents are likely to focus increasingly on multifactorial outcomes for which the exposure of interest accounts for a relatively small proportion of the variation in outcome.
From page 150...
... 1987. Empirical Bayes estimates of age-standardized relative risks for use in disease mapping.
From page 151...
... 1989. Approximate Bayesian inference in conditionally independent hierarchical models (parametric empirical Bayes models)
From page 152...
... 1991. Estimation in generalized linear models with random effects.
From page 153...
... 1991. Generalized linear models with random effects: a Gibbs sampling approach.

