produce an underestimation or an overestimation of the magnitude of an association. Epidemiologic studies are subject to a variety of biases, and the primary challenge is to design a study as free of bias as possible. Bias may occur even if the utmost care is taken in designing a study. Researchers aware of the potential for bias can take steps to control it in the statistical analysis or interpretation of observed results. There are three main types of bias in epidemiologic research: selection bias, information (or misclassification) bias, and bias due to confounding.
Selection bias can occur in the recruitment of study subjects to a cohort. For example, in a retrospective cohort study, when the exposed and unexposed groups are selected differentially on outcome, the assembled cohort can differ from the target population with respect to the association between exposure to the agent under study and disease outcome. In cohort studies of industrial populations, selection bias can operate at the time of entry or during followup. Furthermore, comparison of rates in the general population with rates in occupationally exposed cohorts may be subject to what is called the healthy-worker effect. This arises when a population that is healthy enough to be employed experiences lower mortality than the general population, which comprises healthy and unhealthy people. The healthy-worker effect can result in an underestimation of the strength of an association between exposure to an agent and some effect or outcome because the populations being compared do not have similar levels of health. To counter the influence of the healthy-worker effect, some investigators divide worker populations into groups based on their levels of exposure to the agent being studied. Comparisons are then made within the cohort, thereby minimizing the healthy-worker effect.
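The arithmetic behind the healthy-worker effect can be illustrated with a minimal sketch. All counts below are hypothetical and chosen only for illustration, and the comparison uses crude mortality rates, whereas a real analysis would standardize for age and other factors. The sketch assumes that exposure raises worker mortality by 30%, while unexposed workers die at a lower rate than the general population.

```python
def rate_per_1000(deaths, person_years):
    """Crude mortality rate per 1,000 person-years."""
    return 1000 * deaths / person_years


# Hypothetical data, not taken from any study.
general_rate = rate_per_1000(deaths=10_000, person_years=1_000_000)  # general population
low_exposure_rate = rate_per_1000(deaths=350, person_years=50_000)   # workers, low exposure
high_exposure_rate = rate_per_1000(deaths=455, person_years=50_000)  # workers, high exposure

# External comparison: highly exposed workers versus the general population.
external_ratio = high_exposure_rate / general_rate

# Internal comparison: highly exposed versus less exposed workers in the same cohort.
internal_ratio = high_exposure_rate / low_exposure_rate

print(f"external rate ratio: {external_ratio:.2f}")
print(f"internal rate ratio: {internal_ratio:.2f}")
```

Under these assumed numbers, the external comparison gives a ratio below 1, which would suggest no excess mortality, whereas the comparison within the cohort recovers the assumed 30% increase.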
Selection bias is of particular concern in case-control studies because the selection of study subjects is based on disease status. If exposed cases were more likely to agree to participate than unexposed cases, the study would overestimate the association in question, unless the same selection is also present among the controls. In case-control studies, strategies to reduce the effect of selection bias generally concern the choice of control group.
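A small worked example may help show why differential participation distorts a case-control estimate. The sketch below uses a hypothetical source population and hypothetical participation probabilities, and the function names are illustrative only. It computes the odds ratio that would be expected among participants under two scenarios.

```python
def odds_ratio(cells):
    """Odds ratio from the four cells of an exposure-by-disease table."""
    return (cells["exposed_cases"] * cells["unexposed_controls"]) / (
        cells["unexposed_cases"] * cells["exposed_controls"]
    )


def participants(source, participation):
    """Expected cell counts after applying per-cell participation probabilities."""
    return {cell: source[cell] * participation[cell] for cell in source}


# Hypothetical source population; its true odds ratio is 2.0.
source = {
    "exposed_cases": 200,
    "unexposed_cases": 400,
    "exposed_controls": 100,
    "unexposed_controls": 400,
}

# Scenario A (assumed): only exposed cases are more willing to participate.
only_exposed_cases = {
    "exposed_cases": 0.9,
    "unexposed_cases": 0.6,
    "exposed_controls": 0.6,
    "unexposed_controls": 0.6,
}

# Scenario B (assumed): exposed people participate more among cases and controls alike.
exposed_in_both_groups = {
    "exposed_cases": 0.9,
    "unexposed_cases": 0.6,
    "exposed_controls": 0.9,
    "unexposed_controls": 0.6,
}

true_or = odds_ratio(source)
biased_or = odds_ratio(participants(source, only_exposed_cases))
balanced_or = odds_ratio(participants(source, exposed_in_both_groups))

print(f"true: {true_or:.2f}  scenario A: {biased_or:.2f}  scenario B: {balanced_or:.2f}")
```

With these assumed values, scenario A inflates the odds ratio from 2.0 to about 3.0, while in scenario B the same selection operates in cases and controls and the odds ratio stays at about 2.0.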
Selection bias can also influence the results of a study through the pattern of missing data. If data are missing (because of a lack of response by subjects or because of errors in coding) in a way that is related to both the exposure of interest and the outcome, selection bias can occur.
Information bias (also known as misclassification bias) can occur when there are errors in data-collection methods, such as imprecise measurement of exposure or outcome. This is of concern, for example, when unverified death certificates are used as a source of information on causes of death; inaccurate coding can introduce error of unknown magnitude. The error may be uniform across the entire study population, in which case it tends to reduce the apparent magnitude of associations (“bias toward the null”), or it may differ between study groups. Information bias in relation to the outcome can result from incorrect assignment of study subjects with respect to the outcome variable: a case may be misclassified as a control, or a cohort member who develops disease during followup may go undiagnosed, or vice versa. Such errors in classification of disease status are of