Background for Epidemiologic Methods
Epidemiology is the study of the distribution and determinants of disease prevalence in man (MacMahon and others 1960). Epidemiologists seek to describe the populations at risk and to discover the causes of diseases. This entails quantification of the risk of disease and its relationship to known or suspected causal factors. In radiation epidemiology, exposure to radiation is the factor of primary interest, and epidemiologists seek to relate risk of disease (primarily cancer) to different levels and patterns of radiation exposure. Epidemiologic studies have been of particular importance in assessing the potential human health risks associated with radiation exposure.1
As part of the study of the causes of disease, epidemiologists measure factors that are suspected of leading to its development. A basic comparison used in radiation epidemiology is to measure the rate of a specific disease among persons who have been exposed to radiation and among persons who have not. The two rates are compared to assess whether they are similar or are different. A logical extension of this basic mode of comparison is to stratify the exposed subjects on the basis of amount (dose) of radiation in order to assess whether disease rates vary with dose, that is, whether there is a dose-response relationship.
If the rates of a disease are essentially the same in the exposed and unexposed groups, there is said to be no association between radiation exposure and disease. This does not necessarily mean that in all populations at all times, radiation is not related to the disease, but it does mean that in this population at this time, sufficient evidence does not exist for an association between radiation and disease. If the disease rate is higher among those exposed to radiation, there is a positive association. If the disease rate is higher among the unexposed group, there is a negative (inverse) association between radiation exposure and disease.
Epidemiologists use the term “risk” in two different ways to describe the associations that are noted in data. Relative risk is the rate of disease among a group having some risk factor, such as radiation exposure, divided by the rate among a group not having that factor. Relative risk has no units (e.g., 75 deaths per 100,000 population per year ÷ 25 deaths per 100,000 per year = 3.0). Excess relative risk (ERR) is the relative risk minus 1.0 (e.g., 3.0 − 1.0 = 2.0). Absolute risk is the simple rate of disease among a population (e.g., 75 per 100,000 population per year among the exposed or 25 per 100,000 per year among the nonexposed). Absolute risk has the units of the rates being compared. Excess absolute risk (EAR) is the difference between two absolute risks (e.g., (75 per 100,000 per year) − (25 per 100,000 per year) = 50 per 100,000 per year). If the rates of disease differ in the exposed and unexposed groups, there is said to be an association between exposure and disease. None of these measures of risk is sufficient to infer causation. A second step in data analysis is necessary to assess whether or not the risk factor is simply a covariate of a more likely cause.
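The arithmetic in these examples can be sketched in a few lines of Python (the rates are the hypothetical ones quoted above, per 100,000 population per year):

```python
# Rates from the example above, per 100,000 population per year
rate_exposed = 75.0
rate_unexposed = 25.0

relative_risk = rate_exposed / rate_unexposed         # dimensionless
excess_relative_risk = relative_risk - 1.0            # ERR
excess_absolute_risk = rate_exposed - rate_unexposed  # EAR, per 100,000 per year

print(relative_risk, excess_relative_risk, excess_absolute_risk)
# 3.0 2.0 50.0
```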
In modeling the relation between radiation exposure and disease, either the ERR or the EAR may be used. In addition, the estimated dose of radiation exposure is integrated into the models, so that estimation is made of the ERR or EAR as a function of dose. Relative risk and ERR have certain mathematical and statistical advantages and may be easier to understand for small risks, but absolute risk and EAR are more closely related to the burden of disease and to its impact on the population. Thus, each type of measure has its advantages, and each is used in this report.
Having assessed whether or not there is evidence of an association between radiation exposure and a disease in the population of interest, the next task of the epidemiologist is to assess whether noncausal factors may have contributed to the association. An association might not represent a causal link between radiation and disease, but rather could be due to chance, bias, or error. It should be noted that chance can never be ruled out as one possible explanation for an association that is observed in epidemiologic data, although the probability may be extremely small.

1See Glossary for definition of specific epidemiologic terms.
Having judged that an association in a population under study cannot be demonstrated to have occurred because of error or bias, an investigator computes a measure of association that takes into account any relevant differences between the exposed and the unexposed groups. It is also usual to quantify the uncertainty in a measured association by calculating an interval of possible values for the true measure of association. This confidence interval describes the range of values most likely to include the true measure of association if the statistical model is correct. It is always possible that the true association lies outside the confidence interval, either because the model is incomplete or otherwise in error or because a rare event has occurred (with rare defined by the probability level, commonly 5%).
Another step in assessing whether radiation exposure may be the cause of some disease is to compare the results of a number of studies that have been conducted on populations that have been exposed to radiation. If a general pattern of a positive association between radiation exposure and a disease can be demonstrated in several populations and if these associations are judged not to be due to confounding, bias, chance, or error, a conclusion of a causal association is strengthened. However, if studies in several populations provide inconsistent results and no reason for the inconsistency is apparent, the data must be interpreted with caution. No general conclusion can be made that the exposure is a cause of the disease.
An important exercise is assessing the relation between the dose of exposure and the risk of disease. There is no question that radiation exposure at relatively high doses has caused disease and death (NRC 1990; UNSCEAR 2000b). However, at relatively low doses, there is still uncertainty as to whether there is an association between radiation and disease, and if there is an association, there is uncertainty about whether it is causal or not.
Following is a discussion of the basic elements of how epidemiologists collect, analyze, and interpret data. The essential feature of data collection, analysis, and interpretation in any science is comparability. The subpopulations under study must be comparable, the methods used to measure exposure to radiation and to measure disease must be comparable, the analytic techniques must ensure comparability, and the interpretation of the results of several studies must be based on comparable data.
COLLECTION OF EPIDEMIOLOGIC DATA
Types of Epidemiologic Studies
Research studies are often classified as experimental or observational depending on the manner in which the levels of the explanatory factors are determined. When the levels of at least one explanatory factor are under the control of the investigator, the study is said to be experimental. An example is a clinical trial designed to assess the utility of some treatment (e.g., radiation therapy). When the levels of all explanatory factors are determined by observation only, the study is observational. If treatment is assigned by a random process, the study is experimental. The majority of studies relevant to the evaluation of radiation risks in human populations are observational. For example, in the study of atomic bomb survivors, neither the conditions of exposure nor the levels of exposure to radiation were determined by design.
Two basic strategies are used to select participants in an observational epidemiologic study that assesses the association between exposure to radiation and disease: select exposed persons and look at subsequent occurrence of disease, or select diseased persons and look at their history of exposures. A study comparing disease rates among exposed and unexposed persons, in which exposure is not determined by design, is termed a “cohort” or a “follow-up” study. A study comparing exposure among persons with a disease of interest and persons without the disease of interest is termed a “case-control” or “case-referent” study.
Randomized Intervention Trials
Intervention trials are always prospective: subjects with some disease are enrolled into the study, and assignment is made to some form of treatment according to a process that is not related to the basic characteristics of the individual patient (Fisher and others 1985). In essence, this assignment is made randomly so that the two groups being studied are comparable except for the treatment being evaluated. Random is not the same as haphazard; a randomizing device must be used, such as a table of random numbers, a coin toss, or a randomizing computer program. However, random assignment does not guarantee comparability. The randomization process is a powerful means of minimizing systematic differences between two groups (“confounding bias”) that may be related to possible differences in the outcome of interest such as a specific disease. Further, blinded assessment of health outcome will tend to minimize bias in assessing the utility of alternative methods of treatment. Another important aspect of randomization is that it permits the assessment of uncertainty in the data, generally as p-values or confidence intervals. Intervention trials related to radiation exposure are conducted with the expectation that the radiation will assist in curing some disease. However, there may be the unintended side effect of increasing the risk of some other disease.
Although a randomized study is generally regarded as the ideal design to assess the possible causal relationship between radiation and some disease in a human population, there are clearly ethical and practical limitations in its conduct. There must be the expectation that in the population under study, radiation will lead to an improvement in health
status relative to any alternative treatment. Such studies are usually conducted with patients who need therapeutic intervention; patients may be randomly assigned to treatment with radiation or with some other form of treatment, or to different types or doses of radiation. In these trials the sample size is relatively small and the follow-up time is relatively short. Therefore, most studies to assess the long-term adverse outcomes of exposure to therapeutic radiation are, of necessity, cohort studies.
Cohort studies may be retrospective or prospective. In a retrospective cohort study of a population exposed to radiation, participants are selected on the basis of existing records such as those maintained by a company or a hospital (e.g., radiation badge records). These records were made out at the time an individual was working or treated and thus may be used as the historical basis for classification as a member of the exposed cohort. In a prospective cohort study, participants are selected on the basis of current and expected future exposure to radiation, and exposure information is measured and recorded as time passes. In both types of cohort study, the members of the study population are followed in time for a period of years, and the occurrence of new disease is measured. In a retrospective cohort study, the follow-up has already occurred, while in a prospective cohort study, the follow-up extends into the future. Many studies that are initiated as retrospective cohort studies become prospective as time passes and follow-up is extended.
The information available in a retrospective cohort study is usually limited to what is available from the written record. In general, members of the cohort are not contacted directly, and information on radiation exposure and disease must come from other sources. Typically, information on exposure comes from records that indicate the nature and amount of exposure that was accumulated by a worker or by a patient. On occasion, all that is available is the fact of exposure, and the actual dose may be estimated based on knowledge of items such as the X-ray equipment used (Boice and others 1978).
Information on disease also must come from records such as medical records, insurance records, or vital statistics. Cancer incidence and mortality are readily evaluated by retrospective cohort studies, because cancer registries exist in a number of countries or states and death from cancer is fairly reliably recorded.
Most studies that have followed patients treated with therapeutic radiation are retrospective cohort studies. Series of patients are assembled from medical and radiotherapy records, and initial follow-up is done from the date of therapy until some arbitrary end of follow-up. Patients treated as long ago as the 1910s have been studied to assess the long-term effects of radiation therapy (Pettersson and others 1985; Wong and others 1997a).
The information available in a prospective cohort study is potentially much greater than that available in a retrospective cohort study. Exposure is contemporaneous and may be measured forward in time, and members of the cohort may be contacted periodically to assess the development of any new disease. Direct evaluation of both exposure and disease may be done on an individual basis, with less likelihood of missing or incomplete information due to abstracting records compiled for a different purpose.
The follow-up of survivors of the Japanese atomic bomb explosions is largely prospective, although follow-up did not begin until 1950 (Pierce and others 1996). Exposure assessment was retrospective and was not based on any actual measurement of radiation exposure to individuals. Reconstruction of the dose of radiation exposure is an important characteristic of this study, and improvements in dose estimation continue to the present with a major revision of the dosimetry published in early 2005 (DS02).
The primary advantage of a retrospective cohort study is that time is compressed. If one wishes to evaluate whether radiation causes some disease 20–40 years after exposure, a retrospective study can be completed in several years rather than in several decades. The primary disadvantage of a retrospective cohort study is that limited information is available on both radiation exposure and disease. The primary advantage of a prospective cohort study is that radiation exposure and disease can be measured directly. The primary disadvantage is that time must pass for disease to develop. This leads to delay and expense. Most studies in radiation epidemiology are retrospective cohort studies.
Case-control studies may be prospective or retrospective. The cases are those individuals with the disease being studied. Cases in a retrospective case-control study are usually selected on the basis of existing hospital or clinic records (i.e., the cases are “prevalent”). In a prospective case-control study, the cases are “incident,” that is, they are selected at the time their disease was first diagnosed. Controls are usually nondiseased members of the general population, although they can be persons with other diseases, family members, neighbors, or others.
After the cases and controls have been identified, it is necessary to determine which members of the study population have been exposed to radiation. Usually, this information is obtained from interviewing the cases and the controls. However, if the case or control is deceased or unable to respond, exposure information may come from a relative or from another proxy.
The information available in case-control studies usually is less reliable than that collected in cohort studies. For example, consider the accuracy of dietary history for the past year versus that of a year from several decades in the past. Exposure information may be available only from interview
of the study subjects and therefore be less reliable than reliance on contemporary records. There may be differential recall of exposure to radiation depending on case or control status, which leads to a lack of comparability in the information available. It is rare to be able to quantify the amount of past exposure in a case-control study. However, in some situations related to radiation exposure, only data from case-control studies are available.
The critical differences between a retrospective cohort study and a case-control study are that subjects in the former are selected on the basis of exposure category at the start of the follow-up period and exposure measures are concurrent with the actual exposure. Conversely, in a case-control study, subjects and controls are selected on the basis of disease outcome, and past exposures must be reconstructed.
On occasion in epidemiology, a hybrid study is performed: the “nested” case-control study. A cohort study is conducted, and subsequently, additional information on exposure is collected for persons with disease and for a sample of persons without disease. For example, radiation exposure among persons with a second cancer may be compared to that among a sample of those without a second cancer. Nested case-control studies are best thought of as a form of retrospective cohort study, in that the study population is initially defined on the basis of exposure rather than of disease.
In evaluation of the possible health effects of exposure to ionizing radiation, many of the informative case-control studies have been nested within cohorts. Exposure measures in these studies are generally not based on interview data, but rather on review of available records, sometimes supplemented by extensive modeling and calculations. In some nested studies, the objective is to obtain information on dose or other factors that would be too expensive to obtain for the entire cohort. Examples are a case-control study of selected cancers in women irradiated for cervical cancer to obtain individual dose estimates (Boice and others 1985); a breast cancer study of A-bomb survivors to obtain data on reproductive factors through interview (Land and others 1994b); and a study of lung cancer in Hanford workers to extract smoking histories from medical records (Petersen and others 1990).
Comparability in Study Design
The design of an epidemiologic study must ensure comparability in the selection of study participants, comparability in the collection of exposure and disease information relevant to each study subject, and comparability of the basic characteristics of the study subjects. Any lack of comparability may undermine inferences about an association between exposure and disease, so that interpretation is ambiguous or impossible.
Comparability in a clinical trial ordinarily is straightforward, because study subjects are assigned randomly to the various forms of treatment being evaluated. Random assignment prevents selection on the basis of outcome and provides the optimum strategy for minimizing differences between the two groups being studied. Comparability in a cohort study means that subjects exposed to radiation and unexposed subjects are enrolled without knowledge of disease status, that information on disease is obtained without knowledge of exposure status, and that other factors related to disease occurrence are not related to exposure status.
Lack of comparability in any of these epidemiologic study designs may lead to one or another form of bias, which in turn may minimize or invalidate any information contained in the data from the study. Three common and potentially serious forms of bias are selection bias, when enrollment into a study is dependent on both radiation exposure and disease status; information bias, when information on disease or on radiation exposure is obtained differentially from exposed or from diseased persons; and confounding bias, when a third factor exists that is related to both radiation exposure and disease effects.
Selection bias is generally a minor issue in clinical trials and cohort studies, including retrospective cohort studies. In a prospective cohort study, disease has not yet occurred, so there is little possibility of selecting exposed persons on the basis of their future disease status. Exceptions are rare and limited to situations in which some preclinical sign or symptom affects selection—for example, when persons volunteer for one or another intervention because they know that they are at special risk.
By contrast, selection bias can be a major issue in case-control studies, because both exposure and disease already have occurred when the study subjects are enrolled; there is the danger that persons who are both exposed and diseased will be overselected to participate in the study. If this occurs, the data contain invalid information on the true relation between exposure and disease. Self-selection (volunteering) for a nonexperimental study can be a particularly potent source of bias.
An example of selection bias occurred in a study of leukemia among workers at the Portsmouth, New Hampshire, Naval Shipyard (Najarian and Colton 1978). In an initial case-control study, persons with leukemia who had been occupationally exposed to radiation were widely known and hence more likely to be located and enrolled than were unexposed workers with leukemia, and a positive association between radiation and leukemia was reported. Subsequently, after an extensive follow-up of all members of the workforce, no association between radiation exposure and leukemia was found (Greenberg and others 1985). The initial preferential selection of diseased workers who were exposed to radiation led to an erroneous appearance of a positive association between radiation and leukemia.
Information bias may occur in a clinical trial or a cohort study if knowledge of exposure is available when information on disease is being obtained; there is the possibility that disease will be diagnosed more frequently among exposed persons than among nonexposed persons. For this reason, in obtaining information on disease among participants, information on exposure is kept hidden (blinded), so that any error in disease ascertainment occurs equally among exposed and unexposed persons.
Information bias is a major threat in a case-control study if knowledge of disease is available when information on exposure is being obtained; there is a possibility that exposure will be ascertained more completely among diseased persons than among nondiseased persons. For this reason, in obtaining information on exposure among participants, information on disease is kept hidden from the interviewer and, if possible, from the respondent (blinded), so that any error in exposure ascertainment occurs equally among diseased and nondiseased persons. Further protection against information bias may come from blinding subjects and/or interviewers to the hypothesis under study.
Information bias as well as selection bias affected the Portsmouth Shipyard Study (Najarian and Colton 1978). In the initial case-control study, information on radiation exposure was obtained by interview of relatives of workers with and without leukemia. Subsequently, it was found that relatives of those with leukemia tended to overreport radiation exposure, whereas relatives of those without leukemia tended to underreport exposure (Greenberg and others 1985).
Confounding bias is a basic issue in all epidemiologic studies where no random assignment of exposure has occurred; this is the usual situation except for randomized clinical trials. No one type of nonexperimental epidemiologic study is inherently more subject to confounding bias. If information is available on each factor that is suspected of being a confounder, confounding bias may be minimized in a study design by matching on the relevant factors or in data analysis by stratification or statistical adjustment. However, if some confounding factor has not been measured, the resulting measure of association may be biased. Thus, interpretation of the data must take into account the possible influence of potential confounding. Confounding bias is especially troublesome when the association under investigation is weak. In this case, a confounder has the potential to mask an association completely or to create an apparent effect. Because the risks associated with low levels of ionizing radiation are small, confounding bias is potentially important in low-level radiation studies.
A third factor (other than exposure and disease) can be confounding only when it is associated with both the exposure and the disease. Association only with exposure or only with disease is not sufficient for a factor to be confounding.
The so-called healthy worker effect is an example of confounding in studies of mortality among occupational groups, including those employed in the nuclear industry (Monson 1990). Ordinarily, persons who enter the workforce are healthy, and if mortality among workers is compared to that among the general population, the workers are found to be at a relatively low risk. If all members of the workforce were exposed to radiation, one interpretation would be that radiation reduces the risk of death.
In a clinical trial, assignment to a type of specific exposure is ordinarily a random process so that, on average, the two groups being compared are comparable with respect to possible confounding factors. Thus, in a randomized trial, confounding—although possible—is less of a concern than in a cohort or a case-control study.
An important part of any epidemiologic study is its statistical power (i.e., the probability that under the assumptions and conditions implicit in the model, it will detect a given level of elevated risk with a specific degree of significance). The power of a cohort study will depend on the size of the cohort, the length of follow-up, the baseline rates for the disease under investigation, and the distribution of doses within the cohort, as well as the magnitude of the elevated risk. Similarly, statistical power in a case-control study depends on the number of cases, the number of controls per case, the frequency and level of exposure, and the magnitude of the exposure effect. Statistical power is generally evaluated before a study is conducted. Afterwards it is more useful to refer to statistical precision, which is reflected in the width of the confidence intervals for risk estimates (UNSCEAR 2000b).
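The dependence of power on cohort size can be illustrated with a simplified two-group calculation; the baseline risk, relative risk, and group sizes below are hypothetical, and the normal approximation is only a rough sketch of how power is actually computed:

```python
from math import sqrt
from statistics import NormalDist

def cohort_power(baseline_risk, relative_risk, n_per_group, alpha=0.05):
    """Approximate power of a two-sided comparison of disease
    proportions in equal-sized exposed and unexposed groups
    (normal approximation; a simplified illustration only)."""
    p1 = baseline_risk * relative_risk  # risk among the exposed
    p0 = baseline_risk                  # risk among the unexposed
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(p1 * (1 - p1) / n_per_group + p0 * (1 - p0) / n_per_group)
    return NormalDist().cdf(abs(p1 - p0) / se - z_crit)

# A small elevated risk (RR = 1.5 on a 1% baseline) is detected far
# more reliably in a larger cohort
power_small = cohort_power(0.01, 1.5, 5_000)
power_large = cohort_power(0.01, 1.5, 50_000)
```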
ANALYSIS OF EPIDEMIOLOGIC DATA
The basic data collected in an epidemiologic study are data on exposure and data on disease. In the simplest form, an individual may be exposed or not and may be diseased or not. Thus, there are four possibilities: exposed and diseased, exposed and not diseased, not exposed and diseased, or not exposed and not diseased. Typically, these data are entered into a “fourfold table” (Table 5-1).
It can be seen that in a study of N individuals, a + b are exposed, a + c are diseased, and a are both exposed and diseased. Interest is generally focused on whether a is larger than expected in relation to the other entries. Mathematically this is the same as asking whether d is larger than expected, or whether b or c are smaller than expected. Accurate counts in all four cells are necessary for valid inferences about whether the disease is associated with the exposure. The rate of disease among the exposed subjects (Re) is equal to a/(a + b), and the rate of disease among the unexposed subjects (Rn) is equal to c/(c + d).

TABLE 5-1 The Fourfold Table

              Diseased    Not Diseased    Total
Exposed       a           b               a + b
Unexposed     c           d               c + d
Total         a + c       b + d           N
Measures of Association
Two measures are commonly used to compare the disease rates between exposed and unexposed subjects. The relative risk (RR) is the ratio of the two rates; that is, RR = Re/Rn. The ERR is given by ERR = RR − 1 = Re/Rn − 1 = (Re − Rn)/Rn. These ratios are dimensionless. The rates can also be subtracted rather than divided. The difference between Re and Rn, that is, Re − Rn, is termed the “attributable risk,” or “risk difference.” It is also referred to as the excess risk (ER) or the EAR, with the latter terminology commonly used in radiation epidemiology. The ER and EAR are often expressed as the number of excess cases or deaths per person-year (PY) or, for convenience, per 1000 PY.
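These measures can be computed directly from the entries of the fourfold table; the counts below are hypothetical:

```python
# Hypothetical counts for the four cells of Table 5-1
a, b = 30, 9_970   # exposed: diseased, not diseased
c, d = 10, 9_990   # unexposed: diseased, not diseased

re = a / (a + b)   # rate of disease among the exposed
rn = c / (c + d)   # rate of disease among the unexposed

rr = re / rn       # relative risk
err = rr - 1.0     # excess relative risk
ear = re - rn      # excess absolute (attributable) risk
```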
In radiation studies, information on radiation dose is often available. Either of the measures, ERR or EAR, can be expressed per unit of radiation dose. In the simplest situation, one has exposed and unexposed groups and information on the average dose D received by exposed subjects. The ERR coefficient is then defined as

    ERR/D = (Re − Rn)/(Rn × D),

and the absolute risk coefficient is defined as

    EAR/D = (Re − Rn)/D,

where the rates Re and Rn are expressed per person-year (PY) of follow-up.
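As a numerical sketch, these two coefficients can be computed as follows; the rates and mean dose are hypothetical:

```python
# Hypothetical rates per person-year and a mean dose among the exposed
re = 0.0030   # rate among the exposed, per PY
rn = 0.0010   # rate among the unexposed, per PY
dose = 0.5    # average dose D among the exposed, in Gy

err_per_unit_dose = (re - rn) / (rn * dose)  # ERR coefficient, per Gy
ear_per_unit_dose = (re - rn) / dose         # EAR coefficient, per PY per Gy
```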
Both measures may depend on variables such as sex, age at exposure, time since exposure, and age at risk (attained age). The ERR expresses risk and its dependencies relative to risk in the unexposed, whereas the EAR expresses risk and its dependencies independent of risk in the unexposed. The RR (or ERR) has certain statistical advantages and is the more commonly used measure for epidemiologic studies, especially etiologic studies. The EAR is a useful measure for estimating the burden of risk in a population, including the dependence of this burden on various factors. Both measures can be used to estimate absolute lifetime risk as discussed in Chapters 11 and 12.
In some of the more informative radiation studies, dose estimates for individual subjects are available. In this case, more complex statistical regression methods are used to estimate the ERR and EAR per unit of radiation dose based on the assumption of a linear dose-response. These methods have been used in analyses of data on Japanese A-bomb survivors and on some medically exposed populations. The reader should consult Chapters 6 and 7 for further discussion of this approach.
Instead of categorizing persons with radiation exposure as simply being exposed or not, subjects may be categorized as having high, medium, or no exposure. In this case, there would be a sixfold table—three rows and two columns. Such data are of value in assessing whether or not there is a dose-response relationship between radiation exposure and disease. If the rate of disease is highest among the most exposed, intermediate in the middle exposure group, and lowest among those with no exposure, a dose-response relationship exists. In this report, only data that are of utility to a quantitative assessment of a dose-response relationship between radiation exposure and disease are included.
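A minimal check of such a dose-response pattern can be sketched as follows; the counts for the three exposure categories are hypothetical:

```python
# Hypothetical counts (diseased, not diseased) for three exposure levels
counts = {
    "none":   (10, 9_990),
    "medium": (20, 9_980),
    "high":   (40, 9_960),
}

rates = {level: dis / (dis + well) for level, (dis, well) in counts.items()}

# A dose-response pattern: the rate rises monotonically with exposure
ordered = [rates["none"], rates["medium"], rates["high"]]
print(ordered == sorted(ordered))
# True
```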
For radiation, we are generally interested in going beyond just deciding if there is a causal relationship. An important strength of radiation epidemiology is the availability of quantitative information on dose. Only by relating effects to dose can results be compared across studies or used to predict risks from exposures in other settings.
Tools of Statistical Inference
The second task in data analysis is assessing the statistical precision of an ERR or other measure of association calculated from data. Statistical estimates calculated from data are imprecise, or variable, in the sense that replication of the study (with identical conditions of exposure and levels of exposure, but with a different random sample of subjects) would likely result in a different estimate of risk. Thus, it is important to determine whether the actual observed association (e.g., an RR different from 1.0) can be explained by chance (random variation) alone. In epidemiologic studies the assessment of precision is usually accomplished via the calculation of p-values or confidence intervals.
The validity of both p-values and confidence limits rests on many assumptions about the study design and the data. Statistical results are often most correct when deviations from the assumptions are small, that is, the procedures are “robust.” It is the task of the investigator and any subsequent analyst to know the assumptions and to ensure that they are sufficiently close to reality.
Consider a hypothetical replication of the study in which the true RR is 1.0 (i.e., disease outcome is not related to exposure). The RR estimated from the hypothetical replication will not equal 1.0 exactly, but will vary randomly around the true value of 1.0. The p-value of the actual study is the probability that the RR estimated from the hypothetical data is more extreme in its difference from 1.0 (in either direction) than the RR estimated from the actual sample. A small p-value means that an RR as different from 1.0 as the one actually observed would be unlikely to arise from data having a true RR of 1.0. In other words, a small p-value provides evidence that the true RR is different from 1.0; the smaller the p-value, the stronger the evidence.
The confidence interval and p-value are based on the same theory; they use the theory in slightly different ways to answer slightly different questions. A p-value is appropriate for answering a confirmatory question such as, Is 1.0 a believable value of RR? A confidence interval is appropriate for answering an exploratory question, such as, What are the believable values of RR? Obviously, a confidence interval lends partial information to the confirmatory question, since values not in the 95% confidence interval are “rejected” at the significance level of 0.05. The p-value adds information, however, since it conveys a degree of evidence. For example, p-values of .049 and .00000049 provide quite different measures of the believability of the hypothesis (of RR equal to 1.0, say), even though the 95% confidence interval excludes 1.0 in both cases.
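The link between the two summaries can be sketched with the usual normal approximation on the logarithm of the RR; the RR and standard errors below are invented for illustration:

```python
import math

# Sketch (hypothetical numbers) of the normal-approximation link between a
# 95% confidence interval for RR and the p-value for the hypothesis RR = 1.0.
def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def summarize(rr, se_log_rr):
    """Two-sided p-value for RR = 1.0 and 95% CI, on the log-RR scale."""
    z = math.log(rr) / se_log_rr
    p = 2.0 * (1.0 - norm_cdf(abs(z)))
    lo = math.exp(math.log(rr) - 1.96 * se_log_rr)
    hi = math.exp(math.log(rr) + 1.96 * se_log_rr)
    return p, (lo, hi)

# Same estimated RR, but very different precision.
for rr, se in [(1.5, 0.20), (1.5, 0.05)]:
    p, (lo, hi) = summarize(rr, se)
    print(f"RR = {rr}, 95% CI = ({lo:.2f}, {hi:.2f}), p = {p:.2g}")
# Both intervals exclude 1.0, yet the p-values convey very different
# degrees of evidence against RR = 1.0.
```
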
Statistical precision is determined largely by study size (number of subjects). Larger studies generally result in more precise estimates. Small effects (RRs near 1.0) are generally more difficult to detect than large effects, because a confidence interval centered close to 1.0 is likely to include 1.0 unless the sampling variance is small. One consequence is that very large studies are required to estimate small effects precisely. This explains in part why risk models cannot be based exclusively on low-dose studies. The RRs associated with low doses are close to 1.0 and thus can be estimated precisely only in very large studies.
Control of Confounding
The third task in data analysis is to assess whether or not the crude association that is observed in a study is due to confounding by one or more other factors. For example, in assessing the relation between radiation and lung cancer, one should consider whether cigarette smoking is a confounding factor. Cigarette smoking is a recognized cause of lung cancer, and thus there is an association between smoking and lung cancer. If persons who are exposed to radiation, such as uranium miners, smoke more than persons who are not exposed, they may have an increased risk of lung cancer just from the smoking. Thus, unless the analysis deals with smoking as well as radiation, it is possible that an association between radiation and lung cancer seen in data only reflects the confounding influence of cigarette smoking.
In data analysis, the simplest way to assess whether or not confounding is present is to stratify on the confounding factor. That is, two fourfold tables relating the exposure (radiation) to the disease (lung cancer) are set up: assuming that all smokers smoke the same amount, one table contains data only for smokers and the second contains data only for nonsmokers. Within each of these two tables, no confounding by smoking is possible.
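A purely hypothetical numerical sketch (all rates and smoking proportions invented) shows how stratification exposes confounding: within each smoking stratum the exposed and unexposed rates are identical, so the stratum-specific RR is 1.0, yet the crude RR is well above 1.0 simply because the exposed group contains more smokers:

```python
# Invented illustration of confounding by smoking.
# Disease rates depend only on smoking, not on radiation exposure.
rates = {"smoker": 0.020, "nonsmoker": 0.002}          # per person-year
smoking_fraction = {"exposed": 0.8, "unexposed": 0.2}  # proportion who smoke

# Crude (unstratified) rate in each exposure group is a smoking-weighted mix.
crude = {}
for group, frac in smoking_fraction.items():
    crude[group] = frac * rates["smoker"] + (1 - frac) * rates["nonsmoker"]

crude_rr = crude["exposed"] / crude["unexposed"]

# Within each smoking stratum, exposure does not change the rate at all.
stratum_rr = {stratum: 1.0 for stratum in rates}

print(f"crude RR = {crude_rr:.2f}; stratum-specific RRs = "
      f"{stratum_rr['smoker']:.1f} (smokers), {stratum_rr['nonsmoker']:.1f} (nonsmokers)")
```

The inflated crude RR here reflects only the different smoking prevalence in the two exposure groups, which is exactly the pattern stratified tables are designed to reveal.
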
If it is necessary to control more than one confounding factor in the analysis of epidemiologic data, it is usual to construct a multivariate model relating exposure to disease and controlling for the potential confounding effect of a number of other factors. For example, sex and age are two factors that are commonly included in multivariate models. Such modeling is similar to stratification on a number of confounders and summarizing results in a standardized RR with associated confidence interval.
Linear Relative Risk Model
A model that plays a prominent role in radiation epidemiology studies is one in which the RR is a linear function of dose. In its simplest form,

RR(D) = 1 + βD,
where D is dose, RR(D) is the relative risk at dose D, and β is the ERR per unit of dose, which is usually expressed in grays or sieverts. In more complex forms, β is allowed to depend on gender, age at diagnosis, and other variables.
This linear RR model has been used extensively in radiation epidemiology, including studies of A-bomb survivors (Chapter 6), persons exposed for medical reasons (Chapter 7), and nuclear workers (Chapter 8). The model has served as the basis of cancer risk estimation by three BEIR committees (NRC 1988, 1990, 1999), by the 2000 UNSCEAR committee (2000b), and by the National Institutes of Health (NIH 2003). It also plays an important role in developing the BEIR VII committee’s cancer risk estimates (Chapter 12). The linear model has been chosen because it is supported by radiobiological models (Chapter 2) and because it fits the data from most studies (although in many studies, statistical power is inadequate to distinguish among different dose-response functions).
In the simplest situation, in which one has exposed and unexposed groups and information on the average dose D received by exposed subjects, β is estimated by (Re − Rn)/(RnD), where Re and Rn are the disease rates in the exposed and nonexposed groups, respectively. In many radiation studies, however, doses for individual subjects are available, and more complex estimation procedures are required to make use of this information. Preston and colleagues (1991) have developed the EPICURE software, which allows for flexible modeling of both relative and absolute risks, including the fitting of linear RR models.
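The grouped-data estimate can be sketched numerically. The rates echo the introductory example (75 vs. 25 per 100,000 per year), and the mean dose of 1 Gy in the exposed group is an assumption made purely for illustration:

```python
# Sketch of the simplest estimate of the ERR per unit dose,
# beta = (Re - Rn) / (Rn * D).  All numbers are hypothetical.
Re = 75 / 100_000   # disease rate in the exposed group (per person-year)
Rn = 25 / 100_000   # disease rate in the unexposed group (per person-year)
D = 1.0             # assumed mean dose in the exposed group (Gy)

beta = (Re - Rn) / (Rn * D)   # ERR per gray
rr = 1.0 + beta * D           # linear relative risk model evaluated at dose D

print(f"beta = {beta:.1f} per Gy, RR({D:g} Gy) = {rr:.1f}")
```

Note that this is algebraically the same as (RR − 1)/D, the excess relative risk per unit dose, consistent with the definitions given earlier.
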
Prentice and Mason (1986) and Moolgavkar and Venzon (1987) discuss inferences based on the linear RR model and note that the distribution of the maximum likelihood estimate of β may be highly skewed, and that confidence intervals based on estimates of the asymptotic standard error (Wald method) can be seriously misleading. Reparameterizing the model as β = exp(α) is sometimes helpful but does not allow for the possibility that β or its lower confidence bound may be negative. Another difficulty is that, to ensure that the RR is nonnegative, it is necessary to constrain the parameter β to be larger than −1/DMAX, where DMAX is the maximum dose in the study. These problems may be particularly severe in studies of nuclear workers, where dose distributions are highly skewed and estimates of β are often very imprecise. For this reason, tests and confidence intervals in nuclear worker studies have sometimes
been based on the likelihood ratio, or on score statistic approximations, or on computer simulations (Gilbert 1989), which can lead to intervals that are not symmetric on either a linear or a logarithmic scale. In some situations, especially in studies with sparse data, the estimate and/or the lower confidence bound for β may be negative; some investigators report such findings simply as <0.
INTERPRETATION OF EPIDEMIOLOGIC DATA
Assessment of Associations
After epidemiologic data have been collected and analyzed, the associations noted in the data must be interpreted. The measures of association and of statistical precision that have been computed have no inherent meaning; they reflect only the data that have been accumulated in the study. It is possible that these data have resulted from bias, error, or chance and thus have no interpretive meaning. A formal evaluation of the study design and of the methods used to collect and analyze the data is needed to assess the meaning of the data.
The first step in the interpretation of data is to assess the methods used in the study itself. The following questions must be considered:
Is there evidence that selection bias has been avoided in enrolling the study subjects?
Is there evidence that information bias has been minimized in assessing exposure or disease?
Is there evidence that the potential confounding influence of other factors has been addressed?
Is there evidence for sufficient precision in the measure of exposure or of disease to permit a reasonable basis for interpretation?
The possible occurrence of selection bias or of information bias may be assessed only by evaluation of the methods used in data collection. If either of these biases is judged to have an appreciable likelihood of being important, no analyses can be conducted to adjust for the error that may have been introduced. The data must be regarded as unsuitable for the purpose at hand. In contrast, potential confounding bias can be assessed and usually controlled by analytic strategies for factors on which information has been collected. There will always remain factors that have the potential for confounding but for which no information is available, including factors that are not even suspected of being confounders. This does not mean that no interpretation is possible, but it does mean that some degree of caution is needed in interpreting any association between radiation exposure and disease.
Chance is always a possible explanation for any association (or lack of association) in a scientific study, no matter how strong or how statistically significant the association. The p-value or confidence interval that is computed estimates only the likelihood that chance alone could have accounted for the observed association. The p-value does not distinguish between a true association and one that is due to bias or error. Also, interpretation of the likely range of an association based on its confidence interval reflects only the play of chance, not of error or bias. In addition, rare events do happen. Each p-value or confidence interval should be examined with some care to determine whether a rare event is a plausible explanation for the statistical findings. Interpretation of the results of statistical analysis is as much an art as a science.
In all epidemiologic studies, measures of exposure and measures of disease are imprecise. This imprecision is not considered an error in methodology, but rather an inevitable occurrence associated with the assessment of observational data. When errors in measuring disease or exposure are random, unrelated to true disease and exposure, and independent among subjects, it is usually the case that measures of association are attenuated. That is, RRs are biased toward 1.0, the case of no association. In radiation epidemiology, errors in measuring disease (e.g., misdiagnosing cancer) are not different from disease misclassification problems in other epidemiology studies. Thus, the effect of disease misclassification is reasonably well understood. However, exposure measurement error problems in radiation epidemiology are often unique to radiation studies, and the effect of such errors generally is less well understood.
For most radiation epidemiology studies, measurements of exposure were not made at the time of exposure, but rather have been reconstructed some time after exposure using available information. For example, exposures for A-bomb survivors are calculated using sophisticated models for the spatial intensity of radiation and information about a subject’s location and local shielding at the time of exposure. It is likely that such measurements contain both random and nonrandom components. The effects of random errors in exposure measurements are reasonably well understood and include, in general, attenuation of estimated associations, underestimation of linear risk coefficients, and possible distortion of the shape of the dose-response relationship. The severity of these effects generally depends on the magnitude of the measurement errors (as measured by their variance) relative to the variability in true exposures. The effects of nonrandom errors in exposure measurements are specific to the nature of the error. For example, if a dosimetry system systematically overestimated exposures by 10%, the dose-response relationship would erroneously be stretched over a greater range of doses, the slope of the fitted line would be reduced, and linear risk coefficients would be underestimated by approximately 10%.
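A small simulation, with entirely invented parameters, illustrates the attenuation effect of classical (random, independent) error in measured doses on a fitted dose-response slope:

```python
import random
import statistics

# Invented simulation: classical measurement error attenuates a fitted slope.
random.seed(0)
n = 20_000
true_slope = 2.0

# True doses and outcomes generated from a linear dose-response with noise.
true_dose = [random.gauss(0.0, 1.0) for _ in range(n)]
outcome = [true_slope * d + random.gauss(0.0, 1.0) for d in true_dose]

# Observed dose = true dose plus independent random error of equal variance.
obs_dose = [d + random.gauss(0.0, 1.0) for d in true_dose]

def ols_slope(x, y):
    """Least-squares slope: cov(x, y) / var(x)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / statistics.variance(x)

print(f"slope using true doses:     {ols_slope(true_dose, outcome):.2f}")
print(f"slope using measured doses: {ols_slope(obs_dose, outcome):.2f}")
# With error variance equal to the true-dose variance, the fitted slope is
# attenuated toward roughly half its true value, consistent with the
# attenuation factor var(true) / (var(true) + var(error)).
```
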
A second step in evaluating whether some exposure causes some disease is to assemble all of the relevant literature and to display all of the data that are regarded as relevant and of adequate quality. On occasion, a so-called meta-analysis is conducted in which there is a quantitative summarization of the data. Such an analysis is not a necessary step and in fact may not be indicated. Only data from valid studies may be included in a meta-analysis, and among valid studies, all studies must contain similar information. In essence, a meta-analysis is a formal rather than an informal summarization of the epidemiologic literature.
A pooled analysis of data from similar studies is not the same procedure as a meta-analysis, but rather a useful extension of basic data analysis. An important tool for obtaining a broad assessment of the evidence from several studies is to conduct combined analyses of data from groups of similar studies. Analyses based on combined data provide tighter confidence limits on risk estimates than analyses based on data from any single study population. To the extent that biases found in individual studies tend to cancel out, combined analyses may help to reduce bias that results from confounding and other potential sources of bias. Such analyses also help to determine if differences in findings among studies are truly inconsistent or are simply the result of chance fluctuations. The application of similar methodology to data from all populations, in addition to the presentation of results in a comparable format, facilitates comparison of results from different studies.
A third step in interpreting epidemiologic data is to compare the results of an individual study with those of similar studies. The goal of such an exercise is to reach a judgment about whether, in general, it may be concluded that under certain conditions, an exposure causes a disease.
The so-called Bradford Hill criteria are the standard criteria used to assess whether the general epidemiologic literature on some exposure or some disease provides sufficient information to judge causality (Hill 1966). These criteria have been expanded, reduced, revised, and reinterpreted by countless authors to meet their special needs, but the core idea remains—use rational operational criteria to judge evidence from observational studies. A revised version of the Hill criteria follows:
Consistency—An association is seen in a variety of settings.
Specificity—The association is well defined rather than general.
Strength—The RR is well above or well below 1.0 rather than close to 1.0.
Dose-response—The higher the exposure, the higher is the rate of disease.
Temporal relationship—The exposure occurs before the disease.
Coherence—The association is believable based on information from other scientific disciplines.
Statistical significance—The association is unlikely to be explained by chance alone.
Each of these criteria should be considered in assessing whether an association between exposure and disease can be judged to be causal. Except for temporal relationship, there need not be evidence for each of these criteria.
With respect to the association between exposure to ionizing radiation and health outcomes, the Hill criteria are of limited current value for human cancer: ionizing radiation at high doses is acknowledged to be a cause of most relatively common human cancers (IARC 2000). The presence of a dose-response relationship for many cancers is considered strong evidence for a causal relationship. For less common cancers and for diseases other than cancer, there are not sufficient data to apply the Hill criteria. IARC (2000) notes: “A number of cancers, such as chronic lymphocytic leukaemia, have not been linked to exposure to x or γ rays.”
Assessment of Dose-Response Relationships
As noted above, evaluation of a dose-response relationship is one of the Hill criteria to be applied in assessing whether or not an association is judged to be causal. With respect to providing a risk estimate for low-dose, low-linear energy transfer radiation in human subjects, other information is necessary. Specifically, one needs relatively accurate information for individuals on dose from ionizing radiation, as well as a relatively complete measure of the incidence of or mortality from diseases. To date, the data from the survivors of the atomic bomb in 1945 in Hiroshima and Nagasaki have been the primary source of such information. The Radiation Effects Research Foundation has been responsible for estimating the exposure of individuals and for measuring the incidence and mortality of cancer and other diseases.
One of the primary tasks of this committee has been to evaluate the data that are available from studies of populations exposed to medical radiation, occupational radiation, and environmental radiation so as to assess whether information on dose-response associations from these data sources can be assembled and to evaluate whether such information can be compared to that obtained from the populations exposed to radiation from the atomic bombs. Chapters 7, 8, and 9 address these studies.