Epidemiologic Studies of Veterans Exposed to Depleted Uranium: Feasibility and Design Issues

Suggested Citation: "2 Elements of an Epidemiologic Study." Institute of Medicine. 2008. Epidemiologic Studies of Veterans Exposed to Depleted Uranium: Feasibility and Design Issues. Washington, DC: The National Academies Press. doi: 10.17226/12200.

2 Elements of an Epidemiologic Study

This chapter describes the elements of an epidemiologic study that are essential in assessing the relationship between exposure to depleted uranium (DU) and health outcomes. The elements include identification of a relevant study population of adequate size; appropriate assessment and accurate measurement of uranium exposure in the population, including the use of biomarkers when available; an evaluation of long-term health outcomes; adequate followup time; use of reasonable methods for controlling critical confounding factors and minimizing bias; and appropriate statistical analyses. Key issues related to sample size requirements for ensuring adequate statistical power in detecting small effects, the accuracy of exposure measurement, and the need to control critical confounders are addressed in detail. The chapter also briefly describes the various types of epidemiologic studies.

IDENTIFYING STUDY POPULATIONS

Identifying and ascertaining a relevant study population and a control or reference population are crucial steps in conducting a well-designed epidemiologic study. It is essential that the study population be representative of the population of interest and that it be large enough to ensure adequate statistical power. A representative sample that is large enough will be more likely to capture accurate information about a purported association. In fact, a primary method of reducing sampling error in an epidemiologic study is to enlarge the sample.

The population of interest is active-duty military and veterans, so it is critical that this population serve as the study population. By limiting the participants to military personnel or veterans, it will be possible to generalize the results to the population of interest. Selection bias occurs when the study participants differ from nonparticipants in characteristics that might not be observed, that is, when the groups differ in measured or unmeasured baseline characteristics because of how participants were selected or assigned. Information on age would have to be captured, given the distribution of age among military personnel and the possibility that age is a confounding factor.

Sample Size

As discussed above, adequate sample size is critical in conducting a well-designed epidemiologic study. A previous Institute of Medicine (IOM) report discussed the importance of adequate sample size for studying the health of Gulf War veterans: "sufficient samples sizes for each cohort in the study are crucial to ensure adequate statistical power to find differences as well as to reliably identify the lack of differences between groups" (IOM, 1999).

Sample-size calculations are based on the expected magnitude of the difference between the exposed and unexposed groups, the relative sizes of the groups to be compared, and specified levels for type I error (the error of rejecting the null hypothesis when it is true) and type II error (the error of failing to reject the null hypothesis when the alternative hypothesis is true). Adequate followup time is also important, especially if the health outcome of interest has a long latent period (latency, followup, and other factors that should be taken into consideration in calculating sample size are discussed in more detail later in this chapter).

To gain a sense of the expected sample sizes required for a high-quality epidemiologic study, the committee generated sample-size estimates for a cancer outcome (lung cancer) and a renal-function outcome (serum creatinine concentration).
Those outcomes were selected because they are among the ones identified as having high priority for further study by the committee in its report Gulf War and Health: Updated Literature Review of Depleted Uranium (IOM, 2008). Sample sizes for other high-priority health outcomes (lymphomas, respiratory disease, neurologic outcomes (including neurocognitive outcomes), and adverse reproductive and developmental outcomes) and for mortality should also be considered. Those outcomes may be defined either by the diagnosis of a specific disease entity (such as lymphoma) or by measurement of an important physiologic variable (such as serum creatinine or forced expiratory volume in 1 second as an indicator of renal function or lung function, respectively).

Lung Cancer

The committee estimated the minimum sample size required for detecting a statistically significant difference in risk of lung cancer between DU-exposed and unexposed subjects. Given that lung cancer is one of the more common cancers, the calculations can be viewed as a "best-case" scenario in that detecting outcomes that are less common would be even more difficult and require larger samples than those described below.

For the committee's calculations, the type I error was set at the conventional value of 5%, and power at 80%. Without a large enough sample to ensure a reasonable probability of detecting an association of a specified magnitude when one exists, there is a risk of a false-negative result (that is, failure to detect an association in the sample when one truly exists in the population).

According to the American Cancer Society, the lifetime risk of a lung-cancer diagnosis is 7.91% (1 in 13) in men, and 6.18% (1 in 16) in women (ACS, 2008). Assuming that subjects can be followed long enough and closely enough to capture most lung cancers, the calculations are based on a lifetime risk of 7.91% in men. Estimates of excess lifetime risk of lung cancer from DU exposure have been reported and range from 0.06 to 0.3% (summarized in Chapter 5 of IOM, 2008). If those increases are applied to the baseline risk, the estimated likely value of the relative risk (RR) would be between 1.008 and 1.038. Because those values are only rough estimates of the relative increase in risk, the committee computed sample sizes for various RRs, including 1.005, 1.01, 1.025, and 1.05.

Finally, given that exposure to DU is relatively rare, the committee calculated the sample size for a cohort study assuming that 4 times as many unexposed subjects as exposed subjects would be studied (a 4:1 ratio requires more subjects overall than using an equal number of unexposed and exposed subjects but requires fewer exposed subjects). That ratio of unexposed to exposed subjects was selected in recognition that sampling ratios greater than 4:1 yield only minimal increments in power and precision.
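
As a rough illustration of the calculation described above, the following Python sketch derives the RR range from the baseline and excess lifetime risks and then estimates the number of exposed subjects required. The committee does not state which sample-size formula it used; the sketch assumes a standard normal-approximation formula for comparing two proportions with unequal group sizes and no continuity correction, and the function names are illustrative only.

```python
import math
from statistics import NormalDist


def relative_risk(baseline_risk, excess_risk):
    """RR implied by adding an absolute excess risk to a baseline risk."""
    return (baseline_risk + excess_risk) / baseline_risk


def exposed_needed(p0, rr, ratio=4.0, alpha=0.05, power=0.80):
    """Approximate number of exposed subjects for a two-sided test of two
    proportions, with `ratio` unexposed subjects per exposed subject.
    Assumed formula: normal approximation, no continuity correction."""
    p1 = p0 * rr                                  # risk in the exposed group
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + ratio * p0) / (1 + ratio)       # pooled risk under the null
    null_sd = math.sqrt((1 + 1 / ratio) * p_bar * (1 - p_bar))
    alt_sd = math.sqrt(p1 * (1 - p1) + p0 * (1 - p0) / ratio)
    n = (z_alpha * null_sd + z_beta * alt_sd) ** 2 / (p1 - p0) ** 2
    return math.ceil(n)


BASELINE = 0.0791  # lifetime lung-cancer risk in men (ACS, 2008)

# RR range implied by excess lifetime risks of 0.06% and 0.3%
print(round(relative_risk(BASELINE, 0.0006), 3))
print(round(relative_risk(BASELINE, 0.003), 3))

# Exposed and unexposed counts at a 4:1 ratio for the RRs considered
for rr in (1.005, 1.01, 1.025, 1.05):
    n_exposed = exposed_needed(BASELINE, rr)
    print(rr, n_exposed, 4 * n_exposed)
```

Under these assumptions the estimates come out close to the values the committee reports in Table 2-1, though not identical to them. Calling exposed_needed with ratio set to 1, 4, and 10 gives a sense of how modestly the required number of exposed subjects falls once the ratio exceeds 4:1.
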
As shown in Table 2-1, the number of subjects required to detect a difference in lung-cancer risk in the hypothesized range (RR, 1.005-1.05) is prohibitive. Detecting smaller associations between DU and health effects would require studies with even greater numbers of exposed subjects to have sufficient statistical power.

TABLE 2-1 Number of Subjects Required to Detect a Difference in Lung-Cancer Risk Due to DU Exposure

Relative Risk    No. Exposed    No. Unexposed    Total Sample
1.005              4,577,746       18,310,984      22,888,730
1.01a              1,146,418        4,585,672       5,732,090
1.025                184,376          737,504         921,880
1.05                  46,488          185,952         232,440

a Anticipated value.

The committee acknowledges the difficulty of detecting such a small relative risk in epidemiologic studies. In addition to the large sample-size requirement and the expected small effect of DU exposure on the health outcome of interest, adjustment for other potential confounding factors (that is, factors associated with both DU exposure and the health outcome of interest) is challenging. Furthermore, even if other factors are controlled for in the analysis, the question of whether any observed effects might be explained by residual confounding remains. Issues related to confounding are discussed in greater detail below.

Renal Function

The committee considered the minimal sample size required to detect a statistically significant difference in mean serum creatinine concentration, a biomarker for renal injury, between DU-exposed and -unexposed subjects. Again, the committee set the type I error at 5% and the power at 80%.

According to data obtained from the National Health and Nutrition Examination Surveys (NHANES), the mean serum creatinine concentration in the United States is about 1.0 mg/dL, with a standard deviation of approximately 0.34 (Selvin et al., 2007). The smallest change that would be considered clinically meaningful in a person would be about 0.25 mg/dL; however, a mean difference this large between exposed and unexposed groups would not be expected. Rather, the committee assumed that it would need sufficient power to detect that a larger fraction of the DU-exposed subjects than of unexposed subjects (say, 10% more) experienced a clinically significant increase (around 0.25 mg/dL) over long-term followup (20 years). Thus, the overall difference in the change in mean serum creatinine concentration between exposed and unexposed subjects would amount to about 0.025 mg/dL over a 20-year followup period.
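
The reasoning above can be sketched in a few lines of Python. The first function converts the assumed extra fraction of exposed subjects with a 0.25-mg/dL increase into a difference in means; the second estimates the exposed-group size needed to detect a given difference. The formula is an assumption, since the committee does not state its method: a two-sided, two-sample normal-approximation test of means with a common standard deviation of 0.34 mg/dL, 4 unexposed subjects per exposed subject, a type I error of 5%, and power of 80%. The function names are hypothetical.

```python
import math
from statistics import NormalDist

SD = 0.34               # standard deviation of serum creatinine, mg/dL (Selvin et al., 2007)
CLINICAL_CHANGE = 0.25  # smallest clinically meaningful individual change, mg/dL


def mean_difference(extra_fraction):
    """Difference in mean change if `extra_fraction` more of the exposed
    group experiences an increase of CLINICAL_CHANGE."""
    return extra_fraction * CLINICAL_CHANGE


def exposed_needed_means(delta, sd=SD, ratio=4.0, alpha=0.05, power=0.80):
    """Approximate number of exposed subjects needed to detect a mean
    difference `delta`, with `ratio` unexposed subjects per exposed subject.
    Assumed formula: n = (1 + 1/ratio) * (sd * (z_alpha + z_beta) / delta)^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil((1 + 1 / ratio) * (sd * z / delta) ** 2)


print(mean_difference(0.10))  # the 10% scenario described above

for delta in (0.005, 0.01, 0.025, 0.05, 0.10):
    n_exposed = exposed_needed_means(delta)
    print(delta, n_exposed, 4 * n_exposed)
```

With these inputs the sketch returns the same exposed-group sizes as the committee's Table 2-2 (for example, 1,815 exposed subjects for a difference of 0.025 mg/dL), which is consistent with, but does not confirm, the use of a formula of this kind.
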
The committee computed the minimal sample sizes required to detect that difference between DU-exposed and -unexposed groups and to detect several smaller and larger differences: differences of 0.005, 0.01, 0.025, 0.05, and 0.10 mg/dL, corresponding to 2%, 4%, 10%, 20%, and 40% more exposed than unexposed subjects having experienced an increase of 0.25 mg/dL in serum creatinine (see Table 2-2). The calculations assume that 4 times as many unexposed subjects as exposed subjects would be recruited because it is anticipated that it will be much more difficult to identify and recruit sufficient numbers of DU-exposed subjects.

TABLE 2-2 Number of Subjects Required to Detect a Difference in Serum Creatinine Concentration Due to DU Exposure

Mean Difference (mg/dL)    No. Exposed    No. Unexposed    Total Sample
0.005                           45,367          181,468         226,835
0.01                            11,342           45,368          56,710
0.025                            1,815            7,260           9,075
0.05                               454            1,816           2,270
0.10                               114              456             570

Note that for mean differences of 0.025-0.10 mg/dL, samples of around 9,000 or fewer subjects, and sometimes substantially fewer, would yield sufficient power. That might make it possible to conduct a high-quality study of the association between DU exposure and renal function, as measured by serum creatinine concentration, as long as accurate exposure assessment and outcome assessment are available. If such a study is undertaken, the committee advises the investigators to inflate the targeted sample sizes above to compensate for exposure-classification issues, attrition of subjects by dropout, other losses to followup, and deaths over the 20-year followup period. Finally, the committee cautions that differences of the indicated magnitudes have not been observed in 74 DU-exposed Gulf War veterans who have been followed clinically since 1993 (McDiarmid, 2007).

The committee also estimated the sample size required to detect an increased incidence of renal disease as defined by a serum creatinine concentration greater than 1.7 mg/dL. It considered a study in which military personnel are followed for a 20-year prospective followup after their discharge (and are assumed to be 30-39 years old at discharge). Information on the prevalence of renal disease (defined by increased serum creatinine) in age-, race-, and ethnicity-specific subgroups was based on NHANES data (Jones et al., 1998).
To compute a disease incidence expected in subjects that are not exposed to DU, the committee made the following assumptions: the increase in prevalence when men 50-59 years old are compared with men 30-39 years old estimates the 20-year incidence, the racial and ethnic composition of the cohort would be similar to the reported composition of enlisted personnel (65% white non-Hispanic, 20% black non-Hispanic, 9% Hispanic, and 6% other) (DOD, 2004), and the prevalence in subjects of "other" racial and ethnic composition can be approximated on the basis of the overall US age-specific prevalence. Using those assumptions, race- and ethnicity-specific prevalence estimates could be combined to obtain an overall estimated disease prevalence (prevalence of increased serum creatinine concentration) of 0.81% in subjects 30-39 years old and an increase to a prevalence of 3.51% in subjects 50-59 years old, which yielded an estimated 20-year cumulative incidence of 2.71%.

The committee computed the sample sizes required to have 80% power to detect an increased risk of renal disease in DU-exposed veterans (see Table 2-3). The sample size required was determined for various assumed RRs, where the RR characterizes the magnitude of the increase in incidence in exposed veterans compared with unexposed veterans. For example, a RR of 1.50 assumes that the exposed veterans would have a cumulative incidence of disease that is 1.50 times as high as that in the unexposed veterans (1.50 × 2.71% = 4.07%) over 20 years of followup. Table 2-3 presents the numbers of evaluated subjects that would be required. In practice, sample-size targets would need to be increased to factor in exposure classification, anticipated attrition of subjects by dropout, other losses to followup, and death over the 20-year followup period. As in the above examples, the committee assumed a ratio of 4:1 unexposed to exposed subjects. The calculations suggest that about 9,000 subjects would need to be enrolled to detect a RR of 1.50.

TABLE 2-3 Number of Subjects Required to Detect an Increased Risk of Renal Disease Due to DU Exposure

Relative Risk    No. Exposed    No. Unexposed    Total Sample
1.10                  37,117          148,466         185,583
1.25                   6,356           25,423          31,779
1.50                   1,760            7,038           8,798
2.00                     523            2,092           2,615

Comparison-Group Issues

Selecting an appropriate comparison group is another critical element in a study. In the present case, an ideal comparison group would include active-duty military personnel or veterans without the exposure of interest. Using an unexposed veteran or military population for the comparison group would help to reduce a type of selection bias called the healthy-warrior effect, which could otherwise lead to underestimation of the association between the exposure to DU and the outcome of interest. Military personnel are subject to this type of bias in that their expected mortality and morbidity are lower than those in the general population.

The previous IOM report (1999) notes that comparison groups should be sampled and surveyed in the same way (that is, at the same time and using the same methods) as the exposed group to maximize comparability of data obtained on all groups. The report also recommends collecting information on potential confounding factors in both the study and comparison groups, including education, sex, and other correlates of health status (for example, smoking status and alcohol intake) (IOM, 1999).

EXPOSURE ASSESSMENT

Accurate characterization of information on exposure is an essential component of an epidemiologic study.
Rothman and Greenland (1998) note that the "quality of exposure measurement will determine the validity of an environmental epidemiology study. . . . The importance of exposure assessment in environmental epidemiology cannot be overstated."

Direct measurements of exposure at the individual level are always preferable, but such data are not always available. Surrogate information may be obtained from a variety of other sources, including questionnaires designed to capture self-reported exposure information, data on specific job exposures (for example, information on job types and job-exposure matrices), and environmental monitoring.

Monitoring

Various methods are available for assessing individual exposure to agents of interest, including biomonitoring, personal monitoring, and environmental monitoring.

Biomonitoring

Biomonitoring is a method of assessing exposure to chemicals by measuring them or their metabolites in urine and blood. Biomarkers are alterations at the cellular, biochemical, or molecular level that can serve as an indicator of exposure to a chemical. They can indicate the absorbed dose or be used to provide an estimate of target-tissue dose. It is critical to understand the timeframe of exposure as reflected by biomarkers. For instance, a chemical with a short half-life might be detected in human tissue only a short time after exposure and be used to measure only recent exposure. In some cases, multiple measures of exposure can be used to develop an accurate assessment of historical exposures.

Biomonitoring is useful for measuring the body burden of DU because DU can remain in the body for long periods. Parrish and colleagues (2007) have demonstrated that urinary uranium concentration can reflect DU exposure up to 20 years after the fact.

The committee believes that collecting biomonitoring data (for example, urinary uranium concentration) is essential for assessing the burden of DU in military and veteran populations. Other approaches, such as questionnaires and review of military-activity records, are unlikely to yield as accurate an assessment of exposure because of recall bias and exposure misclassification.
Study of biomarker data is a necessary first step in characterizing DU exposure for epidemiologic studies.

Clear interpretation and communication of biomonitoring data are critical. For example, it can be difficult to communicate to a person that the finding of a statistically significant difference in mean urinary uranium concentration does not necessarily imply important clinical toxicity. Statistical significance gives an indication of the probability that observed differences between groups are due to chance. In studies that have large samples, a result can be statistically significant even if little or no clinical importance is associated with it. And clinically significant findings may be missed when samples are too small to allow statistical significance.

Practical measures related to good communication of biomarker data, as discussed in a National Research Council report (NRC, 2006), include using consistent terminology and concepts, expanding biomonitoring for relevant populations, providing communication training, and documenting methods of reducing exposures. The report also includes recommendations regarding the interpretation of biomonitoring data and research on communicating results. In addition, the 2003 Department of Defense (DOD) Health Affairs Policy 03-012 states that DOD must use "effective health risk communication tools to ensure those exposed to DU understand the exposure assessment, urine DU bioassay results, if applicable, the VA referral, and have all their questions fully answered" (DOD, 2003).

Personal Monitoring

Occupational studies often use personal monitoring to measure the radiation exposure of each worker. Personal dosimetry film badges can indicate cumulative exposure to external radiation. The badges are worn on the clothing of the employees and continuously record exposure to x-rays, gamma rays, and beta particles. The advantages of using film badges include the ability to link a specific dose to a particular exposure incident and a specific dose to an individual worker. In addition to measuring radiation exposure, other types of personal monitoring devices can be used to assess chemical exposure.

Environmental Monitoring

Environmental monitoring to assess the concentration of agents in the environment is an important tool for assessing external exposure. Environmental monitoring can be used to measure ambient air, soil, sand, and water samples. It is important to distinguish between exposure measured in the external environment and internal dose measured in human tissue. Environmental monitoring can yield an ecologic measure that can be most valuable when exposure is widespread in some geographic areas or periods under study (Rothman and Greenland, 1998).
It also can be used to determine individual exposures based on time-activity diaries and residential histories.

Using Work History to Assess Exposure

One approach to assessing exposure in occupational epidemiologic studies is to approximate individual exposure by modeling cumulative exposure on the basis of a worker's job history and the level of exposure in each worksite. Exposure to a particular agent is measured in various worksites. This information is used to model the cumulative lung dose per unit time in the worksite. Employment records are then used to determine the amount of time that each worker spent in each job and the worker's cumulative exposure in all worksites over the course of his or her employment.
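
The job-history approach just described amounts to a weighted sum: each worker's time in a worksite multiplied by that worksite's modeled exposure rate. A minimal Python sketch follows; the worksite names, exposure rates, and employment records are invented for illustration and are not drawn from any study.

```python
from collections import defaultdict

# Hypothetical modeled exposure rate for each worksite
# (arbitrary dose units per month of work).
SITE_RATE = {"machining": 4.0, "assembly": 1.5, "office": 0.25}

# Hypothetical employment records: (worker id, worksite, months worked).
RECORDS = [
    ("w1", "machining", 24),
    ("w1", "office", 36),
    ("w2", "assembly", 60),
    ("w3", "office", 120),
]


def cumulative_exposure(records, site_rate):
    """Sum (worksite rate x months) over each worker's job history."""
    totals = defaultdict(float)
    for worker, site, months in records:
        totals[worker] += site_rate[site] * months
    return dict(totals)


for worker, dose in cumulative_exposure(RECORDS, SITE_RATE).items():
    print(worker, dose)
```

Every worker in a given worksite receives the same rate in this sketch, which is the simplification the modeling approach makes.
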

The modeling approach in effect assigns to each worker the average exposure in each worksite. Compared with direct measurement, this approach loses information in that workers in a given site may vary in their exposure. Any approach that blurs the distinction between individual workers' exposure while maintaining the distinction between workers' health outcomes will reduce the variation in the sample. That biases a study toward failing to detect an association between exposure and health outcomes.

Another approach measures average environmental exposure in each worksite, as described in the preceding section, and combines worksites into a relatively small number of groups according to magnitude of exposure. In some cases, exposure is not measured; rather, the judgment of experts in classifying worksites by extent of exposure is used. However, instead of estimating cumulative exposure over all worksites, this method simply assigns a worker to the site that has the highest exposure of all the sites in which the employee worked for at least some minimal period (usually 1 month). The approach is a crude form of exposure modeling in that it reduces the variation among workers' exposure in two ways. First, it assumes that an employee spent his or her entire period of employment in one group of worksites, although the worker may have spent time in sites that varied considerably in exposure. Second, it combines sites that may vary widely in their magnitude of exposure. For those reasons, this approach is especially prone to false-negative results (failing to detect a dose-response relationship).

Other Methods of Assessing Exposure

Other methods of collecting exposure information include face-to-face or telephone interviews, questionnaires distributed by mail or electronically, and combinations of these. Depending on the type of information needed for a study, a combination may be appropriate.
When cost is a factor, a secure electronic questionnaire or survey is an appealing option. Face-to-face interviewing is generally considered a more reliable method of obtaining information because interviewers have the opportunity to observe study participants, ask followup questions, and clarify responses (IOM, 1999). Telephone interviewing and mail surveys are less expensive than face-to-face interviews but may result in higher nonresponse rates. Mail surveys, however, may encourage more honest responses.

Regardless of the method of data collection, response rate is always important to consider. There are a variety of reasons why participants may not respond; some cannot be located, others may be sick or not interested in participating, and so on. The goal is to maximize the number of participants who respond. A previous Gulf War committee (IOM, 1999) recommended two options for improving response rates: encouraging veteran participation in organizing, designing, and implementing the study; and offering incentives for study participants.

Control of Bias and Confounding in Exposure Assessment

A study should use reasonable methods to control for confounders and minimize bias in both exposure and outcome assessments. Bias refers to systematic or nonrandom error. It causes an observed value to deviate from the true value and can weaken an association, exaggerate it, or reverse its direction. All studies are susceptible to bias, so a goal of study design is to minimize it or to adjust the observed value of an association by using special methods to correct for it.

Bias related to self-reporting of exposure and bias related to exposure misclassification may compromise the results of an exposure assessment. Other common sources of information bias are the inability of study subjects to recall accurately the circumstances of their exposure (recall bias) and the likelihood that one group reports what it remembers more often than another group does (reporting bias).

Self-Reporting in Exposure Assessment

Self-reporting of exposure is a potential study limitation. A study participant's ability to recall details of exposure accurately over a long period can vary greatly and is likely to be severely limited. In addition, recall can be influenced by whether the participant has experienced adverse health outcomes. Self-reports of exposure can result in imprecise and even invalid assessments of both exposure and outcome.

Exposure Classification and Misclassification

Misclassification is defined as the "erroneous classification of an individual, a value, or an attribute into a category other than that to which it should be assigned" (Last, 1988). Exposure misclassification can occur when a study participant does not have detailed and accurate knowledge about a past exposure. In some cases, participants may not even be aware that they have been exposed. Misclassification may be differential or nondifferential.
Differential misclassification occurs when the rate of misclassification differs between the study groups, and nondifferential misclassification occurs when the rate of misclassification does not differ between study groups (Gordis, 2000). Exposure misclassification can be reduced by obtaining information on exposure from more than one source, by validating exposure classification with an external dataset, or by measuring exposure objectively, such as with biomonitoring.

OUTCOME ASSESSMENT

Accurate and comprehensive exposure information is necessary to be able to complete the second step of an epidemiologic study: assessing the relationship of health outcomes to exposure. Discussed below are factors that should be incorporated into an assessment of health effects related to DU exposure, including a focus on ascertainment of outcomes, an adequate followup period, control of critical confounders, and appropriate statistical analysis.

Ascertainment of Outcomes

There are a variety of methods for ascertaining health outcomes, including reviewing death certificates, medical records, and data from clinical examinations; linkage to disease registries; and use of self-reported outcomes. Death certificates and medical records can be useful as sources of information on health outcomes. Diagnoses are usually provided by trained health-care providers, although recording errors or misdiagnoses may occur. Death certificates are comprehensive in coverage but do not capture nonfatal adverse health outcomes.

Like self-reported exposure information, self-reported outcome information can include bias because subjects may not recall the information accurately. It is therefore important to verify self-reported information through physician diagnoses, death certificates, or disease registries.

Background Rate of Disease

To evaluate whether an increase in the number of cases of a particular disease is related to exposure to DU, it is necessary to have information about the background rate of the disease in the population not exposed to DU. The background rate of a disease can be used to determine whether there is an "excess" of cases or the rate is what would be expected in the population. A rate of disease in the study population that is in excess of the background rate may indicate an increased risk related to exposure to DU.

Adequate Followup Period

An adequate followup period is necessary to allow sufficient time after exposure for relevant health outcomes to occur. That is particularly true for outcomes that have long latent periods, as do most cancers and renal diseases.
Biologic latency of cancer should be given consideration because there is a delay between exposure to a carcinogen and the onset of disease.

Control of Bias

As discussed above, a well-designed study should use appropriate methods to minimize bias. All studies are susceptible to bias, so the design of the outcome assessment should minimize bias or adjust the observed value of an association by using appropriate methods to correct for it. Information bias and bias related to self-reporting of health outcomes may compromise the results of an outcome assessment.

Information bias can result from the method with which data are collected and ultimately can cause measurement errors and imprecision. Information bias can also result from misclassification of study subjects with respect to the outcome variable. It can lead to misinterpretation of study results when it affects one comparison group more than another.

Controlling for Confounders and Data Analysis

As mentioned above, an observation of association between DU exposure and any health outcome of interest could be confounded by other factors that are directly related to both the likelihood of exposure to DU and the outcome. For example, military personnel who serve in roles that pose an increased risk of DU exposure may be more likely to be smokers, and smoking has a well-known causal association with lung cancer. In principle, the influence of confounding factors on the assessment of the relationship between DU exposure and health outcomes can be overcome to some extent through study designs and data-analysis schemes that address such three-way relationships.

One approach to controlling for the influence of confounding factors is a study design that uses algorithms for matching study subjects according to the confounding factors, for example, ensuring that the DU-exposed and -unexposed study groups have nearly identical patterns of smoking. That technique can work well if there are only a few potential confounders of interest, but it becomes infeasible when there are many confounders. Alternatively, multivariable data-analysis techniques (such as stratified analysis and regression modeling) can effectively control for confounding by multiple factors simultaneously (for example, age, smoking, and sex).
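
To illustrate stratified analysis, the Python sketch below compares a crude RR with a Mantel-Haenszel summary RR computed across smoking strata. The counts are hypothetical and were chosen so that DU exposure has no effect within either stratum while exposed subjects are more often smokers. The Mantel-Haenszel estimator is used here as one common choice for stratified cohort data, not necessarily the method a particular study would adopt.

```python
# Hypothetical cohort counts for each smoking stratum:
# (exposed cases, exposed total, unexposed cases, unexposed total)
STRATA = {
    "smokers": (80, 800, 20, 200),      # risk is 10% in both exposure groups
    "nonsmokers": (4, 200, 16, 800),    # risk is 2% in both exposure groups
}


def crude_rr(strata):
    """RR from the pooled table, ignoring the stratifying factor."""
    cases_exp = sum(s[0] for s in strata.values())
    total_exp = sum(s[1] for s in strata.values())
    cases_unexp = sum(s[2] for s in strata.values())
    total_unexp = sum(s[3] for s in strata.values())
    return (cases_exp / total_exp) / (cases_unexp / total_unexp)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary RR for stratified cohort (cumulative-incidence) data."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
    return numerator / denominator


print(round(crude_rr(STRATA), 2))
print(round(mantel_haenszel_rr(STRATA), 2))
```

In this constructed example the crude RR is about 2.3 although the RR within each stratum is 1.0; the stratified summary recovers 1.0, showing how an imbalance in smoking alone can produce an apparent association.
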
However, the effectiveness of such methods depends in part on the ability to ascertain exposures to potential confounding factors with accuracy and precision. Furthermore, the challenge of controlling for the confounding factors rises in settings where the association of interest (health outcomes of DU exposure) is small and the relationship between the confounder and either DU exposure or the health outcome is large.

Appropriate data analyses should incorporate methods for reducing bias and typically begin with careful examination of the distributions of the analytic variables, including exposures, health outcomes, and confounders. Statistical analyses should always be chosen to make the best use of all the available data, reduce potential sources of bias as much as possible, and closely reflect the aims and design of the study.

Before the implementation of analytic techniques, such as statistical modeling and stratification, to control for confounding factors, the potential for interactions between DU exposure and other factors should be considered. Interaction effects are different from confounding effects. Their hallmark is a substantial difference in the magnitude of an association between DU exposure and a health outcome in subgroups; for example, a DU exposure-lung cancer relationship may differ between smokers and nonsmokers owing to biologic synergy between the interacting factor (smoking) and DU with respect to their joint effect on an outcome (lung cancer). When interactions are detected, effects must be estimated and reported separately by subgroup (for example, in smokers and in nonsmokers) to assess the relationship between DU exposure and the health outcome validly. Such analyses would mandate an even larger total sample and expand the challenge of conducting such research, given the large number of subjects needed even in the absence of interactions.

ASSESSING THE STRENGTH OF THE EVIDENCE

A well-designed epidemiologic study that includes accurate exposure assessment and outcome assessment can be used to evaluate exposure to DU and reach conclusions about the potential for adverse health outcomes. Previous IOM reports, including Gulf War and Health, Volume 1: Depleted Uranium, Pyridostigmine Bromide, Sarin, Vaccines (IOM, 2000), have noted the following general considerations for assessing the strength of evidence patterned after those introduced by Hill (1965): strength of association, dose-response relationship, consistency of association, temporal relationship, specificity of association, and biologic plausibility.

• Strength of association is typically expressed as the magnitude of the measure of effect (for example, a RR). The stronger the association, the less likely the relationship is due to confounding.

• A dose-response relationship is observed when study findings indicate a greater health effect or response after greater exposure.
The steeper the dose-response relationship, the more confidence one can have that the association is real; however, the absence of such a relationship does not necessarily indicate that there is no possibility of an association.
• A consistent association is observed when the magnitude and direction are similar among several studies that include different populations, locations, and times. A larger number of studies that have the same results constitute a greater indication that there is a consistent association.
• A temporal relationship, evidence that exposure occurred before the outcome, is necessary to determine causality.
• Specificity is defined as the unique association between exposure to a specific agent and a specific health outcome; that is, the health outcome does not arise in the absence of exposure to the agent. Specificity of association is somewhat rare given the multifactorial etiology of many health outcomes and the broad spectrum of health outcomes that may occur after exposure.
• Biologic plausibility reflects knowledge of the biologic mechanism by which an agent can lead to a health outcome. That knowledge comes through a variety of studies (typically animal studies), including studies that assess mechanisms of action and studies in pharmacology, toxicology, microbiology, physiology, and other fields. Biologic plausibility is often difficult to establish, and it might not be known when an association is first documented.

EPIDEMIOLOGIC STUDY DESIGNS

Three major types of epidemiologic studies are cohort, case-control, and cross-sectional studies (study designs are discussed in more detail in IOM, 2000).

A cohort, or longitudinal, study follows a defined group over time. It can test hypotheses about whether an exposure to a specific agent is related to the development of a health effect and can examine multiple health effects that may be associated with exposure to a given agent. A cohort study starts by classifying study participants according to whether they have been exposed to the agent under study, in this case DU. A cohort study compares health effects in people who have been exposed with those in people who have not been exposed. Such a comparison can be used to estimate a risk difference or a RR, two summary measures of association.

In a case-control study, case subjects (cases) are selected from among people in the population who have experienced a specific health effect, and controls are selected from among those in the population who have not experienced the health effect. Prior exposure of both cases and controls to specific agents is assessed. Such characteristics as age, sex, and socioeconomic status are recorded to permit control of their potential confounding influence in the analysis of the exposure-disease relationship of interest.
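The summary measures used by these designs can be illustrated with a small numerical sketch. It computes a risk difference and a RR from cohort counts and an exposure odds ratio from case-control counts. All counts and function names are hypothetical and serve only to show the arithmetic.

```python
def risk_difference(exposed_cases: int, exposed_total: int,
                    unexposed_cases: int, unexposed_total: int) -> float:
    """Risk in the exposed minus risk in the unexposed (cohort data)."""
    return exposed_cases / exposed_total - unexposed_cases / unexposed_total


def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Risk in the exposed divided by risk in the unexposed (cohort data)."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def odds_ratio(cases_exposed: int, cases_unexposed: int,
               controls_exposed: int, controls_unexposed: int) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed / cases_unexposed) / (controls_exposed / controls_unexposed)


# Hypothetical cohort: 30 cases among 1,000 exposed, 10 among 1,000 unexposed.
cohort_rd = risk_difference(30, 1000, 10, 1000)
cohort_rr = relative_risk(30, 1000, 10, 1000)

# Hypothetical case-control study: 100 cases (30 exposed) and
# 100 controls (15 exposed).
case_control_or = odds_ratio(30, 70, 15, 85)

if __name__ == "__main__":
    print(f"risk difference: {cohort_rd:.3f}")
    print(f"RR:              {cohort_rr:.2f}")
    print(f"OR:              {case_control_or:.2f}")
```

A cohort study yields risks directly, so both the risk difference and the RR are available; a case-control study fixes the numbers of cases and controls by design, so only the odds of exposure can be compared.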
The odds of exposure to an agent among the cases are then compared with the odds of exposure among controls to generate an odds ratio (OR), which is a summary measure that is typically interpreted as the RR of a health effect in those exposed to the agent compared with those not exposed.

In a cross-sectional study, exposure information and health-effects information are collected at the same time. The selection of people for the study, unlike selection for cohort and case-control studies, is independent of exposure to the agent and of health-effects characteristics. Cross-sectional studies seek to uncover associations between exposure to a specific agent and development of a health effect. In a cross-sectional study, effect size is measured as RR, prevalence ratio, or prevalence OR.

SUMMARY

This chapter has discussed the elements of an epidemiologic study that are essential in assessing the relationship between exposure to DU and health outcomes. They include identification of a relevant study population of adequate size; appropriate assessment of uranium exposure in the population, including the use of biomarkers, when available; evaluation of health outcomes; adequate followup time; use of reasonable methods for controlling for confounders and minimizing bias; and appropriate statistical analysis. The committee estimated the minimal sample size required to detect statistically significant differences in risk of lung cancer and in renal malfunction between DU-exposed and -unexposed samples.

REFERENCES

ACS (American Cancer Society). 2008. Cancer facts and figures. Atlanta, GA: American Cancer Society.
DOD (Department of Defense). 2003. HA policy 03-012: Policy for OIF DU medical management. Washington, DC: Department of Defense.
DOD. 2004. Population representation in the military services. http://www.defenselink.mil/prhome/poprep2000/html/appendixb/b_25.htm (accessed October 13, 2007).
Gordis, L. 2000. Epidemiology. Vol. II. Philadelphia, PA: W.B. Saunders Company.
Hill, A. B. 1965. The environment and disease: Association or causation? Proceedings of the Royal Society of Medicine 58:295-300.
IOM (Institute of Medicine). 1999. Gulf War veterans: Measuring health. Washington, DC: National Academy Press.
IOM. 2000. Gulf War and health, volume 1: Depleted uranium, sarin, pyridostigmine bromide, vaccines. Washington, DC: National Academy Press.
IOM. 2008. Gulf War and health: Updated literature review of depleted uranium. Washington, DC: The National Academies Press.
Jones, C. A., G. M. McQuillan, J. W. Kusek, M. S. Eberhardt, W. H. Herman, J. Coresh, M. Salive, C. P. Jones, and L. Y. Agodoa. 1998. Serum creatinine levels in the U.S. population: Third national health and nutrition examination survey. American Journal of Kidney Diseases 32(6):992-999.
Last, J. M. 1988. A dictionary of epidemiology. 2nd ed. New York: Oxford University Press.
McDiarmid, M. 2007 (unpublished). Depleted uranium (DU) follow-up program update. Presentation to the IOM committee on Gulf War and health: Depleted uranium. University of Maryland.
NRC (National Research Council). 2006. Human biomonitoring for environmental chemicals. Washington, DC: The National Academies Press.
Parrish, R. R., M. Horstwood, J. G. Arnason, S. Chenery, T. Brewer, N. S. Lloyd, and D. O. Carpenter. 2007. Depleted uranium contamination by inhalation exposure and its detection after approximately 20 years: Implications for human health assessment. Science of the Total Environment 390(1):58-68.
Rothman, K. J., and S. Greenland. 1998. Modern epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven.
Selvin, E., J. Manzi, L. A. Stevens, F. Van Lente, D. A. Lacher, A. S. Levey, and J. Coresh. 2007. Calibration of serum creatinine in the National Health and Nutrition Examination Surveys (NHANES) 1988-1994, 1999-2004. American Journal of Kidney Diseases 50(6):918-926.


