Methodologic Considerations in Evaluating the Evidence
The committee has undertaken the task of summarizing the strength of the scientific evidence concerning the association between herbicide exposure during Vietnam service and each of a set of diseases or conditions suspected to be associated with such exposure. For each disease, the committee has determined, to the extent that available scientific data permit meaningful determinations,
• whether a statistical association with herbicide exposure exists, taking into account the strength of the scientific evidence and the appropriateness of the statistical and epidemiologic methods used to detect the association;
• the increased risk of each disease in question among those exposed to herbicides during Vietnam service; and
• whether there exists a plausible biologic mechanism or other evidence of a causal relationship between herbicide exposure and the disease in question.
The committee was not provided, by the legislation establishing it, with a specific list of diseases and conditions suspected to be associated with herbicide exposure. The committee staff and members developed such a list based on the diseases and conditions that had been mentioned in the scientific literature or in legal documents that came to their attention through extensive literature searches, as described in Appendix A.
The judgments made by the committee have both quantitative and qualitative aspects, and they reflect the evidence examined and the approach taken to
evaluate it. In this chapter, the committee describes more fully how it has approached its task, in the hope that readers may then be in a better position to assess and interpret the committee's findings. By offering this information, the committee wishes to make the report useful to those who may seek to update its conclusions as new information is obtained. This chapter outlines the specific questions posed by the committee, the types of evidence it identified, its approaches to evaluating reports both singly and collectively, and the nature of the conclusions it felt that logic and evidence permitted. Against this background, details of the analysis and specific conclusions concerning each health effect appear in subsequent chapters. This chapter is based on a similar description in an Institute of Medicine report Adverse Effects of Pertussis and Rubella Vaccines (IOM, 1991), adapted to the current task.
Attributes of the diseases being considered, as well as the population exposed to herbicides, influenced the committee's analysis. The diseases can be characterized, for example, by their frequency, by the specificity of their symptoms, and by prior knowledge of their etiology and pathogenesis. Diseases, such as non-Hodgkin's lymphoma, that occur only rarely in exposed persons are more difficult to study than those that occur more frequently. Conditions such as soft tissue sarcoma that are ill defined, birth defects that are known to occur in the absence of herbicide exposure, or conditions that generally have unknown causes or mechanisms of development are also inherently difficult to investigate.
When the actual intensity or duration of exposure to a potential disease-causing agent is difficult to measure, as is generally true for herbicide exposure in Vietnam, comparisons between presumably exposed and presumably nonexposed persons become clouded. This is due to the misclassification of truly exposed individuals as unexposed or, more likely, the misclassification of truly unexposed people as exposed. For example, some studies compare veterans with experience in Vietnam to veterans who served during the same period of time, but not in Vietnam (Vietnam era veterans). If such a classification system is used as a surrogate for exposure to herbicide, it is likely that a substantial number of those presumed to be exposed on the basis of Vietnam service had either minimal or no actual exposure. The committee deemed the issue of exposure measurement to be so important that a separate chapter of this report is devoted to that topic. A section of Chapter 6 also addresses the assessment of exposure in epidemiologic studies.
It was because of the uncertain validity of exposure measurements in many of the studies of veterans that the committee decided to review studies of other groups potentially exposed to the herbicides contained in Agent Orange, to other herbicides, or to dioxin, the contaminant presumed by some to be the actual cause of the purported adverse effects of Agent Orange. These other groups include industrial and agricultural workers, Vietnamese
citizens, and people exposed to environmental sources as a result of residing near the site of an industrial accident. The committee felt that considering studies of other groups would help address the issue of whether these compounds could be associated with particular health outcomes in veterans, although that would have only an indirect bearing on the question of association in veterans themselves.
In any epidemiologic study comparing an exposed to an unexposed group, it is likely that characteristics other than exposure may differ between the two groups. For example, the group exposed to herbicide in an industrial study might have a higher or lower prevalence of cigarette smokers than the unexposed group. When the groups differ with respect to factors that are also associated with the risk of the outcome of interest, a simple comparison of the groups may either exaggerate or hide the true difference in disease rates that is due to the exposure of interest. In the example of higher prevalence of smoking in the exposed workers, a simple comparison of lung cancer rates in the exposed and unexposed would exaggerate an apparent difference in lung cancer rates, since smoking is known to cause lung cancer. If the exposed workers had a lower prevalence of smoking, the simple comparison would tend to mask any true association between exposure and lung cancer by spuriously elevating the risk of the disease in the unexposed group. This phenomenon, known as confounding, represented another major challenge to the committee.
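The distortion that confounding produces can be made concrete with a purely hypothetical numerical sketch (all rates and group sizes below are invented for illustration). Here herbicide exposure has no true effect on lung cancer at all, yet the differing prevalence of smoking between the groups creates a spurious crude association:

```python
# Hypothetical illustration of confounding: herbicide exposure has NO true
# effect on lung cancer here, but smoking (a cause of lung cancer) is more
# common in the exposed group, so the crude comparison is distorted.

# Annual lung cancer rates per 100,000, by smoking status (invented numbers).
RATE_SMOKER, RATE_NONSMOKER = 150.0, 10.0

def crude_rate(n_smokers, n_nonsmokers):
    """Overall disease rate per 100,000 in a group of smokers plus nonsmokers."""
    cases = n_smokers * RATE_SMOKER + n_nonsmokers * RATE_NONSMOKER
    return cases / (n_smokers + n_nonsmokers)

# Exposed workers: 60% smokers.  Unexposed workers: 30% smokers.
exposed_rate = crude_rate(6000, 4000)    # 94.0 per 100,000
unexposed_rate = crude_rate(3000, 7000)  # 52.0 per 100,000

# The crude rate ratio suggests a harmful "effect" of exposure...
print(f"crude rate ratio: {exposed_rate / unexposed_rate:.2f}")  # about 1.81

# ...but within each smoking stratum the ratio is exactly 1.0:
print(RATE_SMOKER / RATE_SMOKER, RATE_NONSMOKER / RATE_NONSMOKER)
```

Stratifying on the confounder (smoking) recovers the true null association that the crude comparison obscures.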
QUESTIONS TO BE ADDRESSED
What would it mean to say that exposure to herbicides is associated with one or another type of health effect? It would not mean that exposure invariably produces the disease or adverse health outcome, or that all cases of the disease were due to the herbicide. Such complete correspondence between exposure and disease is by far the exception in public health and does not occur in this context, or the present review would not be required.
In the present review, the committee has been concerned with two kinds of questions about associations. The first is whether, in general, exposure to herbicides is statistically associated with the specified adverse condition. For example, is exposure to the herbicide 2,4-D associated with an increased incidence of soft tissue sarcoma (STS)? If the conclusion is affirmative, a second question becomes pertinent: Assuming that exposure to herbicides is associated with an outcome in some group, what is the association among Vietnam veterans? That is, if an association is found to exist among occupationally exposed individuals, is the association also found, or likely to be found, in Vietnam veterans? Discussion of each of these questions can help to clarify the committee's view of its task. A third question, relating to the biologic plausibility that
the outcome in question was caused by the herbicide exposure, is dealt with through an examination of the toxicological literature, as reviewed in Chapter 4.
Are Herbicides Statistically Associated with the Health Outcome?
The work of the committee necessarily focused on a pragmatic question: What is the nature of the evidence relevant to drawing its conclusion about statistical association? In pursuing this question, the committee recognized that an absolute conclusion about the absence of association may never be attained. As in science generally, studies of health outcomes following herbicide exposure are not capable of demonstrating that the purported effect is impossible or could never occur. Any instrument of observation has a limit to its resolving power, and this is true of epidemiologic studies as well. Hence, in a strict technical sense, the committee could not prove the absence of any possibility of a health outcome associated with herbicide exposure. Nevertheless, for some outcomes there was no evidence consistent with an association and there was limited or suggestive evidence consistent with no association; for these, the committee was able to conclude, within the limits of the resolving power of the existing studies, that there is no association with herbicide exposure.
The evidentiary base that the committee found to be most helpful derived from epidemiologic studies of populations, that is, investigations in which large groups of people are studied to determine the association between the occurrence of particular diseases and exposure to the substances at issue. To determine whether an association exists, epidemiologists estimate the magnitude of an appropriate quantitative measure (such as the relative risk or the odds ratio) that describes the joint occurrence of exposures and diseases in defined populations or groups. Usage of "relative risk," "odds ratio," or "estimate of relative risk" is not consistent in the literature reviewed and cited in this report. In its own usage, the committee intends "relative risk" to refer to the results of cohort studies, and "estimate of relative risk" or "odds ratio" to refer to the results of case-control studies (see Glossary for definitions). Values of relative risk greater than 1 may indicate a positive, or direct (harmful), association and are emphasized in the discussion in this chapter; values between 0 and 1 may indicate a negative, or inverse (protective), association. Statistical significance, that is, whether the increased risk is sufficiently greater than 1 to exclude the possibility that the apparent effect is due to chance, must also be considered.
Formally, in planning an investigation, an epidemiologist poses a hypothesis to the effect that the exposures and health outcomes under study are not associated. Under this hypothesis, the value of the measure of association used is theoretically expected to be approximately 1. This is
termed the null hypothesis or the hypothesis of no association. The measure of association derived from the investigation is then tested statistically. To "reject the null hypothesis," or to conclude that exposures and events are not independent, is to conclude that there is evidence of an association.
When more than one epidemiologic study has been conducted, it may be instructive to combine their results so as to reach a stronger conclusion than a single study can provide. This process, termed meta-analysis, is described more fully later in this chapter.
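One common approach to such pooling is fixed-effect (inverse-variance) weighting of the study-specific estimates. The following sketch uses entirely hypothetical study results; the method, not the numbers, is the point:

```python
import math

# Fixed-effect (inverse-variance) meta-analysis of relative risks from
# several studies.  The study values below are invented for illustration.
def pool_relative_risks(studies):
    """studies: list of (relative_risk, standard_error_of_log_rr) pairs.
    Returns the pooled relative risk and its approximate 95% confidence interval."""
    weights = [1.0 / se**2 for _, se in studies]
    log_rrs = [math.log(rr) for rr, _ in studies]
    pooled_log = sum(w * lr for w, lr in zip(weights, log_rrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    lo = math.exp(pooled_log - 1.96 * pooled_se)
    hi = math.exp(pooled_log + 1.96 * pooled_se)
    return math.exp(pooled_log), (lo, hi)

# Three hypothetical studies, each given as (RR, SE of log RR).
rr, (lo, hi) = pool_relative_risks([(1.8, 0.40), (1.3, 0.25), (1.5, 0.30)])
print(f"pooled RR = {rr:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

Note that a pooled interval can exclude 1 even when no single study's interval does, which is precisely how combining studies strengthens a conclusion.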
Determining whether an observed exposure-disease association is "real" requires additional scrutiny. This is because there may be alternative explanations, other than exposure, for the observed association. These include errors in the design, conduct, or analysis of the investigation; bias, or a systematic tendency to distort the measure of association from representing the true relation between exposures and outcomes; confounding, or distortion of the measure of association because another factor, related to both exposures and outcomes, has not been recognized or taken into account in the analysis; and chance, the effect of random variation in producing observations that can, in reality, only be approximations to the truth and can, with a known probability, sometimes depart widely from the truth.
In deciding whether associations between herbicides and particular outcomes exist, then, it has been the committee's task to judge in each instance whether there is evidence of an association from the available studies and, if so, whether it is direct or inverse, and whether it is due to error, bias, confounding, or chance or is, instead, most likely due to a true association between herbicides and outcome.
What Is the Increased Risk of the Disease in Question Among Those Exposed to Herbicides in Vietnam?
The second question, which becomes pertinent principally (but not exclusively) if the answer to the first question is affirmative, concerns the likely magnitude of the exposure-disease association in Vietnam veterans exposed to herbicides. The most desirable evidence as a basis for answering this type of question involves knowledge of the rate of occurrence of the disease in those Vietnam veterans who were actually exposed to herbicides, the rate in those who were not exposed (the "background" rate of the disease in the population of Vietnam veterans), and the degree to which any other differences between exposed and unexposed groups of veterans influence the difference in rates. When those Vietnam veterans who are actually exposed have not been identified properly, as has generally been the case in existing studies, this question becomes difficult to answer. By considering the magnitude of the association observed in other cohorts, the quality and results of existing studies of veterans related to a particular outcome, and
other principles of epidemiologic research discussed below, the committee formulated a qualitative judgment regarding the second question.
BURDEN OF PROOF
In approaching its task, the committee considered the concept of "burden of proof" and its place in such an evaluation. This concept implies that one position or another concerning association is presumed to be true unless it is offset by evidence to the contrary. The prior position might be either affirmative or negative. That is, it may be assumed that an exposure is harmful unless sufficient evidence of safety is present; alternatively, it may be assumed that an exposure is safe unless convincing evidence of harmful effects is present. In either case, it is sometimes argued that a burden of proof must be fulfilled before the presumed position is rejected. In general, it is desirable to avoid making an error in either direction—concluding either that there is or that there is not an association when the opposite is true. Reducing the chance of such mistaken conclusions depends on careful assessment of the evidence, including consideration of possible errors, bias, and confounding.
The role of chance in leading to erroneous conclusions as a result of random variation in sampling or in other respects is customarily handled through formal statistical analyses, which are based on assumptions from probability theory. Statistical measures can suggest the likelihood that conclusions as to the presence or absence of an association will each be in error. In general, a result is said to have greater statistical significance as the probability of error in accepting an association becomes smaller. More technically, the level of statistical significance is the probability of observing by chance at least as great a difference as that observed between an "experimental" (exposed) and a "control" (unexposed) group, if the risk of the disease were, in truth, identical in the two groups. The likelihood that a true association will be correctly detected in an investigation is a statistical property of the investigation, termed its power. Both statistical significance and power reflect the role of chance in scientific observations and the concomitant uncertainty in all scientific conclusions. One obvious implication of this understanding is that the concept of "proof" in its commonsense meaning is not strictly applicable to scientific observations. Even when scientists conclude that an experiment demonstrates ("proves") an association, they know there is a small, known probability that the conclusion is incorrect.
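The interplay of significance level, effect size, sample size, and power can be sketched with a standard normal approximation for comparing two proportions. All numbers below are illustrative only, and the approximation is a sketch, not a substitute for exact methods:

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def power_two_proportions(p0, relative_risk, n_per_group):
    """Approximate power of a two-sided 5%-level test comparing disease
    proportions p0 (unexposed) and p1 = relative_risk * p0 (exposed),
    with equal group sizes, using the usual normal approximation."""
    p1 = relative_risk * p0
    p_bar = (p0 + p1) / 2
    z_alpha = 1.96  # two-sided 5% significance level
    se_null = math.sqrt(2 * p_bar * (1 - p_bar) / n_per_group)  # SE under H0
    se_alt = math.sqrt((p0*(1-p0) + p1*(1-p1)) / n_per_group)   # SE under H1
    z = (abs(p1 - p0) - z_alpha * se_null) / se_alt
    return normal_cdf(z)

# A rare outcome (baseline risk 1 per 1,000) and a true relative risk of 2:
# even 5,000 subjects per group give only modest power (about 0.25).
print(f"{power_two_proportions(0.001, 2.0, 5000):.2f}")
```

The example shows why studies of rare outcomes, such as many of the cancers considered in this report, can easily fail to detect a genuine doubling of risk.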
Consideration of the role of chance should not be made in a vacuum. When one is addressing the issues of statistical significance and power, there is an underlying assumption that a study is free from both bias and confounding. To the extent that these assumptions may not be true in a
given situation, conclusions made on the basis of statistical considerations alone need to be tempered.
The committee began its evaluation presuming neither the existence nor the absence of association. It has sought to characterize and weigh the strengths and limitations of the available evidence. Subsequent chapters of the report summarize the evidence concerning each herbicide-health outcome relation under review and present the committee's conclusions. If the first question (Is there an association between herbicides, dioxin, or related compounds and a disease?) was answered affirmatively, and if available information permitted, the second question (If a relationship is assumed in any group of individuals, how likely is it that a relationship exists in Vietnam veterans?) was answered. The committee's task was not to judge individual cases of particular diseases or conditions, although, as described elsewhere in this report, the committee did learn of individual cases through public hearings and written testimony.
It should be noted that the committee's charge was neither to focus on questions of causation nor to address broader policy issues. Topics such as the potential costs of compensation for veterans afflicted with particular illnesses and policy regarding such compensation are not considered in this report. In addition, the committee makes no recommendations regarding individual cases. The report does provide scientific information for the Secretary of Veterans Affairs to consider in making determinations about compensation, but those decisions remain the responsibility of the Secretary. With this orientation to the committee's task and approach in mind, the following sections discuss the characteristics of the types of evidence that bear on the questions of association at hand.
CATEGORIES OF EVIDENCE
Experiments in Humans: Randomized Controlled Trials
Theoretically, the ideal method for assessment of causal relations (and thereby associations) between treatments and health outcomes is the randomized controlled trial because, when appropriate and feasible, it is the most scientifically rigorous method for testing such hypotheses. Randomized controlled trials are experiments in which subjects are randomly allocated, often in a masked fashion, into "treatment" and "control" groups, to receive or not to receive an intervention such as, in the present context, exposure to herbicides or dioxin; the control group receives an exposure to an inert substance (placebo) or an established alternative exposure; and both groups are followed in a strictly comparable manner to determine the relative frequencies of outcomes and diseases of interest. Although they are theoretically ideal, such trials are clearly not ethical or relevant in the case
of a potentially harmful exposure. Such experiments have not been done and thus they are not considered further.
Experiments in Animals: Animal Models
In principle, experimental studies in animals allow for both rigid control over herbicide or dioxin exposure and intensive observation of any health effect that may follow. If an animal model is to be considered valid for the study of a human disease, however, the manifestations of the disease should be similar in the two species. The starting point is generally what is currently known about the human disease. With respect to evaluation of dioxin in particular, and herbicides in general, the committee found a vast body of potentially relevant and somewhat controversial literature, which is discussed in Chapter 4. These animal studies helped the committee address one of its charges (i.e., the evaluation of biologic plausibility).
Controlled Epidemiologic Studies (Observational)
In contrast to randomized controlled trials and other experimental studies in humans, most epidemiologic investigations are "observational." This means simply that the occurrences of herbicide exposure and particular diseases are studied as they arise in the usual course of life and not under the conditions of a planned experiment.
Observational studies in populations are often "controlled," however, through various strategies of formal comparative investigation. For example, the experience of health outcomes in a group after exposure to dioxin can be compared with that in an unexposed control group. Alternatively, the prior dioxin exposure history of a group that has developed soft tissue sarcoma (STS) can be compared with that of a group free of this condition (unaffected control group). In these two strategies, the experience of the control or comparison group provides an estimate of the frequency either of disease in the absence of exposure or of exposure in the absence of the disease, as experienced in the general population. Although the contribution of the control group in such studies may seem analogous to that of the placebo group in a controlled trial, the analogy is a weak one. The lack of random assignment to "treatment" groups in observational (nonexperimental) studies makes the interpretation of such studies vastly more difficult than the interpretation of randomized clinical trials.
The most relevant types of such controlled, observational studies for the present review and their main characteristics are described in this section. Examples of studies related to herbicide or dioxin exposure and health outcomes serve for illustration.
Cohort studies track groups who are defined by common characteristics—including their exposure status, for example, Vietnam veteran or Vietnam era veteran—at the starting point of observation. The rates of occurrence of health outcomes are compared between these groups over time. All study participants are known or presumed to be free of the diseases under investigation at the start of the study. In the well-designed cohort study, reliable estimates of absolute disease rates in each group can be obtained. Especially for uncommon outcomes, such as cancer at specific sites (e.g., STS), large samples of participants, prolonged periods of observation, or both are required (Last, 1988). Such studies can provide evidence that bears on the first association question discussed earlier in this chapter. By dividing the rate in the exposed group by the rate in the nonexposed group, a measure of association termed the relative risk is derived, which provides a measure of the strength or magnitude of the association between an exposure and an outcome.
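A minimal sketch of this computation, with invented cohort counts and an approximate (Katz log-scale) confidence interval, looks like this:

```python
import math

# Relative risk from a hypothetical cohort study (all counts invented):
# the disease rate in the exposed divided by the rate in the unexposed.
def relative_risk(cases_exp, n_exp, cases_unexp, n_unexp):
    rate_exp = cases_exp / n_exp
    rate_unexp = cases_unexp / n_unexp
    rr = rate_exp / rate_unexp
    # Approximate 95% confidence interval on the log scale (Katz method).
    se_log = math.sqrt(1/cases_exp - 1/n_exp + 1/cases_unexp - 1/n_unexp)
    lo = math.exp(math.log(rr) - 1.96 * se_log)
    hi = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, (lo, hi)

# 30 cases among 2,000 exposed versus 20 cases among 4,000 unexposed.
rr, (lo, hi) = relative_risk(30, 2000, 20, 4000)
print(f"RR = {rr:.1f}, 95% CI ({lo:.1f}, {hi:.1f})")  # RR = 3.0
```

A relative risk of 1 indicates no association; the confidence interval conveys how precisely the study pins that ratio down.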
The starting point of the investigation can be either contemporaneous or in the past. In the first case, termed concurrent cohort studies, all observations, including both exposures and health outcomes, may be subject to direct observation by the investigator. In the second case, which typically depends on the availability of records of past exposures and health outcomes, the entire study may relate to experience prior to the start of the investigation. Such studies are termed historical or retrospective cohort studies. One potential limitation of these studies is the incompleteness of records relating to the exposure of interest. Some features are common to both types of cohort studies, and others are distinct in accordance with their different temporal strategies.
Occasionally in cohort studies, especially in studies of occupational exposures, the investigators do not believe that a suitable unexposed cohort is available for comparison with the occupational cohort presumed to be exposed to a particular chemical or industrial process. In such cases, the comparison disease rates are usually drawn from a large external population, such as the population of males of a certain age in the United States. Using a technique termed "indirect adjustment" by epidemiologists, investigators calculate the number of cases of a particular disease that would be expected to occur in the observed cohort if that "exposed" cohort experienced the same disease rates as a matched group drawn from the comparison "unexposed" population. The expected number of cases is generally derived using age-, sex-, and calendar-year-specific rates of disease in the external population. The ratio of the actual observed number of cases to the expected number of cases is termed the standardized mortality (or morbidity) ratio, or SMR. The ratio is often, by convention, multiplied by 100 so
that a cohort in which the observed and expected numbers of cases are exactly equal would have an SMR of 100. An SMR greater than 100 would indicate an elevated risk of the disease in the exposed cohort. The committee has chosen to part with convention on the issue of the presentation of SMRs. In order to present the SMRs as measures of relative risk, the ratios have not been multiplied by 100 in this report.
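A minimal sketch of indirect adjustment, using invented person-years and reference rates, with the SMR reported on the relative risk scale as in this report:

```python
# Indirect adjustment: expected cases in the exposed cohort are computed by
# applying the comparison population's age-specific rates to the cohort's
# own age structure; SMR = observed / expected.  All numbers are hypothetical.

# (person-years in cohort, reference death rate per person-year) by age stratum
strata = [
    (4000, 0.0005),   # ages 30-39
    (3000, 0.0020),   # ages 40-49
    (1000, 0.0080),   # ages 50-59
]
observed_deaths = 25

expected = sum(py * rate for py, rate in strata)  # 2.0 + 6.0 + 8.0 = 16.0
smr = observed_deaths / expected
print(f"expected = {expected:.1f}, SMR = {smr:.2f}")  # SMR about 1.56
```

Here the cohort experienced 25 deaths where only 16 would be expected at the reference population's rates, an SMR of 1.56 (or 156 under the multiply-by-100 convention the committee has set aside).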
A source of bias that is often a problem in studies with external comparison groups has been termed the "healthy worker effect" (Checkoway et al., 1989). The general population giving rise to the comparison rates contains people who are too ill to be members of the work force. Since exposed workers, by definition, have to have been healthy enough to be in the work force, they as a group may be healthier than the general population of the same age and sex. This bias would tend to dampen associations between exposure and disease. The bias is often especially evident when examining the SMR for chronic diseases such as heart disease or diabetes, but it is observed for cancers as well. The result of the healthy worker effect is to yield an underestimate of the true magnitude of association between exposure and disease. It is likely that the healthy worker effect applies to studies in which veterans are compared to nonveterans as well, and this possibility needs to be considered in interpreting some studies. The potential for this type of bias is far greater for studies of mortality than it is for studies of incidence. The healthy worker effect, like other biases, cannot be removed in the analysis of data but may be avoided with an appropriate study design. The comparison group should be selected to be similar to the study group in socioeconomic and employment status in order to avoid the healthy worker effect. Strictly speaking, the healthy worker effect is not limited to studies with external comparison groups—studies with internal comparison groups may be affected as well (Breslow and Day, 1987).
As an example of a historical cohort study of mortality (which, incidentally, is being continued as a prospective cohort study), the Air Force identified all members of Operation Ranch Hand (the "Ranch Hands")—the personnel involved in aerial herbicide spray operations in Vietnam during the period 1962-1971. The comparison group consisted of all Air Force veterans who concurrently were assigned to a variety of C-130 aircraft cargo units throughout Southeast Asia during the same period, but who were not occupationally exposed to herbicides (Michalek et al., 1990). This particular comparison group was chosen because of its large numbers and its similar training and psychological background to the Ranch Hands. It seems reasonable to imagine that health status at the time of enlistment was comparable in these two groups, so the healthy worker effect is not likely to introduce serious bias. There were 1,261 Ranch Hands who were compared
with 19,080 Air Force veterans, experiencing 91 and 1,241 deaths, respectively, as of December 31, 1989 (AFHS, 1991).
In the statistical analysis of the Ranch Hand mortality study, the investigators based results on the cumulative mortality [i.e., all deaths as of the end of 1989 (AFHS, 1991)]. The measure of increased risk used in this study was the SMR. The rates in the comparison group were used to generate the numbers of deaths that would be expected in the Ranch Hand group if the rates for the Ranch Hands were the same as those in the comparison group. This analysis should have been done by using actual rates and relative risks. Instead, the SMR was used because in the words of their report, it "appropriately treats the comparison population death rates as fixed rather than as unknown parameters. …" In other words, the sampling variability in the comparison group was ignored. The SMRs calculated were adjusted for differences between the groups in age and calendar year of observation, rank (officer versus enlisted), and occupation (flyer versus nonflyer).
The attempt to investigate STS in the study of Michalek and colleagues (1990) illustrates another feature of cohort studies generally: the fact that especially rare conditions may not be detectable within the limits of sample size and duration that characterize many such studies. The average annual age-adjusted incidence rate of STS for white males, cited by Kang and colleagues (1987), is 3.8 per 100,000. Thus, in a given year, slightly fewer than four cases of STS would be expected to be diagnosed for every 100,000 people. Even if the nearly 20-year period of observation for some individuals in the Ranch Hand study is considered, not even one case of STS would be expected among the 1,261 Ranch Hands. Because the outcomes in question are generally rare, the case-comparison or case-control design has more often been used in the investigation of health outcomes considered in this report.
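The arithmetic behind this expectation is straightforward. Even treating each of the 1,261 Ranch Hands as contributing a full 20 years of observation (a deliberate overstatement) still yields less than one expected case:

```python
# Rough expected number of STS cases among the 1,261 Ranch Hands, using the
# incidence rate of 3.8 per 100,000 person-years cited by Kang et al. (1987).
# Treating every subject as observed for a full 20 years overstates the
# follow-up, yet the expected count remains below one case.
rate = 3.8 / 100_000            # cases per person-year
person_years = 1261 * 20        # generous upper bound on follow-up
expected_cases = rate * person_years
print(f"{expected_cases:.2f}")  # about 0.96 expected cases
```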
Despite the rarity of many of the cancers considered by the committee, in a group of people as large as the group who served in Vietnam, some cases of rare diseases would be expected to occur even if herbicides used there had no deleterious effects. For this reason, in the sections of this report covering the basic epidemiology of various cancers, the committee has presented a calculation of the approximate number of cases of each disease that might occur in Vietnam veterans. Numerous assumptions are necessarily involved in such a computation, given the limitations of the available data. These assumptions, the limitations of the data, and the methods used for performing the calculations are explained in Chapter 8.
Another potential weakness of cohort studies is demonstrated by the study of Michalek and colleagues (1990). It is generally accepted that carcinogenesis can require a long interval, known as the latency period, between exposure to even a known carcinogen and the appearance of clinically apparent disease. Thus, for some cancers, exposure to the cancer-causing agent might not result in the appearance of disease for 5 to 20 years, or longer, depending on the disease. Conversely, cases of a particular cancer occurring in the cohort shortly after exposure are unlikely to be due to that exposure. This consideration is important in the evaluation of the earlier studies of veterans, which may not have allowed sufficient latency for the development of disease.
With respect to the issue of latency in the Ranch Hand study, consider a hypothetical subject. A Ranch Hand exposed to Agent Orange in 1969 at age 25 would be 45 years old in 1989, 20 years after his exposure. This person would still be relatively young for STS to develop and might not have had enough time for an exposure-related cancer to appear. As another example, consider a possible association between exposure and prostate cancer. The rates for that disease are so low, given the relative youth of the Vietnam veterans' cohort (with respect to the age-specific rate of occurrence of prostate cancer), that there is virtually no possibility of observing an effect at this time.
Proportionate Mortality Studies
In some cohort studies, investigators have no accurate data on the composition of the cohort, but they do have access to sets of death records. The proportion of deaths due to each cause in a particular cohort is available, but not the actual mortality rates. This situation often leads to the conduct of a proportionate mortality study, in which comparisons are made between the proportions of deaths due to particular causes in the study cohort and in a presumably comparable cohort or in the general population. Occupational studies that utilize this comparison method often obtain information on causes of mortality from death certificates.
In analytical epidemiology, proportionate mortality studies are generally considered most valuable in the initial stages of an evaluation (Breslow and Day, 1987). They provide an inexpensive and rapid way of taking an early look at a set of data. Results of proportionate mortality studies must be interpreted carefully, since a proportionate excess can reflect either an excess in the absolute rate for the disease in question or a deficit in the absolute rates for some of the other causes. Large proportionate excesses, however, are unlikely to be produced in that way. Although a proportionate mortality study is usually considered a type of retrospective cohort study, it is convenient to consider the proportionate mortality study as equivalent to a case-control study in which the cases have died from the cause of interest and the controls are selected from deaths from all other causes (Breslow and Day, 1987; Miettinen and Wang, 1981).
The basic strategy in a proportionate mortality study is to compare the proportion of deaths due to a particular cause with the corresponding proportion in a (usually large) reference group. The ratio of the cause-specific proportion of deaths in the observed cohort to the proportion of deaths from that same cause in the reference population is known as the proportionate mortality ratio, or PMR. Since cause-specific death rates, and thus the proportions of deaths for different causes, are known to depend on age, the expected number of deaths from a particular cause is usually calculated on an age-specific basis and then summed across ages (Checkoway et al., 1989).
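The age-standardized calculation described above can be sketched in a few lines of code. All counts and reference proportions below are hypothetical, invented purely to illustrate the arithmetic:

```python
# Hypothetical PMR calculation; every number here is invented.
# Observed deaths in the study cohort, by age group.
observed = {
    "40-49": {"all_causes": 120, "cause_of_interest": 10},
    "50-59": {"all_causes": 300, "cause_of_interest": 35},
}

# Proportion of deaths due to the cause of interest in the
# reference population, by age group.
reference_proportion = {"40-49": 0.06, "50-59": 0.10}

observed_deaths = sum(g["cause_of_interest"] for g in observed.values())

# Expected deaths: apply each age group's reference proportion to the
# cohort's all-cause deaths in that group, then sum across ages.
expected_deaths = sum(
    observed[age]["all_causes"] * reference_proportion[age] for age in observed
)

pmr = observed_deaths / expected_deaths
print(f"observed={observed_deaths}, expected={expected_deaths:.1f}, PMR={pmr:.2f}")
```

With these invented numbers, 45 deaths are observed against 37.2 expected, giving a PMR of about 1.21, that is, a 21 percent proportionate excess.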
A major potential source of bias in proportionate mortality studies is the fact that some other causes of death may also be affected by the exposure, thus reducing the strength of the association between exposure and the cause of interest. It has been suggested, therefore, that one should exclude from the analysis causes of death that are also related to the exposure in question. If a chemical is thought to cause lung cancer, for example, lung cancer deaths should be excluded from a proportionate mortality study of the association between that chemical and other cancers. An extreme example would be an exposure that elevates the risk of every cause of death. In such a situation, none of the proportionate mortality ratios might appear unusual, although all absolute mortality rates might be elevated relative to an unexposed group. The most appropriate analysis of proportionate mortality data is to treat them as though they represent a case-control study in which the subjects who died of the cause of interest are the cases and those who died from other causes are the controls (Breslow and Day, 1987; Miettinen and Wang, 1981). The usual analytic methods for case-control studies (i.e., estimation of odds ratios) would then be applied.
An example of a PMR study is provided by the study by Bullman and colleagues (1990) of Army Vietnam veterans who served in Military Region I. This group, known as the I Corps, was believed for various reasons to have a relatively high potential for exposure to herbicides. Proportionate mortality in the I Corps was compared with the experience of 27,917 Army Vietnam era veterans who served in the military between 1965 and 1973 and had died as of the end of 1984. The principal finding was an excess of deaths from motor vehicle accidents. The study was analyzed by using proportionate mortality ratios, not as a case-control study. There was no excess mortality from either non-Hodgkin's lymphoma (NHL) or STS; however, there were only 10 deaths from STS compared with 11.4 expected. If the methods suggested above were applied, veterans who died of NHL and STS would be compared to the others who died to see if the NHL and STS cases were more likely to have been in I Corps. The question of latency is addressed somewhat in the analysis of this study, but there were very few expected deaths from NHL and STS in the longest latency (16+ years) category.
Case-Control Studies
Controlled epidemiologic studies in which the subjects are selected on the basis of their disease status (e.g., with or without STS) and investigated to determine their prior histories of exposure are termed case-control studies. Unlike cohort studies, case-control studies do not provide direct estimates of health outcome rates in the study groups, because the groups are defined by the presence or absence of disease and the cases are not ordinarily drawn from a defined population in a way that permits calculation of disease rates. Moreover, investigators generally fix the ratio of controls to cases by design, so disease rates cannot be calculated directly from the study data.
Instead, the result of the case-control study is expressed as the ratio of the odds of having been exposed as a member of the case group versus the odds of having been exposed as a member of the control or comparison group. When the cases and controls are defined and selected properly, the odds ratio may be thought of as an estimate of relative risk. When the disease of interest is uncommon, which is true for many of the diseases suspected of being related to herbicide exposure, the odds ratio will generally provide a reasonable estimate of the relative risk (Breslow and Day, 1980). As such, the odds ratio is a measure of association that can contribute to answering whether exposure is associated with the disease under consideration. If there is no association between exposure status and the disease, the expected odds ratio is 1. If, on the other hand, the risk of the disease is higher in the exposed group, even though the risk cannot be observed directly, the expected odds ratio is greater than 1. Because this strategy of investigation begins with the identification of cases, such as those in existing hospital or other medical records, it is not dependent, as is the cohort study, on the gradual accumulation of sufficient numbers of rare cases for analysis. Therefore, results can often be obtained in much less time and at a lower cost than they can with the cohort approach.
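The odds ratio computation can be illustrated with a hypothetical 2x2 table; every count below is invented for demonstration:

```python
# Hypothetical case-control data (all counts invented):
#                 exposed   unexposed
#   cases           a=40       b=60
#   controls       c=100      d=300
a, b = 40, 60     # exposed and unexposed cases
c, d = 100, 300   # exposed and unexposed controls

odds_cases = a / b       # odds of exposure among cases
odds_controls = c / d    # odds of exposure among controls
odds_ratio = odds_cases / odds_controls  # equivalently (a*d)/(b*c)

print(f"odds ratio = {odds_ratio:.2f}")  # 2.00 for these counts
```

Because (a/b)/(c/d) equals (a*d)/(b*c), the odds ratio is often called the cross-product ratio. When the disease is rare, as noted above, this value closely approximates the relative risk that a cohort study would estimate directly.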
As an example, consider the Selected Cancers Study (SCS) performed by the Centers for Disease Control (CDC, 1990a-c). In its study of non-Hodgkin's lymphoma, cases were defined as all men first diagnosed with the disease between December 1, 1984, and November 30, 1988. The cases lived in the geographic regions covered by population-based cancer registries for five metropolitan areas. Eligibility was restricted to individuals born between 1929 and 1953, who would have been eligible for service in Vietnam (ages 15-39 years in 1968).
In the Selected Cancers Study, the primary exposure variable was defined as military service in Vietnam or off the coast of Vietnam. The ''exposed" group is thus likely to contain many individuals who were not actually exposed to herbicides. The interpretation of a finding of no elevated risk for STS, for example, is therefore problematic. On the other
hand, an elevated odds ratio for NHL, for example, could be a result of some form of bias (see below) and might really represent no underlying elevation of risk due to herbicides or, in this case, due to military service. If the Selected Cancers Study were otherwise free of bias, an elevated odds ratio could theoretically represent an underestimate of the true odds ratio, if the only major problem were the misclassification of exposure status.
A key concern in interpreting the results of case-control studies such as the SCS is the potential for bias in the selection of cases and controls. Evaluation of an association depends on having valid estimates of the exposure frequencies in both the case and the comparison groups. An inappropriate case or control group may seriously bias the odds ratio. Several other important types of bias relate to the information about exposure collected from cases and controls. These sources of bias are discussed in Chapter 6 with exposure assessment issues. It is helpful to note, at this point, that these types of bias may exist in all populations, not just in veterans. Furthermore, bias is a result of a problem with the study design and not the study participants.
Case Reports and Case Series
The medical literature frequently contains reports of an individual or groups of individuals who have experienced a particular health problem and have also been exposed to some substance in the environment that is thought (by the exposed individual or his or her physician) to be responsible for the health problem in question. Information of this sort is also frequently available in individuals' health care records. Individual medical records have been presented to the committee, as were summaries of large numbers of individual cases. Thousands of such individual reports are available to the Department of Veterans Affairs (formerly the Veterans Administration) in the form of claims for compensation.
In some circumstances, such case reports can provide valuable information about an association between an environmental exposure and a particular health outcome. This is most likely when (1) the health outcome in question is both unusual and relatively specific for the exposure in question (i.e., the health outcome is unlikely to be the result of other exposures or causes); and (2) there is a close temporal relationship between the exposure and the health outcome (i.e., the outcome becomes apparent soon after, and not before, the exposure) (Kramer and Lane, 1992). Neither of these conditions is true for outcomes reported to the committee. Most of the outcomes thought to be associated with herbicide exposure did not become apparent until years or decades after the veteran returned from Vietnam. Most are relatively common outcomes in a cohort of individuals who are now generally 40 years of age or older. Although this does not mean that the outcomes
are not caused by herbicides, cases of this sort offer no positive evidence for a statistical association and were not considered by the committee except in special instances. In many of these reports, a physician has stated that the disease was caused by Agent Orange. The committee believes that such a determination is usually not possible to make in individual cases, and that statements of this sort reflect judgments based in part on the epidemiologic evidence that is the major subject of this report. Thus, since the purpose of this report is to evaluate the epidemiologic evidence, including data from individual cases would be circular reasoning.
Case series, that is, compilations of the range of diseases experienced by a group of individuals who report such problems to a central collection point, offer more promise for detecting an association between an exposure and a disease, but again only in special circumstances. In particular, if a group of individuals exposed to herbicides was diagnosed with a unique pattern of diseases and symptoms (common to all in the group, but in a combination that is rare in others not exposed to the herbicide), this might be taken as evidence of an association or causal relationship. However, the case series of which the committee is aware (see Appendix B) exhibit a wide range of health outcomes that are not at all specific to the group. If no unique pattern emerges, it must be shown that some outcomes are more common in the exposed group than in a comparable nonexposed group. In other words, an epidemiologic study of some type discussed above is necessary. Reports of health outcomes in veterans that are adequately compared to an appropriate control group have been considered by the committee, but case series without an appropriate comparison group were not regarded by the committee as having evidentiary value.
Information from Death Certificates
Data used in many studies involving mortality experience are obtained from death certificates; these studies tend to be conducted among occupational groups, and medical records are not always reviewed. Death certificates can provide occupation as well as cause of death information; the occupation as listed is used by investigators as a surrogate for potential exposures under investigation. However, there are methodologic concerns that need to be considered when dealing with death certificates. If occupation is obtained from the death certificate, the job listed may be the decedent's final job or usual job. When occupation is used as a surrogate for potential exposures, the usual or final job listed may not account for other jobs that involve the exposure of interest, especially if the possible exposure was years or decades before the time of death. Also, next of kin or family members may not be aware of the usual job held by the decedent. In most cases, "exposure" is underestimated if based on death certificate
information, and the magnitude of the association under consideration is attenuated.
Errors made when death certificates are initially completed may also lead to over- or underascertainment of the causes of interest. This is especially true for cancers that are difficult to diagnose and classify, such as STS (Sinks, 1993). The accuracy of cause of death data from death certificates also depends on a number of factors, such as whether an autopsy has been performed, the age of the deceased, whether the physician filling in the death certificate knows the deceased, the recording physician's attention to detail, avoidance of reporting a cause of death if "stigma" is attached, incorrect coding of disease, and variations in the quality of medical care in regions or over time. If death occurred by accidental means, underlying disease may or may not be evident and recorded. Thus, among a cohort of exposed or potentially exposed individuals, the effect of underascertainment of disease is most likely to bias the magnitude of the association under investigation toward the null.
Integration of Collective Results
Inferences concerning association are commonly made on the basis of epidemiologic and related biomedical evidence. Many policy decisions and practical actions are based on such inferences made by persons with widely varied backgrounds—professional and nonprofessional, technical and nontechnical. The process of reaching such conclusions is ordinarily personal and often private. By contrast, when this process is conducted formally in the manner of the present review, it is collective and interactive. As indicated earlier in this chapter, it is also desirable that reasoning be made explicit, in order that others may be enabled to evaluate the committee's conclusions independently.
Two aspects of the integration of evidence used by the committee are explained here. First is the quantitative approach of meta-analysis, whose principles and methods are discussed briefly below. A more complete discussion of the application of meta-analytic techniques to the evaluation of epidemiologic questions has recently been published (Dickersin and Berlin, 1992). In those instances in which it was deemed appropriate, this approach was applied and the results are presented in the corresponding section below. The quantitative meta-analyses were viewed by the committee only as supporting the more qualitative conclusions drawn from the second aspect. No decisions as to the adequacy of evidence favoring or not favoring a positive association were made solely on the basis of a quantitative meta-analytic result. Nevertheless, the fact that, in several instances, the meta-analysis was consistent with the qualitative conclusion served to reinforce that conclusion.
The second aspect is the partially quantitative, but largely qualitative, process based on what is sometimes termed "causal inference," for which a general approach has long been recognized in the epidemiologic literature. Although the committee's primary charge was not the strict evaluation of causality, it was felt that considerations applied to causal inference were relevant to evaluation of the strength of the scientific evidence. This is also discussed below, to indicate the committee's view of that approach and its application to the present evaluation.
A review of the major studies on which this report is based suggests that the sample sizes of many studies are insufficient to detect important excess risks. As can be seen in Chapter 8, for instance, a number of the studies on cancer describe so few cases that large relative risks cannot be ruled out.
When a number of sufficiently similar studies of the same health outcome are available, it is sometimes possible to pool statistical information from the studies to develop an estimate of the relative risk, or odds ratio, of the outcome in question that is more precise than estimates from individual studies. An important consideration in the decision as to whether a meta-analysis is appropriate is the degree of similarity of the component studies with respect to design features. Studies with vastly different definitions or levels of exposure might not be considered similar enough to be combined. On the other hand, even studies that appear to be designed similarly can often produce vastly different estimates of the relative risk. In this setting, meta-analysis can be used to explain the differences among study results on the basis of possibly subtle features of design, study populations, or statistical analysis.
The committee found a high degree of variability among epidemiologic studies with regard to the nature and extent of exposure to herbicides and dioxin, the specific health outcomes studied, and study design. After considering the studies available, the committee judged that data adequate for a formal meta-analysis were available only for several small groups of studies on particular cancers. The degree of variability (heterogeneity) of relative risks or odds ratios was tested statistically in all instances, and pooled estimates were obtained as weighted combinations of the individual estimates using the method developed by DerSimonian and Laird (1986).
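The DerSimonian-Laird procedure can be sketched as follows. The study estimates and standard errors below are hypothetical, and this is a minimal illustration of the method rather than a reproduction of the committee's actual analyses:

```python
# Sketch of DerSimonian-Laird random-effects pooling of log relative
# risks. The (estimate, standard error) pairs are invented.
import math

studies = [(0.10, 0.15), (0.70, 0.20), (0.00, 0.25), (0.50, 0.12)]

y = [est for est, _ in studies]
w = [1 / se**2 for _, se in studies]  # fixed-effect (inverse-variance) weights

# Cochran's Q statistic measures heterogeneity among the study estimates.
fixed_mean = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, y))

# DerSimonian-Laird moment estimate of the between-study variance tau^2.
k = len(studies)
c = sum(w) - sum(wi**2 for wi in w) / sum(w)
tau2 = max(0.0, (q - (k - 1)) / c)

# Random-effects weights incorporate tau^2; the pooled estimate is a
# weighted combination of the individual estimates.
w_re = [1 / (se**2 + tau2) for _, se in studies]
pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
se_pooled = math.sqrt(1 / sum(w_re))

print(f"Q = {q:.2f} on {k - 1} df, tau^2 = {tau2:.4f}")
print(f"pooled RR = {math.exp(pooled):.2f}, 95% CI "
      f"({math.exp(pooled - 1.96 * se_pooled):.2f}, "
      f"{math.exp(pooled + 1.96 * se_pooled):.2f})")
```

When Q exceeds its degrees of freedom, the estimated between-study variance tau^2 is positive, the random-effects weights become more nearly equal across studies, and the pooled confidence interval widens relative to a fixed-effect analysis, reflecting the observed heterogeneity.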
Considerations in Assessing the Strength of Scientific Evidence
For each health outcome for which evidence indicated the presence of an association with herbicides or a related exposure, the committee assessed the applicability of each of six general considerations, patterned after those
attributed to Hill (1971). These criteria were originally proposed in the context of causal inference in chronic disease epidemiology. Strictly speaking, assessing causality was not the charge of this committee. However, in a more general sense, the criteria helped the committee evaluate the strength of the scientific evidence for or against associations between disease and herbicide exposure. Reflecting epidemiologic thought that had evolved over many years, Hill proposed the following guidelines for judgment: strength of association, dose-response relations, temporally correct association, consistency of association, specificity of association, and biologic plausibility. These considerations were applied, where possible, to aid in interpreting the evidence both for and against an association.
Three of these considerations (strength of association, dose-response relation, and temporally correct association) can be applied to the findings of single studies and can therefore be regarded, in part, as measures of internal validity of the study design. Any of these considerations can be satisfied in some, but not necessarily all, studies testing a particular causal hypothesis. The other three considerations (consistency of association, specificity of association, and biologic plausibility) are not necessarily study specific and depend to varying degrees on prior knowledge.
Strength of Association
Strength of association is usually expressed in epidemiologic studies as the magnitude of the measure of effect, for example, relative risk or odds ratio. Generally, the higher the relative risk, the greater is the likelihood that the exposure-disease association is causal or, in other words, the less likely it is to be due to undetected error, bias, or confounding. Measures of statistical significance such as p-values are not indicators of the strength of association. Small increases in relative risk that are consistent across a number of studies, however, may also provide evidence of an association (see "Consistency of Association," below).
Dose-Response Relation
The existence of a dose-response relation—that is, an increased strength of association with increasing intensity or duration of exposure or other appropriate relation—strengthens an inference that an association is real. Conversely, the lack of an apparent dose-response relation does not rule out an association, as in the case of a threshold level of exposure beyond which the relative risk of disease remains constant and highly elevated. If the relative degree of exposure among several studies can be determined, indirect evidence of a dose-response relation may exist. For example, if studies of presumably low-exposure cohorts show only mild elevations in risk, whereas
studies of presumably high-exposure cohorts show more extreme elevations in risk, such a pattern would be consistent with a dose-response relation.
Temporally Correct Association
If an observed association is real, exposure must precede the onset of the disease by at least the duration of disease induction. The committee, in addition, considered whether the disease occurred within a time interval following herbicide exposure that was consistent with current understanding of its natural history. The committee interpreted the lack of an appropriate time sequence as evidence against association, but recognized that insufficient knowledge about the natural history and pathogenesis of many of the diseases under review limited the utility of this consideration.
Consistency of Association
Consistency of association requires that an association be found regularly in a variety of studies, for example, in more than one study population and with different study methods. The committee considered findings consistent across different categories of studies as being supportive of an association. Note that the committee did not interpret "consistency" to mean that one should expect to see exactly the same magnitude of association in different populations. Rather, consistency of a positive association was taken to mean that the results of most studies were positive and that the differences in measured effects were within the range expected on the basis of all types of error including sampling, selection bias, misclassification, confounding, and differences in actual exposure levels.
Specificity of Association
Specificity of association is the degree to which a given exposure predicts the frequency or magnitude of a particular outcome; if the association between the exposure and the health outcome is unique to both, a positive finding seems more strongly justified than when the association is nonspecific to both the exposure and the health outcome. The committee recognized, however, that perfect specificity could not be expected given the multifactorial etiology of many of the diseases under examination. In addition, the committee recognized the possibility that herbicides (or, more specifically, dioxin) might be associated with a broad spectrum of diseases.
Biologic Plausibility
Biologic plausibility is based on whether a possible causal association fits existing biologic or medical knowledge. Chapter 4 lays out the basic
scientific and animal evidence that the committee used to assess the biologic plausibility of an association. The existence of a possible mechanism, such as established carcinogenic potential based on animal studies (whether at the same tumor site as in humans or not), or known capacity to affect DNA, was thought to increase the likelihood that the exposure-disease association in a particular study reflects a true association. In addition, the committee considered factors such as (1) evidence in humans of an association between the exposure in question and diseases known to have similar causal mechanisms as the one in question; (2) evidence that certain outcomes (usually particular types of cancer) are commonly associated with occupational or environmental chemical exposures; and (3) knowledge of routes of exposure, storage in the body, and excretion that would suggest that some organs rather than others might be affected. Given the limitations of existing biological or medical knowledge, however, lack of specific biologic support for a given health outcome did not rule out a conclusion of sufficient evidence.
As noted above, it is important also to consider whether alternative explanations—error, bias, confounding, or chance—might account for the finding of an association. If an association could be sufficiently explained by one or more of these alternate considerations, there would be no need to invoke the several considerations listed above. Because these alternative explanations can rarely be excluded sufficiently, however, assessment of the applicable considerations listed above almost invariably remains appropriate. The final judgment is then a balance between the strength of support for the association and the degree of exclusion of alternatives.
The Role of Studies of Occupational and Environmental Exposures
Because the issues surrounding the measurement of actual exposure to herbicides in Vietnam are so complex, the committee also evaluated studies of environmental and occupational exposures. In many cases, the classification of exposure status in such studies was thought to be valid and well documented or assessed, particularly relative to such classifications in the studies of veterans. The improved assessment of exposure would lead to less bias. In addition, the actual levels of exposure to dioxin or related compounds in the occupational and environmental studies were generally higher than the exposure levels in many veterans and, in fact, were high in absolute terms, as well. Higher exposure levels would lead to an increased ability of occupational and environmental studies to detect effects of exposure. Primarily for these reasons, occupational and environmental studies
were taken under serious consideration in addressing the question of whether the compounds in question are associated with specific diseases.
NATURE OF THE CONCLUSIONS
This chapter has demonstrated that judgments about the possible association between health outcomes and exposure to herbicides, or related dioxin compounds, reflect both quantitative and qualitative reasoning. Some final observations will help to clarify the nature of the committee's conclusions.
Resolution refers to the fineness or sharpness of detail that can be discriminated by a particular mode of observation. In light microscopy, for example, observations are described by reference to the optical properties of the lens, such as 10×, 100×, or higher magnification. Electron microscopy, with very much higher resolution, distinguishes structural features not detectable with light microscopy.
Resolution in epidemiologic studies concerns the capacity of a study to discriminate the frequencies of health outcomes or exposures between groups in order to determine the presence or absence of associations. By analogy, resolution in epidemiology also depends in a sense on magnification, that is, on the order of magnitude of the numbers of participants—for example, from tens to hundreds of cases and controls in case-control studies, and from hundreds to thousands of exposed and unexposed subjects in cohort studies. With equally valid observations, results based on the experience of increasing numbers of persons, from single individuals to tens, hundreds, or thousands of individuals, provide successively greater resolution. Because of differences in the intrinsic nature of their designs, case-control studies generally require far fewer subjects than cohort studies do, for an equivalent degree of resolution. However, the principle still holds that increasing numbers of people, within a given type of study design, provide increasing resolution.
The resolution or discriminating capacity of epidemiologic studies could theoretically be increased indefinitely through ever larger study populations. However, there are many constraints on the feasibility of large studies. Rarity of exposures or events, or other circumstances, may limit the resolution even of large studies. For example, NHL and STS are both very rare, and workers showing increased rates of these diseases are likely to have been exposed to large amounts of dioxin. Such high exposures are not
common. Meta-analysis can, under the appropriate circumstances discussed above, be used to offset the limited size of individual studies, but the collective magnitude of the contributing studies may still be lower than desired. It should be emphasized that in all such studies the potential for bias is a key problem, and that enlarging the study, or combining the results of several studies in a formal meta-analysis, reduces only random error, not systematic error. Therefore, if bias is present, a firmer but still erroneous conclusion will result from a larger study (or meta-analysis) than from a smaller one.
Power calculations indicate the probability of achieving discrimination of a predetermined degree under the design of a given study. Power is thus a quantitative measure of the capacity of a study to achieve a given degree of resolution. In particular, a power calculation provides guidance against overconfidence in the absence of an association when a study with relatively low power has failed to demonstrate one. As such, power calculations help reviewers appreciate the nature of the evidence about association.
As discussed earlier in this chapter, two types of error must be taken into account in designing and interpreting statistical tests. Epidemiologic studies are often designed to provide statistical tests that minimize type I error, the probability that the null hypothesis of no association is falsely rejected. Commonly, such tests are designed so that there is less than a 5 percent chance that the test will incorrectly indicate an association between a chemical exposure and a disease if no association truly exists. On the other hand, for any given test and sample size, there is some chance that the test will err in failing to find an association when one truly exists. This is called a type II error. The chance of making such an error increases when both the true excess risk and the sample size are small. From another perspective, given a particular sample size and a specified probability of a type I error, one can calculate the power of a test to detect an assumed association of a given magnitude. Because the power of a test is the opposite (technically, the complement) of the probability of making a type II error, the power of a test increases when both the true excess risk and the sample size are large.
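The interplay of excess risk and sample size can be sketched numerically. The function below uses one common normal approximation for comparing two proportions (several textbook variants exist), and the disease rates are hypothetical:

```python
# Approximate power of a two-group comparison of disease proportions,
# using a simple normal approximation; inputs are hypothetical.
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal cumulative distribution via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def approx_power(p0, p1, n_per_group, z_alpha=1.96):
    """Power to detect proportion p1 vs p0, two-sided 5% type I error."""
    se = sqrt(p0 * (1 - p0) / n_per_group + p1 * (1 - p1) / n_per_group)
    z = abs(p1 - p0) / se            # standardized difference
    return norm_cdf(z - z_alpha)     # P(reject H0 | H1 true)

# Rare disease: 1 per 1,000 unexposed versus a doubled risk in the exposed.
for n in (1_000, 10_000, 100_000):
    print(n, round(approx_power(0.001, 0.002, n), 3))
```

Under this approximation, even a doubling of risk for a disease occurring in 1 per 1,000 unexposed subjects yields power below 10 percent with 1,000 subjects per group; on the order of 100,000 subjects per group are needed before the type II error becomes negligible.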
For example, in CDC's Selected Cancers Study (1990b), the lack of a significant statistical association between soft tissue sarcoma and the Vietnam experience may reflect the small sample size (310 cases) rather than a true absence of association. In other words, if this investigation were replicated with more cases of STS than occurred in the CDC study, a statistically significant difference might be detected if there truly was an association.
Power calculations are also valuable in interpreting apparently conflicting results of multiple studies of the same exposure-disease combination. If findings of no association were concentrated in the low-power studies, for instance, the suggestion that no association exists would be weakened.
Uncertainty and Confidence
All science is characterized by uncertainty. Scientific conclusions concerning the result of a particular analysis or set of analyses can range from highly uncertain to highly confident. As discussed earlier in this chapter, the theoretical concept of proof does not apply in evaluating actual observations. In its review, the committee attempted to assess the degree of uncertainty associated with the results on which it had to base its conclusions.
For individual studies, confidence intervals around estimated results such as relative risks represent a quantitative measure of uncertainty. Confidence intervals present a range of results that, with a predetermined level of certainty, is consistent with the observed data. The confidence interval, in other words, presents a statistically plausible range of possible values for the true relative risk. When it is possible to use meta-analysis to combine the results of different studies, a combined estimate of the relative risk and confidence interval may be obtained.
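These two ideas can be sketched in a short computation: a 95 percent confidence interval for a relative risk from a single cohort study, and an inverse-variance (fixed-effect) pooled estimate across several studies. All counts and study results below are invented for illustration; they do not come from the literature reviewed here, and a random-effects method such as that of DerSimonian and Laird (1986) would be preferred when study results are heterogeneous:

```python
import math

Z95 = 1.96  # two-sided 95 percent critical value of the standard normal

def relative_risk_ci(cases_exp, n_exp, cases_unexp, n_unexp):
    """Relative risk and 95% CI via the log-RR normal approximation."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se_log = math.sqrt(1/cases_exp - 1/n_exp + 1/cases_unexp - 1/n_unexp)
    log_rr = math.log(rr)
    return rr, math.exp(log_rr - Z95 * se_log), math.exp(log_rr + Z95 * se_log)

def pooled_rr(studies):
    """Fixed-effect (inverse-variance) pooled RR from (rr, se_log) pairs."""
    weights = [1 / se**2 for _, se in studies]
    log_pooled = (sum(w * math.log(rr) for (rr, _), w in zip(studies, weights))
                  / sum(weights))
    se_pooled = math.sqrt(1 / sum(weights))
    return (math.exp(log_pooled),
            math.exp(log_pooled - Z95 * se_pooled),
            math.exp(log_pooled + Z95 * se_pooled))

# Hypothetical cohort: 30/1,000 exposed cases versus 15/1,000 unexposed
rr, lo, hi = relative_risk_ci(30, 1000, 15, 1000)

# Pool three hypothetical studies, each given as (relative risk, SE of log RR)
combined, c_lo, c_hi = pooled_rr([(2.0, 0.31), (1.5, 0.25), (1.8, 0.40)])
```

In this hypothetical example the single-study interval, roughly 1.1 to 3.7 around a relative risk of 2.0, excludes 1.0, and the pooled interval is narrower than any single study's, illustrating how combining studies reduces uncertainty when their results are compatible.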
For an overall judgment about an association between an exposure and a disease based on a whole body of evidence, beyond the results of single studies or of meta-analyses, no quantitative method exists to characterize the uncertainty of the conclusions. Thus, to assess the appropriate level of confidence to be placed in the ultimate conclusions, it may be useful to consider qualitative as well as quantitative aspects.
Qualitative Aspects of the Review Process
An important aspect of the quality of a review such as the present one is comprehensiveness—to ensure against the possibility of any serious omission or inappropriate exclusion of evidence from consideration. If any such omission should be identified, a determination would be needed of whether its inclusion would likely affect the overall results and, if so, in what way.
In this report, the committee has documented in detail its approach to seeking and identifying the evidence to be reviewed (see Appendices A and B). Numerous parties were invited to supplement the materials already under review and to notify the committee of any recognized omissions of importance.
The phenomenon known as publication bias was also of concern to the committee. It has been well documented (Begg and Berlin, 1989; Berlin et al., 1989; Dickersin, 1990; Easterbrook et al., 1991; Dickersin et al., 1992) in biomedical research that studies with a statistically significant finding are more likely to be published than studies with nonsignificant results. Thus, evaluations of disease-exposure associations based solely on published literature could be biased in favor of showing a positive association. Interestingly, this bias seems to be generated by the potential authors of papers, rather than by the editors of journals (i.e., it is the authors who choose not to submit papers). In two studies (Dickersin, 1990; Easterbrook et al., 1991), the likelihood of publication seemed to depend on the author's perception of the importance of the work, and this perception was related to the statistical significance of the findings. The committee did not consider the risk of publication bias to be high among studies of herbicide exposure and health risks for several reasons:
there were numerous published studies showing no positive association,
the committee did examine a fair amount of unpublished material, and
the committee felt that the publicity surrounding the issue of exposure to herbicides, particularly regarding Vietnam veterans, has been so intense that any studies showing no association would be unlikely to be viewed as unimportant by the investigators. In short, the pressure to publish such "negative" findings would be extreme.
To ensure a fair weighing of all the evidence, neutrality is another important consideration in the quality of conclusions such as those presented by the committee. In this connection, the committee avoided the posture of the burden of proof approach, as discussed earlier in this chapter. The essential evidence, its main strengths and limitations, and the conclusions that follow are stated for each health outcome considered.
The evaluation of evidence to reach conclusions about statistical associations goes beyond quantitative procedures, at several stages: assessing the relevance and validity of individual reports; deciding on the possible influence of error, bias, or confounding on the reported results; integrating the overall evidence, within and across diverse areas of research; and formulating the conclusions themselves. These aspects of the review required thoughtful consideration of alternative approaches at several points. They could not be accomplished by adherence to a prescribed formula.
Rather, the approach described here evolved throughout the process of review and was determined in important respects by the nature of the evidence, exposures, and health outcomes at issue. Both the quantitative and the qualitative aspects of the process that could be made explicit were important to the overall review. Ultimately, the conclusions expressed in this report about causation are based on the committee's collective judgment. The committee endeavored to express its judgments as clearly and precisely as the available data allowed.
SUMMARY OF THE EVIDENCE
The committee's specific mandate was to determine, if possible,
whether there is a statistical association between the suspect diseases and herbicide use, by taking into account the strength of the scientific evidence and the appropriateness of the methods used to detect the association;
the increased risk of disease among individuals exposed to herbicides during service in Vietnam; and
whether there is a plausible biologic mechanism or other evidence of a causal relationship between herbicide exposure and a disease.
The committee addressed the first part of this charge by categorizing each of the health outcomes under study into one of the four categories described below on the basis of the epidemiologic evidence that it reviewed. Considerations of biologic plausibility did not enter into the committee's decision about how to categorize these outcomes, but plausibility is discussed separately after the assessment of the epidemiologic evidence. The question of increased risk in Vietnam veterans is also addressed for each health outcome, subject to the considerations discussed below.
Categories of Association
The categories used by the committee were adapted from those used by the International Agency for Research on Cancer in evaluating the evidence for carcinogenicity of various agents (IARC, 1977). Consistent with the charge to the Secretary of Veterans Affairs in P.L. 102-4 (which is stated in terms of statistical association rather than causality), the distinctions between the categories are based on "statistical association," not on causality as is common in scientific reviews. The distinctions reflect the committee's judgment that a statistical association would be found in a large, well-designed epidemiologic study of the outcome in question in which exposure to herbicides or dioxin was sufficiently high, well-characterized, and appropriately measured on an individual basis.
Sufficient Evidence of an Association: Evidence is sufficient to conclude that there is a positive association. That is, a positive association has been observed between herbicides and the outcome in studies in which chance, bias, and confounding could be ruled out with reasonable confidence.
For example, if several small studies that are free from bias and confounding show an association that is consistent in magnitude and direction, there may be sufficient evidence for an association.
Limited/Suggestive Evidence of an Association: Evidence is suggestive of an association between herbicides and the outcome but is limited because chance, bias, and confounding could not be ruled out with confidence. For example, at least one high-quality study shows a positive association, but the results of other studies are inconsistent.
Inadequate/Insufficient Evidence to Determine Whether an Association Exists: The available studies are of insufficient quality, consistency, or statistical power to permit a conclusion regarding the presence or absence of an association. For example, studies fail to control for confounding, have inadequate exposure assessment, or fail to address latency.
Limited/Suggestive Evidence of No Association: There are several adequate studies, covering the full range of exposure levels that human beings are known to encounter, that are mutually consistent in not showing a positive association between exposure to herbicides and the outcome at any level of exposure. A conclusion of "no association" is inevitably limited to the conditions, levels of exposure, and length of observation covered by the available studies. In addition, the possibility of a very small elevation in risk at the levels of exposure studied can never be excluded.
Increased Risk in Vietnam Veterans
The categories relate to the association between exposure to chemicals and health outcomes, not to the likelihood that any individual's health problem is associated with or caused by the herbicides in question. As stated earlier in this chapter, the most desirable evidence as a basis for answering this type of question involves knowledge of the rate of occurrence of the event in those Vietnam veterans who were actually exposed to herbicides, the rate in those who were not exposed (the "background" rate of the event in the population of Vietnam veterans), and the degree to which any other differences between exposed and unexposed groups of veterans influence the difference in rates. When those Vietnam veterans who were actually exposed have not been properly identified, as has generally been the case in existing studies, this question becomes difficult to answer. Although there have been numerous health studies of American and other Vietnam veterans, most have been hampered by relatively poor measures of exposure to herbicides and/or dioxin and other methodological problems. Indeed, most of the evidence on which the findings in this report are based comes from studies of people exposed to dioxin or herbicides in occupational and environmental settings rather than from studies of Vietnam veterans.
The committee found the available evidence sufficient for drawing conclusions about associations between herbicides and health outcomes, but the lack of good data on Vietnam veterans per se, especially with regard to exposure, complicates the second part of the committee's charge: to determine the increased risk of disease among individuals exposed to herbicides during service in Vietnam. By considering the magnitude of the association observed in other cohorts, the quality and results of the existing studies of veterans related to a particular outcome, and other principles of epidemiologic research discussed above, the committee formulated a qualitative judgment regarding the second question.
REFERENCES
Air Force Health Study (AFHS). 1991. An Epidemiologic Investigation of Health Effects in Air Force Personnel Following Exposure to Herbicides. Mortality Update: 1991. Brooks AFB, TX: Armstrong Laboratory. AL-TR-1991-0132. 33 pp.
Begg CB, Berlin JA. 1989. Publication bias and dissemination of clinical research. Journal of the National Cancer Institute 81:107-115.
Berlin JA, Begg CB, Louis TA. 1989. An assessment of publication bias using a sample of published clinical trials. Journal of the American Statistical Association 84:381-392.
Breslow NE, Day NE. 1980. Statistical Methods in Cancer Research. Vol. 1, The Analysis of Case-Control Studies. Lyon: International Agency for Research on Cancer.
Breslow NE, Day NE. 1987. Statistical Methods in Cancer Research. Vol. 2, The Design and Analysis of Cohort Studies. Lyon: International Agency for Research on Cancer. Distributed by Oxford University Press.
Bullman TA, Kang HK, Watanabe KK. 1990. Proportionate mortality among U.S. Army Vietnam veterans who served in Military Region I. American Journal of Epidemiology 132:670-674.
Centers for Disease Control (CDC). 1990a. The association of selected cancers with service in the U.S. military in Vietnam. I. Non-Hodgkin's lymphoma. Archives of Internal Medicine 150:2473-2483.
Centers for Disease Control. 1990b. The association of selected cancers with service in the U.S. military in Vietnam. II. Soft tissue and other sarcomas. Archives of Internal Medicine 150:2485-2492.
Centers for Disease Control. 1990c. The association of selected cancers with service in the U.S. military in Vietnam. III. Hodgkin's disease, nasal cancer, nasopharyngeal cancer, and primary liver cancer. Archives of Internal Medicine 150:2495-2505.
Checkoway H, Pearce N, Crawford-Brown DJ. 1989. Research Methods in Occupational Epidemiology. Monographs in Epidemiology and Biostatistics, Vol. 13. New York: Oxford University Press.
DerSimonian R, Laird N. 1986. Meta-analysis in clinical trials. Controlled Clinical Trials 7:177-188.
Dickersin K. 1990. The existence of publication bias and risk factors for its occurrence. Journal of the American Medical Association 263:1385-1389.
Dickersin K, Berlin JA. 1992. Meta-analysis: state-of-the-science. Epidemiologic Reviews 14:154-176.
Dickersin K, Min YI, Meinert CL. 1992. Factors influencing publication of research results: follow-up of applications submitted to two institutional review boards. Journal of the American Medical Association 267:374-378.
Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. 1991. Publication bias in clinical research. Lancet 337:867-872.
Hill AB. 1971. Principles of Medical Statistics. 9th ed. New York: Oxford University Press.
Institute of Medicine (IOM). 1991. Adverse Effects of Pertussis and Rubella Vaccines. Washington, DC: National Academy Press.
International Agency for Research on Cancer (IARC). 1977. Some Fumigants, The Herbicides 2,4-D and 2,4,5-T, Chlorinated Dibenzodioxins and Miscellaneous Industrial Chemicals. IARC Monographs on the Evaluation of the Carcinogenic Risk of Chemicals to Man, Vol. 15. Lyon: IARC.
Kang HK, Enzinger FM, Breslin P, Feil M, Lee Y, Shepard B. 1987. Soft tissue sarcoma and military service in Vietnam: a case-control study. Journal of the National Cancer Institute 79:693-699 (published erratum appears in Journal of the National Cancer Institute 1987, 79:1173).
Kramer MS, Lane DA. 1992. Causal propositions in clinical research and practice. Journal of Clinical Epidemiology 45:639-649.
Last JM, ed. 1988. A Dictionary of Epidemiology. 2nd ed. New York: Oxford University Press.
Michalek JE, Wolfe WH, Miner JC. 1990. Health status of Air Force veterans occupationally exposed to herbicides in Vietnam. II. Mortality. Journal of the American Medical Association 264:1832-1836.
Miettinen OS, Wang J-D. 1981. An alternative to the proportionate mortality ratio. American Journal of Epidemiology 114:144-148.
Sinks T. 1993. Misclassified sarcomas and confounded dioxin exposure. Epidemiology 4:3-6.