This chapter presents the approach and methods that the Volume 10 committee used to identify and evaluate the scientific and medical literature on Gulf War veterans and the process used to reach the conclusions on the association between deployment to the Gulf War and a given health outcome. It provides information on how the committee searched the literature, and what evaluation criteria were used to screen and categorize the literature. The categories of association that were used to draw conclusions about the possible health effects that might result from being deployed are then presented. Finally, the committee describes some of the issues it encountered when considering the literature on Gulf War exposures and health outcomes such as multiple exposures and individual variability. Because the Volume 10 committee closely followed the approach used by prior Gulf War and Health committees, particularly Volumes 4 and 8, much of the following information was previously described in Chapter 2 of Volume 8 (IOM, 2010).
The Volume 10 committee, while tasked with updating the earlier Volumes 4 and 8, was also asked to comprehensively review, evaluate, and summarize the scientific and medical literature on health effects seen in 1990–1991 Gulf War veterans. The committee began its work by overseeing extensive searches of the scientific literature, including published articles, other reports, and government documents that had been published since the last literature search conducted for Volume 8 in 2009. The updated search retrieved more than 280 studies of potential relevance to this report. Studies that did not appear to have immediate relevance, based on an assessment of the title and abstract by committee members and the Institute of Medicine (IOM) staff, were excluded from further consideration. Excluded studies included, but were not limited to, case reports, studies of civilians in the Persian Gulf region, and studies of treatments, short-term or acute health outcomes, rehabilitation, social outcomes (for example, employment), impacts on families, or long-term outcomes from known physical events, such as gunshot wounds. Removal of these studies yielded 76 potentially relevant studies to be considered and discussed by the committee. The studies obtained as full text were objectively evaluated by the committee members without preconceived ideas about what health outcomes might occur and what, if any, associations might be found between being deployed to the Gulf War and any health condition.
As discussed in more detail in Chapter 4, “Evaluation of Health Conditions,” many Gulf War veterans have experienced a multitude of symptoms—commonly including fatigue, cognitive impairments, and chronic pain—that collectively has been termed Gulf War illness. Considerable research has focused on trying to identify what exposures might have caused Gulf War illness; a variety of causative exposures have been proposed, including pyridostigmine bromide (PB), pesticides, and the nerve agent sarin, individually or in some combination. Because this is the last volume in the Gulf War and Health series, the committee considered it prudent to look at the new literature on Gulf War illness, particularly animal models that attempt to simulate the multiple exposures experienced by Gulf War veterans during deployment. Therefore, an additional literature search was conducted to identify toxicologic and other animal studies that specifically sought to reproduce Gulf War exposures or describe a model of Gulf War illness. Because no prior IOM committee had conducted a comprehensive review of multiple Gulf War exposures but rather assessed exposures on a chemical-by-chemical or group of chemicals (e.g., carbamate pesticides) basis, this literature search was not limited by publication date.
The search of toxicologic literature identified more than 100 papers. Committee members with the most experience in toxicology reviewed all titles and abstracts and selected 50 papers for further consideration. The selection criteria for those papers were that a study must have employed exposure levels and durations that were potentially representative of those of Gulf War veterans and that the studies must have included more than one chemical or other exposure; some studies looked at combinations of up to four exposures, such as PB, DEET, stress, and sarin, or PB, permethrin, and chlorpyrifos. The animal literature was reviewed by the committee’s toxicologists and presented by them at the plenary sessions for discussion by the full committee.
The Volume 10 committee adopted a policy of using only published literature that had undergone rigorous peer review as the basis of its conclusions. While the process of peer review by fellow professionals increases the likelihood that high-quality studies will appear in the literature, it does not guarantee the validity of any particular study or the ability to generalize its findings. An exception to this policy was the inclusion of government reports and one government presentation.
The committee focused on epidemiologic studies in this report because epidemiology deals with the determinants, frequency, and distribution of disease in human populations rather than in individuals. In Chapter 4, “Evaluation of Health Conditions,” three types of epidemiologic studies were used to support the committee’s conclusions: cohort studies (including mortality studies), case-control studies, and cross-sectional studies. Volume 8 describes these major types of epidemiologic studies, the limitations inherent in conducting and using epidemiologic studies, how the committee used epidemiologic studies to make associations between deployment to the Gulf War and health conditions, and considerations in inferring causality from the available data. Case reports were not included in this report because the committee believes that, although such reports may be interesting and provide unusual information, it is not appropriate to extrapolate findings from a single case to an entire population such as Gulf War veterans. Box 2-1 provides brief definitions of some of the terms used in the epidemiologic studies considered by this committee. The strength of an association between exposure and condition is generally estimated quantitatively by using prevalence ratios, relative risks, odds ratios, correlation coefficients, or hazard ratios depending on the epidemiologic design used. A ratio greater than 1.0 indicates that the outcome variable has occurred more frequently in the exposed group, and a ratio less than 1.0 indicates that it has occurred less frequently. Ratios are typically reported with a confidence interval (CI) to quantify random error. Statistical significance may be represented by a CI or a p-value. If the 95% CI for a risk estimate (such as a risk ratio or odds ratio) includes 1.0, the association is not considered to be statistically significant; however, if the interval does not include 1.0, the association is said to be statistically significant with an alpha error (the probability of observing such an association by chance alone when no true association exists) of 5% (that is, p < 0.05).
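As an illustration of the interpretation described above, the following minimal sketch computes an odds ratio and an approximate 95% CI from a 2x2 table and checks whether the interval includes 1.0. It is not drawn from any study reviewed by the committee: the function names and the counts are hypothetical, and the interval uses the common log-scale (Wald) approximation, which is only one of several methods and assumes all four cell counts are nonzero.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate CI for a 2x2 table (hypothetical helper).

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile; 1.96 corresponds to a 95% interval.
    Uses the log-scale (Wald) approximation; all counts must be > 0.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def ci_excludes_null(lower, upper, null_value=1.0):
    """True when the interval lies entirely above or below the null value."""
    return lower > null_value or upper < null_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for counts in [(30, 70, 20, 80), (60, 40, 30, 70)]:
        or_, lo, hi = odds_ratio_ci(*counts)
        verdict = "excludes" if ci_excludes_null(lo, hi) else "includes"
        print(f"counts={counts}: OR={or_:.2f}, 95% CI=({lo:.2f}, {hi:.2f}), {verdict} 1.0")
```

In the first hypothetical table the point estimate is above 1.0 but the interval spans 1.0, so the association would not be called statistically significant; in the second, the entire interval lies above 1.0.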
Committee members having the most familiarity with a particular health condition or area of expertise (e.g., neuroimaging, genetics, and toxicology) reviewed all titles and abstracts and identified papers for full text retrieval. Initially, the Volume 8 health conditions were used as the basis for the health conditions to be included in this volume, although other outcomes would be included if suggested by the literature. One or more committee members then conducted a preliminary review of the relevant studies, including the studies cited in Volume 8, to determine if an individual study met the inclusion criteria for a primary or secondary study (see below) for a particular health condition. Each study was read critically and reviewed for its relevance and quality. The responsible committee member(s) then presented the information from the preliminary screening and categorization to the full committee for discussion.
Typically, the information included a review of the Volume 4 and Volume 8 studies and conclusions, the methods used for selecting and evaluating the new study populations, the study results, and the committee member’s assessment of the strengths and limitations of the study. Each primary and secondary paper was discussed for each health condition. Some of the larger cohort studies used a variety of methods and instruments to assess the health status of Gulf War veterans; for this reason, the committee discussed at some length the diagnostic approaches and use of self-reports for each paper. Because of the variability in the description and diagnosis of the health conditions considered in this report, the committee made no a priori assumptions about the usefulness of any paper for a health outcome; each paper was discussed individually for each health outcome (several papers assessed numerous health outcomes). The committee reviewed the studies in this report with a view to considering the level of random error, the potential for bias, and the authors’ strategies for examining and/or limiting the effect of each on the study findings. Common types of biases found in the Gulf War veteran literature are identified in Box 2-2. For greater detail about epidemiology, biases, confounders, and other methodological background, see Chapter 2 of Volume 8.
After reviewing the updated literature, the committee agreed that some reorganization of the health conditions was warranted for this volume; for example, women’s health conditions were no longer considered in a separate section but rather are addressed in the section on each relevant health condition, and several conditions were combined into one section, such as the inclusion of musculoskeletal disorders, fibromyalgia, chronic fatigue syndrome, and chronic widespread pain in one section on pain-related disorders.
After the studies had been discussed in plenary session, the responsible committee member(s) drafted the text for that health outcome. The evidence tables were revised to include the new primary studies as well as the primary studies from Volumes 4 and 8; secondary studies were not included in the evidence tables. If there were no new primary studies for a health outcome, the evidence table from Volume 8 was included but not updated. Data and units are presented as reported in the cited studies, except where otherwise noted. The committee did not collect original data, nor did it perform any secondary data analyses such as meta-analyses.
The draft text was reviewed and discussed in further plenary sessions until all committee members reached a consensus on the description of the studies and the conclusions for each health outcome. After this language was agreed upon, the full committee assigned a category of association (discussed later in this chapter) based on the weight of the evidence (including the studies cited in Volume 4 and Volume 8, as well as any new studies) and expert judgment. It should be noted that the committee did not use a formulaic approach as to the number of primary and secondary studies that would be necessary to assign a specific category of association. Rather the committee’s review required a thoughtful and nuanced consideration of all the studies as well as expert judgment, and this could not be accomplished by adherence to a narrowly prescribed formula of what data would be required for each category of association or for a particular health outcome.
Although the majority of the studies considered by the committee for this report were epidemiologic, other types of studies (such as animal toxicology, neuroimaging, functioning, and genetics) were also assessed. The committee did not evaluate studies of acute trauma, rehabilitation, medical treatment, or transient illness. The committee also did not consider health outcomes seen in veterans of conflicts other than the Gulf War unless those veterans formed an appropriate control group (for example, veterans who had served in Bosnia).
Primary studies provide the basis for the committee’s findings. For a study to be included in the committee’s review as primary it had to meet specified criteria. It had to be published in a peer-reviewed journal or other rigorously peer-reviewed publication, such as a government report or monograph. It also needed to (1) include sufficient detail to demonstrate rigorous methods (for example, had a control or reference group, and adjusted for confounders when needed); and (2) include information regarding a persistent health outcome, and use appropriate laboratory testing, if applicable. Furthermore, a primary study was considered to be generalizable to and representative of the Gulf War veteran population. Although the responsible committee member initially presented his or her determination of whether a study met the criteria, the committee discussed the study’s methods and results using the inclusion criteria at some length before agreeing as to whether the study should be classified as primary.
For medical conditions that have no morphological features, the use of validated symptom criteria, such as those of the Rome Foundation for irritable bowel syndrome, is preferred over reports of medical symptoms or groups of symptoms. As noted earlier, for a study to be considered primary, it needed to have an independent assessment of an outcome rather than self-reports of an outcome or reports by family members, even if the self-report was of a “doctor-diagnosed” illness. Health conditions should have been diagnosed or confirmed by a clinical evaluation, imaging, hospital record, or other medical record. For psychiatric conditions, standardized interviews were preferred, such as the Structured Clinical Interview for the DSM-IV-TR (Diagnostic and Statistical Manual of Mental Disorders-IV-TR), the Diagnostic Interview Schedule, and the Composite International Diagnostic Interview. Similarly, for neurocognitive conditions, standardized and validated tests were preferred. Additionally, the condition had to be diagnosed after deployment and have an appropriate follow-up or latency period for the development of the health effect. The committee notes that the diagnostic criteria for and definitions of many of the health conditions considered in this report, such as mental health disorders, fibromyalgia, and chronic fatigue syndrome, have evolved over time and will continue to do so. However, the new definitions and criteria have not been applied to the studies discussed in this volume.
Studies reviewed by the committee that did not necessarily meet all the criteria of a primary study were considered secondary studies. Secondary studies are typically not as methodologically rigorous as primary studies, or they might present findings of altered functioning consistent with later development of a diagnosis but without clear predictive value. Many of the secondary studies relied on self-reports of various diagnoses rather than an examination by a health professional or a medical record review.
Because self-reports of health outcomes and exposures account for the bulk of the Gulf War and health literature, the committee decided that it would not exclude such studies but rather considered them to be secondary. The committee recognized the potential for misclassification of a health outcome due to inaccurate recall in such studies. As explained later in this chapter, self-reports of exposure information are also subject to recall bias, and thus in studies where participants self-report both their exposures and their health conditions, there is a greater potential for reporting bias as a result of participants over-reporting both pieces of information. Furthermore, the committee recognizes that self-reports may also be influenced by media attention and potential compensation for service-connected conditions; however, none of the studies reviewed by the committee factored these considerations into their evaluations.
Other Related Studies
The Volume 10 committee also included a number of studies that, although they did not meet the criteria for a primary or secondary study, nonetheless provided information on the health of Gulf War veterans. Examples of such studies might be those that looked at the health outcome of interest in veterans and its association with another health outcome in veterans, rather than comparing deployed and nondeployed veterans. For example, a study may assess whether cardiovascular conditions in deployed veterans are associated with their alcohol use. Other studies focused on neuroimaging approaches to detect changes in the brains of veterans with Gulf War illness or on the identification of human genotypes that may be markers for the diagnosis or treatment of Gulf War illness. These studies and other relevant studies were also reviewed by an expert on the committee and then discussed by the full committee. In an effort to be inclusive, these ancillary studies are discussed in a section called “Other Related Studies” for each health outcome to which they pertain; however, no conclusions were based solely on these ancillary studies.
Differences among people in their genetic, biologic, psychologic, and social vulnerabilities add to the complexities in determining health outcomes related to specific agents. The likelihood of observing a particular health outcome may differ for people with increased sensitivity to an agent. A person who is a poor metabolizer of a particular substance, depending on his or her genetic makeup, might be at higher or lower risk for specific health effects if exposed to the substance. For example, researchers are investigating butyrylcholinesterase enzyme levels and genotypes in veterans with and without Gulf War illness to try to determine whether a specific enzyme genotype puts a veteran at higher risk of developing the illness if they used PB during the war (Steele et al., 2015).
Another aspect of individual variability can be a residual genetic confounder (Khoury and Yang, 1998), in which an unassessed but potentially measurable genetic factor could be associated with the exposure and outcome of interest; such confounding may have a particularly important role in smaller case-control studies. For example, an association found between Gulf War deployment and an outcome of interest could instead be explained by a residual genetic confounder that was not otherwise assessed. Genetic risk markers are often differentially represented in various populations, especially across race-ethnicities (and even within seemingly uniform groups), and are thus important to consider in certain diseases, especially those with known causal associations related to highly or fully penetrant genes.
Chapter 1 and earlier volumes of the Gulf War and Health series describe the possible exposures Gulf War military personnel might have experienced. Volume 4 also details the exposure modeling and biological monitoring that was conducted by the Department of Defense (DoD) and others to estimate troop exposures to some chemical agents such as depleted uranium (DU), sarin, and cyclosarin, and smoke from oil-well fires. As noted in that chapter, there is poor agreement between subjective and objective measurements of exposures to depleted uranium and oil-well fire smoke. Some studies also show evidence of reporting bias regarding vaccinations and ingestion of pyridostigmine bromide tablets. The modeling of the possible exposures to sarin and cyclosarin from the demolition of the Khamisiyah complex has also been criticized. The committee did consider studies that compared health outcomes seen in deployed veterans who may or may not have been exposed to nerve agents as a result of the Khamisiyah detonation and to oil-well fire smoke; some of these studies also included nondeployed control groups.
Limitations of Exposure Information
Very little is known about most Gulf War exposures. After the ground war, an environmental-monitoring effort was initiated primarily because of concerns related to smoke from oil-well fires and exposure to sarin and cyclosarin. Monitoring for the other agents to which the service members might have been exposed was not conducted. Consequently, exposure data (such as the actual agents, the duration of exposure, the route of entry, measures of external exposure [e.g., air concentrations], the internal dose, and documentation of adverse reactions) on those other agents are lacking or severely limited. For example, a DoD-sponsored post-war survey conducted by RAND Corporation assessed possible exposures to pesticides, both those condoned or supplied by the military and those obtained from nonmilitary sources (Fricker et al., 2000). The survey found that the majority of service members used some form of pesticide, in many cases multiple pesticides, and that some troops may have overused or misused pesticides during their deployment.
Various exposure assessment tools have been used in research to fill gaps in exposure information, but there are problems in reconstruction of past exposure events. Many studies have assessed military personnel exposures to various preventive agents, including PB and pesticides, during the Gulf War. These studies have been based on an individual’s recall of the agents he or she received or took, frequently in stressful situations. Recall information has rarely been verified by in situ measurements or examination of military records. Furthermore, many of the surveys used to assess potential exposures were conducted several to many years after the war was over (Fricker et al., 2000), such as the National Health Survey of Gulf War Veterans and their Families conducted in 1995 (Kang et al., 2000). For example, veterans have been surveyed to obtain recollections about agents to which they might have been exposed, although survey results might be limited by recall bias and even a lack of familiarity with what the potential exposures were, such as the names of pesticides (Fricker et al., 2000).
Extensive efforts have been made to model and obtain information on potential exposures to DU, smoke from oil-well fires, and other agents. Models have been refined to estimate exposures to sarin and cyclosarin, but it is difficult to incorporate intelligence information, meteorologic data, transport and dispersion data, and troop-unit location information accurately (see Volume 4, Chapter 2, “Exposures in the Persian Gulf”). Although modeling efforts are important for discerning the details of exposures of Gulf War veterans, they require external review and validation. Furthermore, even if there were accurate troop location data, the location of individual service members would be very uncertain. Because of the limitations in the exposure data, it is difficult to determine the likelihood of increased risk for disease or other adverse health effects in Gulf War veterans that are due specifically to biologic and chemical agents.
Multiple Exposures and Interactions
Compounding the difficulty in assessing deployment exposures is the fact that military personnel were potentially exposed to numerous harmful agents simultaneously and sequentially. Many of the exposures were not specific to the Gulf War (e.g., diesel and solvents), although others were (e.g., PB, nerve agents), but the number and combination of agents to which the veterans might have been exposed make it difficult to determine whether any specific agent or combination of agents is the cause of many of the Gulf War veterans’ illnesses. As noted by the Volume 6 committee (IOM, 2008b) on deployment-related stress, most of the studies that assessed Gulf War exposures queried veterans about a prescribed list of exposures that the investigators thought the veterans might have experienced. Few of the studies asked open-ended questions about what the exposures or conditions were, nor did they ask about the frequency, intensity, or duration of those exposures. Furthermore, although some exposure surveys were conducted shortly after the conclusion of the war (e.g., Sutker et al., 1993), other surveys were conducted several and even many years after the war (e.g., Proctor et al., 1998), making it difficult to determine the accuracy of the veterans’ recall of their exposures during a stressful period and after a substantial lapse of time. The Volume 10 committee also recognizes that at 25 years after the war, continuing to survey veterans about their exposures during the war is unlikely to yield any new exposure information or help to model those exposures.
Addressing multiple exposures has been a subject of much debate and research in the fields of toxicology, environmental and occupational health, and medicine, and although much progress has been made, the interactions of harmful agents in both humans and animals are not well understood. Exposure to multiple agents or stressors may result in interactions among the agents that produce an effect that might not have occurred otherwise, or they might result in a greater effect, or in a different effect than that caused by exposure to one of the agents alone. Thus, attributing an effect to one agent or a combination of agents is difficult, especially in situations involving many different types of exposures (e.g., chemicals, heat, psychological stress, vaccines). The complexity of assessing the effects of multiple exposures and interactions is further compounded by uncertainty about what exposures a given service member experienced and about their frequency, intensity (e.g., concentration), and duration.
The committee attempted to express its judgment of the available data clearly and precisely in the “Conclusions” section for each health outcome. It agreed to use the categories of association that have been established and used by previous Gulf War and Health committees and other IOM committees (IOM, 2000, 2003, 2005, 2006b, 2007, 2010). Those categories of association have gained wide acceptance by Congress, government agencies (particularly the Department of Veterans Affairs), researchers, and veterans groups.
The five categories below describe different levels of association and present a common message: the validity of an association is likely to vary to the extent to which common sources of spurious associations could be ruled out as the reason for the observed association. Accordingly, the criteria for each category express a degree of confidence based on the extent to which sources of error were reduced. The committee discussed the evidence and reached consensus on the categorization of the evidence for each health outcome in Chapter 4.
Sufficient Evidence of a Causal Relationship
Evidence is sufficient to conclude that a causal relationship exists between being deployed to the Gulf War and a health outcome. The evidence fulfills the criteria for sufficient evidence of a causal association in which chance, bias, and confounding can be ruled out with reasonable confidence. The association is supported by several of the other considerations used to assess causality: strength of association, dose–response relationship, consistency of association, temporal relationship, specificity of association, and biologic plausibility.
Sufficient Evidence of an Association
Evidence suggests an association, in that a positive association has been observed between deployment to the Gulf War and a health outcome in humans; however, there is some doubt as to the influence of chance, bias, and confounding.
Limited/Suggestive Evidence of an Association
Some evidence of an association between deployment to the Gulf War and a health outcome in humans exists, but this is limited by the presence of substantial doubt regarding chance, bias, and confounding.
Inadequate/Insufficient Evidence to Determine Whether an Association Exists
The available studies are of insufficient quality, validity, consistency, or statistical power to permit a conclusion regarding the presence or absence of an association between deployment to the Gulf War and a health outcome in humans.
Limited/Suggestive Evidence of No Association
There are several adequate studies, covering the full range of levels of exposure that humans are known to encounter, that are consistent in not showing an association between deployment to the Gulf War and a health outcome. A conclusion of no association is inevitably limited to the conditions, levels of exposure, and length of observation covered by the available studies. In addition, the possibility of a very small increase in risk at the levels of exposure studied can never be excluded.
Complicating the issue of Gulf War veterans’ health is the inexorable march of time. Aging itself is associated with an increased prevalence of health conditions even among people who have never been in the military or deployed. The aging of the Gulf War veteran population, now at least 43 years old, makes it difficult in many cases to distinguish health outcomes associated with the Gulf War, such as the joint pain and cognitive deficits that are symptomatic of Gulf War illness, from those that occur normally with increasing age. Furthermore, many diseases that are more common in older adults, such as some cancers, have long latencies of onset that also complicate the identification of a possible cause of the disease, be it Gulf War deployment, aging, or some other event. This distinction is important because it is possible that exposure during deployment when the veteran was relatively young may increase the likelihood of later age-related disease. However, the challenge of understanding the effects of aging can be addressed with well-designed epidemiologic studies that select appropriate reference populations and apply rigorous analytical methods.
The epidemiologic and clinical studies conducted to date have provided valuable information regarding the health of Gulf War veterans; however, many of the studies have significant limitations of design or implementation that hinder accurate assessment of the veterans’ health status. The limitations include the possibility that study samples do not represent the entire Gulf War population, low rates of participation in studies, narrow assessment of health status, reinforcement of self-reporting of symptoms and exposures in response to media attention, insensitivity of instruments for detecting abnormalities in deployed veterans, and a period of investigation that is long past the time of exposure. For some studies, particularly those conducted in the decade or so after the war, the period of investigation may have been too brief to detect health conditions that have a long latency or require many years to progress to the point where disability, hospitalization, or death occurs. Many of the U.S. studies are cross-sectional, and this limits the opportunity to learn about symptom duration and chronicity, latency of onset, and prognosis. In addition, the problem of multiple comparisons that is common in many of the Gulf War studies results in confusion over whether the effect is real or occurring by chance. These limitations make it difficult to interpret the results of study findings, particularly when several well-conducted studies produce inconsistent results.
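The multiple-comparisons problem mentioned above can be made concrete with a short calculation. The sketch below is illustrative only and does not describe an analysis performed by the committee: it assumes the comparisons are statistically independent (which is rarely exactly true for health outcomes measured in the same cohort), and the function names are hypothetical. It shows how the chance of at least one spurious "significant" finding grows with the number of outcomes tested at alpha = 0.05, along with the Bonferroni-adjusted per-test threshold that is one common, conservative response.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests.

    Assumes independent tests with no true associations (illustrative
    simplification): 1 - (1 - alpha) ** n_tests.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    error rate at or below alpha (Bonferroni correction)."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    for n_tests in (1, 5, 20, 50):
        fwer = familywise_error_rate(alpha, n_tests)
        threshold = bonferroni_threshold(alpha, n_tests)
        print(f"{n_tests:>3} tests: P(at least one false positive) = {fwer:.3f}, "
              f"Bonferroni per-test threshold = {threshold:.4f}")
```

Under these assumptions, a study that examines 20 outcomes has roughly a 64% chance of reporting at least one statistically significant association even when none exists, which is why isolated significant findings in studies with many comparisons need to be interpreted with caution.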
The committee’s process for reaching conclusions about the strength of the association between deployment to the Gulf War and its potential for adverse health outcomes was collective, interactive, iterative, and based on the process used by the Volume 8 committee. The committee thoroughly evaluated the scientific literature attending to the design, methodological, and special considerations described above, particularly the limitations of exposure information and outcomes assessment. The evaluation process as implemented by the committee ensured a rigorous review and clear response to the committee’s charge.