4 CONSIDERATIONS IN IDENTIFYING AND EVALUATING THE LITERATURE

This chapter presents the approach that the committee used to identify and evaluate the literature on traumatic brain injury (TBI). It provides information regarding how the committee searched the literature and discusses the major types of studies considered. The chapter also includes a discussion of the committee's evaluation criteria, the limitations of the studies reviewed, and the categories of association that the committee used in drawing conclusions about association.

As noted in Chapter 1, the committee was charged with drawing conclusions about the association between TBI and subsequent long-term health outcomes. The legislation originally establishing this type of review (PL 105-277 and PL 105-368) does not direct the committee to look at specific diseases or outcomes but rather to look broadly for health effects that might be associated with the exposure under study. In this instance, the exposure of interest is a TBI. The committee's literature searches led to the numerous health outcomes that are discussed in the report. Thus, the committee sought to characterize and weigh the strengths and limitations of the available evidence as presented in the studies it reviewed. The committee did not concern itself with policy issues, such as potential costs of compensation, policies regarding compensation, or any broader policy implications of its findings.

IDENTIFICATION OF THE LITERATURE

The committee began its work by overseeing extensive searches of the peer-reviewed medical and scientific literature, including published articles, other peer-reviewed reports, and dissertations. The searches retrieved over 30,000 potentially useful epidemiologic studies, and the titles and abstracts of those studies were reviewed.
The committee focused its attention on clinical and epidemiologic studies of adults with long-term health effects that resulted from a TBI by any mechanism, such as occupational injury, motor-vehicle collision, sports injury, gunshot wound, or other act of violence, including military combat injury. Studies of patients with a TBI due to malignancy, stroke, infection, ischemia, other diseases or disorders of the brain, intoxication, or oxygen deprivation were not considered. The committee did not systematically review studies of young children, the elderly, or brain-injured patients in litigation for compensation claims. The review excluded case reports, case series with few participants,
and studies of acute outcomes that resolved within days to a few months. The committee did not review general studies of "disability" as a gross measure of morbidity but rather evaluated studies that associated TBI with specific health outcomes.

After its assessment of the 30,000 titles and abstracts, the committee members identified about 1,900 studies for further review. Those studies were objectively evaluated without preconceived ideas about health outcomes or the existence or absence of associations. To assist them in their evaluation, the committee members developed inclusion criteria (see below) to determine which of the 1,900 studies would be included in its review. The committee adopted a policy of using only peer-reviewed published literature or unpublished reports that had undergone rigorous peer review, such as dissertations and some government reports, as the basis of its conclusions. The process of peer review by fellow professionals increases the likelihood of high quality but does not guarantee the validity of a study or the ability to generalize its findings. Accordingly, committee members read each study critically and considered its relevance and quality. They did not collect original data, nor did they perform any secondary data analysis.

In light of that orientation to the committee's task and approach, the following section briefly discusses types of evidence and the value of epidemiologic or clinical studies in determining whether an association exists. It is followed by a discussion of the committee's specific inclusion criteria that were developed to help in deciding whether a study would be included and evaluated. The committee also notes the numerous factors that it considered in evaluating the evidence in a study and, finally, presents the categories of association used in drawing conclusions about the strength of associations.
TYPES OF EVIDENCE

The committee relied entirely on clinical and epidemiologic studies to draw its conclusions about the strength of evidence of associations between TBI and health effects. However, animal studies play a critical role in clarifying the mechanism of TBI (see Chapter 2) and in providing biologic understanding of many of the effects seen in humans.

Animal Studies

Studies of laboratory animals are essential for understanding mechanisms of action and biologic plausibility and for providing information about possible health effects when experimental research in humans is not ethically or practically possible (NRC, 1991). Such studies permit an injury caused by a blast or other mechanism to be introduced under conditions controlled by the researcher. Mechanism-of-action (mechanistic) studies encompass a variety of laboratory approaches with whole animals and in vitro systems that use tissues or cells from humans or animals. In deciding on associations between TBI and human health effects, the committee used evidence only from human studies; in some cases, however, it examined animal studies as a basis of judgments about biologic mechanism or plausibility.
Epidemiologic Studies

Analytic epidemiologic studies examine the association between two or more variables. Predictor variable and independent variable are terms for an exposure to an agent of interest in a human population. Outcome variable and dependent variable are terms for a health event seen in that population. Outcomes can also include a number of nonhealth results, such as use of services, social changes, and employment changes.

A principal objective of epidemiology is to understand whether exposure to a specific agent is associated with disease occurrence or other health outcomes. That is most straightforwardly accomplished in experimental studies in which the investigator controls the exposure and the association between exposure and outcome can be measured directly. In the case of TBI followup studies, however, human experiments that directly examine the association between TBI and health outcomes are neither ethically nor practically feasible; instead, the association has to be measured in observational studies, and causality has to be inferred. Although they are commonly used synonymously by the general public, the terms association and causation have distinct meanings (Alpert and Goldberg, 2007).

Associations in Epidemiologic Studies

There are several possible reasons for associations in observational studies: random error (chance); systematic error (bias); confounding; effect–cause; and cause–effect. Spurious associations, that is, the finding of an association that does not truly exist, can be due to random error or chance, systematic error or bias, or a combination of them. Random error or chance is a statistical variation in a measurement taken from a sample of a population that can lead to the appearance of an association when none is present or the failure to find an association when one is present.
Systematic error or bias is the result of errors in how the study was designed or conducted. Systematic error can cause an observed value to deviate from its true value and can falsely strengthen or weaken an association or generate a spurious association. Selection bias occurs when there has been systematic error in recruiting a study population, which is different from the target population of the study, with the result that the findings cannot be generalized to the target population. Information bias results from a flaw in how data on exposure or outcome factors are collected.

Other reasons for finding associations that are incorrect are confounding and effect–cause relationships. Confounding occurs when a third variable, termed a confounding variable (or confounder), is associated with both the exposure and the outcome and mistakenly leads to the conclusion that the exposure is associated with the outcome. Effect–cause relationships occur when the outcome precedes the exposure; for example, a study might suggest that a particular psychiatric outcome was associated with a TBI when the psychiatric condition actually preceded the TBI and increased the risk of a TBI.

In a true association, the exposure precedes the outcome and the association is free of random error, bias, and confounding (or the chance of them has been minimized); finding these types of associations is the goal of epidemiologic studies.

In epidemiologic studies, the strength of an association between exposure and outcome is generally estimated by using prevalence ratios, relative risks (RRs), odds ratios (ORs), correlation coefficients, or hazard ratios depending on the type of epidemiologic study performed. To conclude that an association exists, it is necessary for the exposure to be followed by the outcome more (or less in the case of a protective exposure) frequently than it would be expected to by chance alone. The strength of an association is typically expressed as a ratio of
the frequency of an outcome in a group of participants who have a particular exposure to the frequency in a group without the exposure. A ratio greater than 1.0 indicates that the outcome variable has occurred more frequently in the exposed group, and a ratio less than 1.0 indicates that it has occurred less frequently. Ratios are typically reported with confidence intervals to assess random error. If a confidence interval (95% CI) for a ratio measure (e.g., an RR or an OR) includes 1.0, an association is said to be not statistically significant. If the interval does not include 1.0, the association is said to be statistically significant.

Inferring Causality

Determining whether a given statistical association rises to the level of causation requires inference (Hill, 1965); that is, causality is inferred, rather than measured directly, in observational studies. In 1965, Austin Bradford Hill, a British statistician, suggested nine criteria that could be used to assess whether an association observed in an observational study might be causal (Hill, 1965):

Strength of association. A strong association is more likely to have a causal component than a modest association.
Consistency. An association that is observed consistently in different studies is more likely to be causal than one that is not.
Specificity. A factor [or predictor variable] influences specifically a particular outcome or population.
Temporality. A factor must precede an outcome that it is supposed to affect.
Biologic gradient (also called dose–response relationship). An outcome increases monotonically with increasing dose of exposure or according to a function predicted by a substantive theory.
Plausibility. An observed association can be plausibly explained by substantive (for example, biologic) explanations.
Coherence. A causal conclusion should not fundamentally contradict present substantive knowledge.
Experiment. Causation is more likely if evidence is based on randomized experiments.
Analogy. An effect has already been shown for analogous exposures and outcomes.

Some of those criteria, such as experiment and specificity, are not particularly applicable to TBI, but the remaining ones are important for determining causality. A strong association as measured by a high (or low) risk or ratio, an association that is found in a number of studies, an increased risk of disease with increasing exposure or a decline in risk after cessation of exposure, and a finding of the same outcome after analogous exposures (such as sports injuries) all strengthen the likelihood that an association seen in epidemiologic studies is causal.

Exposures are rarely, if ever, controlled in observational studies, and with TBI there can be substantial uncertainty in the assessment of exposure. To assess whether explanations other than causality (such as chance, bias, or confounding) are responsible for an observed association, one must bring together evidence from different studies and apply well-established criteria (Hill, 1965; Susser, 1973, 1977, 1988, 1991; Evans, 1976; Wegman et al., 1997). For a recent review of those criteria, see the 2004 report of the US Surgeon General (Office of the Surgeon General-HHS, 2004). A brief discussion of the more important ones follows.
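Before turning to those criteria, the convention described earlier in this section, in which a ratio measure is reported with a 95% confidence interval and checked against 1.0, can be illustrated with a minimal sketch. The chapter does not say how the reviewed studies computed their intervals; the log-transform (Katz) approximation used here is one common choice and is an assumption, as are the function names and the counts, which are invented purely for illustration.

```python
import math


def relative_risk_ci(exposed_cases, exposed_total,
                     unexposed_cases, unexposed_total, z=1.96):
    """Return (rr, lower, upper) for a relative risk.

    Uses the log-transform (Katz) approximation; z=1.96 gives a 95% interval.
    This is an illustrative choice, not a method named in the chapter.
    """
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Approximate standard error of log(RR)
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def is_statistically_significant(lower, upper):
    """An interval that excludes 1.0 is conventionally called significant."""
    return not (lower <= 1.0 <= upper)


# Hypothetical counts: 30 of 200 exposed and 15 of 200 unexposed have the outcome.
rr, lower, upper = relative_risk_ci(30, 200, 15, 200)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("statistically significant:", is_statistically_significant(lower, upper))
```

With these invented counts the interval lies entirely above 1.0, so the association would be called statistically significant; with smaller differences between the groups the interval would straddle 1.0 and it would not.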
Strength of Association

The strength of an association is usually expressed as the magnitude of the measure of effect, for example, an RR or an OR. Generally, the farther from 1.0 (higher or lower) the RR or OR is, the greater the likelihood that the association is causal and the lower the likelihood that it is a result of bias or confounding. Measures of statistical significance, such as p values, are not indicators of the strength of an association but are rather measures of the probability of the results being due to random error.

Consistency

It is desirable to replicate the findings in different studies, that is, to observe an association in several studies done by different investigators in different populations using different study designs before drawing conclusions about the association. The more studies and types of studies in which the association has been observed, the more confident we are that the association is causal. However, consistency by itself is not sufficient evidence of an association. The committee considered findings that were consistently in the same direction (that is, the studies found predominantly positive or negative associations) among studies of different designs to be supportive of an association. It did not require exactly the same magnitude of association in different populations to conclude that there was an association. A consistent association could occur when the results of most studies were positive and the variations in measured effects were within the range expected to be due to random error.

Specificity of Association

Specificity of association is the degree to which an exposure (in this case, sustaining a TBI) is associated with a particular outcome. A positive finding is more convincing of causality when the association between the exposure and the outcome is specific to one or both than when the association is nonspecific to both.
The committee recognized, however, that one-to-one specificity is not to be expected given the differences in type of injury, extent of injury, and locations of injury in the brain.

Temporal Relationship

If an observed association is real, exposure must have preceded the onset of an outcome. The committee considered whether the outcome occurred within some period after sustaining a TBI that was consistent with current understanding of the natural history of that outcome (to the extent that it was possible). It interpreted the lack of an appropriate sequence as evidence against causality but recognized that insufficient knowledge about the natural history and pathogenesis of many of the health effects under review limited the utility of this consideration.

Dose–Response Relationship

The existence of a dose–response relationship strengthens an inference that an association is real. However, the lack of an apparent dose–response relationship does not rule out an association. If the relative magnitude of exposure among several studies can be determined, indirect evidence of a dose–response relationship might exist. For example, if studies of presumably low-exposure cohorts (for example, mild TBIs or a single injury) show only mild increases in risk whereas studies of presumably high-exposure cohorts (for example, moderate to
severe TBIs or repeated injuries) show larger increases in risk, the pattern would be consistent with a dose–response relationship.

Biologic Plausibility

Biologic plausibility reflects knowledge of the biologic mechanism by which a TBI could lead to a health outcome. That knowledge comes through mechanism-of-action or other studies, typically in animals. Biologic plausibility provides a high level of confidence in drawing a conclusion of "sufficient evidence of a causal association" (see below). However, a biologically plausible mechanism might not be known when an association is first documented.

Types of Observational-Study Designs

Epidemiologic-study designs differ in their ability to provide evidence of an association (Ellwood, 1998). It must be noted that the studies reviewed by the committee were seldom designed specifically to answer the question in the committee's charge, that is, whether sustaining a TBI during combat results in long-term adverse health outcomes. In examining the available epidemiologic studies, the committee addressed the question, Does the available evidence support a causal association between exposure (sustaining a TBI) and an outcome (a health effect)? Even a finding of a causal association between a TBI and a specific health effect does not mean that a TBI invariably results in the health effect or that all cases of the effect are the result of deployment. As discussed above, Hill's criterion of specificity is not particularly applicable in studies of TBI given the diffuseness of the injury, or even in injury epidemiology in general, and in any event such complete correspondence between exposure and effect is the exception in large populations (IOM, 1994). The committee evaluated the data and based its conclusions on the strength and coherence of the data that resulted from the selected epidemiologic studies that met its inclusion criteria.
The major types of epidemiologic studies that the committee considered were cohort studies, case–control studies, and cross-sectional studies. In each case below, exposure, for purposes of this study, means sustaining a TBI.

Cohort Studies

A cohort, or longitudinal, study follows a defined group, or cohort, over some period. It can test hypotheses about whether a TBI is related to the development of a health effect and can examine multiple health effects that may be associated with a TBI. Our review looked for evidence in cohort studies that compared health effects in people who had a TBI with effects in those who did not. The committee gave substantially less weight to cohort studies that included only persons with TBI and measured outcomes as a function of factors other than TBI, such as age. Those types of studies are valuable for determining risk factors other than TBI for specific outcomes, but they do not provide information on whether a particular outcome, such as Parkinson disease, is associated with TBI.

Cohort studies can be used to estimate risk difference, RR, and hazards, all of which measure the strength of an association. The risk difference is the rate of disease or other health effect in exposed persons minus the rate in unexposed persons; a difference greater than zero implies that extra cases of the effect are associated with the exposure. Relative risk is determined by
dividing the rate of the effect in the exposed group by the rate in the unexposed group. An RR greater than 1 suggests a positive association between an exposure and an outcome, and a value less than 1 suggests a protective association. The farther the RR is from 1.0 (in either direction), the stronger the association.

One major advantage of a cohort study is the ability of the investigator to define the exposure classification of subjects at the beginning of the study. Because participants are followed over time, it avoids problems with the temporal sequence between an exposure and an outcome; that is, it avoids the problem of effect–cause associations. Classification of exposure in prospective cohort studies is not influenced by the presence of a health effect, because the health effect has yet to occur, and this reduces an important source of potential bias known as misclassification bias (see later discussion). The disadvantages of cohort studies are the high costs associated with the use of a large study population, the long periods needed for followup (especially if the effect is rare), attrition of study subjects, and delay in obtaining results.

A prospective cohort study selects subjects on the basis of exposure (or lack of it) and follows the cohort to some date to determine whether and at what rate the outcome develops. A retrospective (or historical) cohort study differs from a prospective study temporally in that the investigator traces back in time to classify past exposures in the cohort and then tracks the cohort forward to the present to ascertain the rate of the outcome. The investigator often focuses on mortality from the outcome because of the relative ease of determining vital status of people and the availability of death certificates to determine the cause of death.
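The two cohort measures defined in this section, risk difference and relative risk, reduce to simple arithmetic on four counts. The sketch below is only an illustration: the function names are hypothetical, the "rate" is taken to be the proportion of each group that develops the outcome during followup (an assumption, since studies may instead use person-time rates), and the cohort counts are invented.

```python
def risk_difference(exposed_cases, exposed_total,
                    unexposed_cases, unexposed_total):
    """Rate in the exposed group minus rate in the unexposed group."""
    return exposed_cases / exposed_total - unexposed_cases / unexposed_total


def relative_risk(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total):
    """Rate in the exposed group divided by rate in the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Hypothetical cohort: 40 of 1,000 people with a TBI and 20 of 1,000 people
# without a TBI develop the outcome during followup.
rd = risk_difference(40, 1000, 20, 1000)
rr = relative_risk(40, 1000, 20, 1000)
print(f"risk difference = {rd:.3f}")  # greater than zero: extra cases among the exposed
print(f"relative risk   = {rr:.2f}")  # greater than 1: positive association
```

The two measures answer different questions: the risk difference expresses the excess on an absolute scale (extra cases per person followed), whereas the relative risk expresses it on a multiplicative scale, which is the form compared against 1.0 in the text.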
Standardized Mortality Studies

For comparison purposes, some cohort studies use mortality or morbidity rates in the general population because it might be difficult to identify a suitable group of unexposed people, especially if the outcome is rare. An example of this is the standardized mortality ratio (SMR), which is the ratio of the observed number of deaths in a cohort (from a specific cause, such as TBI) to the expected number of deaths in a reference population. An SMR greater than 1.0 generally suggests an increased risk of death in the exposed group. Such measures can also be used to examine morbidity, such as cancer.

The major problem in comparing rates in the general population with rates in military cohorts is the "healthy-warrior effect." That effect arises when a military population experiences a lower mortality or morbidity rate than the general population, which consists of a mixture of healthy and unhealthy people. Inasmuch as military personnel must meet physical-health criteria when they enter the military and while they are on active duty, the group's health status is usually better than that of the general population of the same age and sex. Since military personnel are at overall lower risk of adverse health outcomes compared to the general population, any excess risk associated with an exposure they experience must be large enough to overcome their inherent advantage in order to be detectable by such methods as SMR.

Case–Control Studies

In a case–control study, subjects (cases) are selected on the basis of having the outcome of interest; controls are selected on the basis of not having the outcome of interest. Investigators seek information on specific exposures. Cases and controls can be matched or not in the selection process with regard to such characteristics as age, sex, and socioeconomic status to suppress the influence of confounding variables in any observed differences. The odds of exposure to the
agent in the cases are then compared with the odds of exposure in controls. An OR greater than 1 indicates that there is a potential association between the exposure and the outcome; the greater the OR, the greater the association. An OR less than 1 indicates that the exposure may protect against the outcome.

Case–control studies are useful for testing hypotheses about relationships between specific exposures and an outcome. They attempt to solve the problem of temporality by considering the order of exposure and outcome. They are especially useful and efficient for studying rare diseases and their associated exposures. Case–control studies have the advantages of ease, speed, and relatively low cost. They are also valuable for their ability to probe multiple exposures or risk factors. However, case–control studies are vulnerable to several types of bias, such as recall bias, in which cases are more likely to report exposures than controls, which can dilute or enhance an association between a health effect and an exposure. Other problems include identifying representative groups of cases, choosing suitable controls, and collecting comparable information on exposures in both cases and controls. The case–control study is often the first approach to testing a hypothesis about factors that might contribute to a specific health effect, especially a rare one.

A nested case–control study draws cases and controls from a previously defined cohort that was assembled for other purposes. Thus, it is said to be nested in a cohort study. Baseline data are collected when the cohort is identified, which to some degree avoids the problem of recall bias when the cases and controls are identified. Members of the cohort identified as having, for example, TBI serve as cases, and a sample of those who are TBI-free serve as controls. Baseline data on exposure in cases and controls are compared, as in a regular case–control study.
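The comparison of exposure odds that underlies any case–control analysis, nested or not, can be sketched as follows. This is a minimal, unmatched illustration only; the function name and the counts are hypothetical, and real analyses typically also adjust for matching and confounders, which this sketch does not attempt.

```python
def odds_ratio(cases_exposed, cases_unexposed,
               controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls


# Hypothetical study: 30 of 100 cases and 15 of 100 controls were exposed.
or_estimate = odds_ratio(30, 70, 15, 85)
print(f"OR = {or_estimate:.2f}")
if or_estimate > 1:
    print("exposure is more common among cases (potential association)")
elif or_estimate < 1:
    print("exposure is less common among cases (possible protective association)")
else:
    print("no difference in exposure odds")
```

Because subjects are sampled on the outcome rather than on the exposure, the outcome rates themselves cannot be estimated from such a table, which is why the OR rather than the RR is the measure reported.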
Nested case–control studies are efficient in terms of time and cost in reconstructing exposure histories of cases and controls. In addition, because the cases and controls come from the same previously established cohort, concerns about selection bias are decreased.

Cross-Sectional Studies

The main distinguishing feature of a cross-sectional study is that exposure and outcome data are collected at the same time. In a cross-sectional study, the strength of an association between an exposure and an outcome is measured as a prevalence ratio, or a prevalence OR. It might compare outcome or symptom rates between groups with and without TBI. Cross-sectional studies are easier and less expensive to perform than cohort studies and can identify the prevalence of exposures and outcomes in a defined population. They are useful for generating hypotheses, but they are much less useful for determining cause–effect relationships, because collecting exposure and outcome data at the same time makes it impossible to establish which came first. Such studies are also subject to numerous other problems (Monson, 1990). Cross-sectional studies are of limited use for learning about symptom duration and chronicity, latency of onset, and prognosis.

INCLUSION CRITERIA

The committee's next step, after securing the full text of about 1,900 epidemiologic studies, was to determine which studies would be included in its review as primary or secondary
(support) studies. To be included as a primary study, a study had to be published in a peer-reviewed journal or have undergone an equally rigorous process, had to include details of its methods, had to include an appropriate reference or unexposed group, had to have sufficient statistical power to detect an effect, had to have sufficiently representative followup to ensure external validity, and had to have used reasonable methods to control for confounders and to minimize systematic error. Studies also had to include identification of TBI in a population due to an external physical force rather than a degenerative or congenital condition and had to include long-term outcomes (6 months or longer). Secondary studies are studies that were less rigorous in their methods, and they carried less weight than primary studies.

Of the 1,900 studies the committee read and evaluated, many did not meet the committee's criteria for inclusion and are not discussed in this volume. Studies of patients with TBI due to malignancy, strokes, infection, ischemia, other diseases or disorders of the brain, intoxication, or oxygen deprivation were not considered. Additionally, the committee did not systematically review studies of children, the elderly, brain-injured patients in litigation for compensation claims, case reports, or case series. Because the committee was charged with assessing only long-term outcomes, studies of outcomes not assessed beyond 6 months were excluded.

Methodologic Rigor

A study had to be published in a peer-reviewed journal or other rigorously peer-reviewed publication, such as a government report, dissertation, or monograph; include sufficient methodologic details to allow the committee to judge whether it met inclusion criteria; include an unexposed control or reference group; have sufficient statistical power to detect effects; and use reasonable methods to control for confounders.
Exposure Assessment

For primary studies, the committee preferred an independent assessment of a TBI rather than self-reports of a TBI or reports by family members. It was preferable to have the TBI diagnosed or confirmed by a clinical evaluation, imaging, hospital record, or other medical record. However, unwitnessed self-reports of injury account for the bulk of the TBI literature, and the committee decided that it could not exclude such studies outright. In keeping those studies, the committee is well aware of the potential for misclassification of TBI due to recall bias.

Outcome (Health Effect) Assessment

The committee preferred studies that had an independent assessment of an outcome rather than self-reports of an outcome or reports by family members. It was preferable to have the health effect diagnosed or confirmed by a clinical evaluation, imaging, hospital record, or other medical record. For psychiatric outcomes, standardized interviews were preferred, such as the Structured Clinical Interview for DSM-IV-TR (Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision), the Diagnostic Interview Schedule, and the Composite International Diagnostic Interview; similarly, for neurocognitive outcomes, standardized and validated tests were preferred. Additionally, the outcome had to be diagnosed after sustaining the TBI.
CONSIDERATIONS IN ASSESSING THE STRENGTH OF EVIDENCE

The committee's process for reaching conclusions about TBI and its potential for adverse health outcomes was collective and interactive. Once a study was included in the review because it met the committee's criteria, there were several considerations in assessing causality, including strength of the association, presence of a dose–response relationship, presence of a temporal relationship, consistency of the association, and biologic plausibility.

Categories of Association

The committee attempted to express its judgment of the available data clearly and precisely. It agreed to use the categories of association that have been established and used by previous Committees on Gulf War and Health and other Institute of Medicine committees that have evaluated vaccine safety, effects of herbicides used in Vietnam, and indoor pollutants related to asthma (IOM, 2000, 2003, 2005, 2006, 2007). Those categories of association have gained wide acceptance over more than a decade by Congress, government agencies (particularly the Department of Veterans Affairs), researchers, and veterans groups.

The five categories below describe different levels of association and sound a recurring theme: the validity of an association is likely to vary to the extent to which common sources of spurious associations could be ruled out as the reason for the observed association. Accordingly, the criteria for each category express a degree of confidence based on the extent to which sources of error were reduced. The committee discussed the evidence and reached consensus on the categorization of the evidence for each health outcome in the various outcome chapters (Chapters 6–10).

Sufficient Evidence of a Causal Relationship

Evidence is sufficient to conclude that there is a causal relationship between sustaining a TBI and a specific health outcome in humans.
The evidence fulfills the criteria of sufficient evidence of an association (below) and satisfies several of the criteria used to assess causality: strength of association, dose–response relationship, consistency of association, temporal relationship, specificity of association, and biologic plausibility.

Sufficient Evidence of an Association

Evidence is sufficient to conclude that there is a positive association; that is, a consistent association has been observed between sustaining a TBI and a specific health outcome in human studies in which chance and bias, including confounding, could be ruled out with reasonable confidence as an explanation for the observed association.

Limited/Suggestive Evidence of an Association

Evidence is suggestive of an association between sustaining a TBI and a specific health outcome in human studies but is limited because chance, bias, and confounding could not be ruled out with reasonable confidence.
Inadequate/Insufficient Evidence to Determine Whether an Association Exists

Evidence is of insufficient quantity, quality, consistency, or statistical power to permit a conclusion regarding the existence of an association between sustaining a TBI and a specific health outcome in humans.

Limited/Suggestive Evidence of No Association

Evidence from several adequate studies, covering the full range of severity of TBIs that humans are known to encounter, is consistent in not showing a positive association between sustaining a TBI and a specific health outcome. A conclusion of no association is inevitably limited to the conditions, magnitudes of exposure (types of TBI: mild, moderate, and severe or penetrating), and length of observation in the available studies. The possibility of a very small increase in risk of the health outcome after sustaining a TBI cannot be excluded.

LIMITATIONS OF STUDIES

Many of the studies reviewed by the committee presented substantial obstacles to determining associations between TBI and long-term health outcomes because they were beset by limitations that are commonly encountered in epidemiologic studies, including lack of representative sample, selection bias, lack of control for potential confounding factors, self-reports of exposure and health outcomes, and outcome misclassification.

A study's representativeness, even if it is population-based, can be compromised by low participation rates and loss to followup. Low participation rates can introduce selection bias, for example, if people who are symptomatic choose to participate more frequently in studies than those who are not symptomatic. Similarly, loss to followup can result in attrition bias, a form of selection bias, particularly if attrition is associated with disease status.
Researchers can try to measure selection bias by comparing baseline characteristics of participants with those of nonparticipants, or characteristics of those lost to followup with those followed, and can make adjustments to estimate the magnitude and direction of its effects.

Some of the studies reviewed by the committee did not specify the time between the injury and the followup period, so the committee could not determine whether the outcome lasted longer than 6 months. Many studies involved populations in rehabilitation centers where subjects might have had multiple injuries that included TBI, but the initial TBI might have been due to a stroke or a brain tumor. Those studies presented several problems, such as lack of representativeness of the younger veteran population and an inherent selection bias; for example, they might include only people who have health insurance.

Most cohort studies rely on self-reporting of symptoms on questionnaires. Symptom self-reporting potentially introduces reporting or recall bias, which occurs when the group being studied reports what it remembers more frequently than a comparison group. Reporting bias can lead to overestimation of the prevalence of symptoms or diagnoses in the TBI population. Symptom self-reporting might sometimes introduce another type of bias known as outcome misclassification, which leads to errors in how symptoms are classified into outcomes and analyzed.
Other limitations of the body of evidence are that studies might be too narrow in their assessment of health status, the measurement instruments might have been too insensitive to detect abnormalities that affect deployed veterans who had TBI, and the period of investigation might have been too brief to detect health outcomes that have a long latency or require many years to progress to the point where diagnosis, disability, hospitalization, or death occurs.

Apart from some large population-based studies of mortality after TBI and a few others of neurologic outcomes, many of the studies evaluated by the committee had small samples. When a study sample is too small, it is possible to miss clinically important differences. That is known as type II error. In such studies, attempts to examine even smaller subpopulations magnify the difficulties and reduce the likelihood of detecting meaningful differences. Of the studies examined by the committee, those with small samples were also sometimes hampered by other problems discussed above, including low participation rates, loss to followup, inadequate duration of followup, and self-reporting of symptoms.

An additional limitation of the studies under review is the lack of uniformity in defining the severity of TBI. Studies typically note whether the injury was a penetrating or a closed-head injury, but often used different criteria to assess severity. Thus, the committee found it difficult to compare outcomes among studies, particularly in the "moderate" TBI category, as researchers used different lengths of time of loss of consciousness and of posttraumatic amnesia to define severity. Similarly, the range of scores on the Glasgow Coma Scale¹ was not always uniform in defining mild, moderate, and severe TBI.
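The point above about small samples and type II error can be made concrete with a rough power calculation. The following is a minimal sketch, not a method used by the committee: it assumes a two-sided z-test comparing the proportion with an outcome in a TBI group and in a control group of equal size, and it relies on the normal approximation. The function name `power_two_proportions` and the example proportions (10% in controls, 20% after TBI) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_exposed, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for two proportions.

    Uses the normal approximation with equal group sizes and ignores the
    negligible probability of rejecting in the wrong direction.
    The type II error rate is 1 minus the returned value.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)

    # Standard error under the null hypothesis (pooled proportion).
    p_bar = (p_control + p_exposed) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)

    # Standard error under the alternative (separate proportions).
    se_alt = sqrt(
        p_control * (1 - p_control) / n_per_group
        + p_exposed * (1 - p_exposed) / n_per_group
    )

    diff = abs(p_exposed - p_control)
    return z.cdf((diff - z_crit * se_null) / se_alt)


# Hypothetical scenario: outcome in 10% of controls and 20% of the TBI group.
for n in (50, 200, 500):
    power = power_two_proportions(0.10, 0.20, n)
    print(f"n per group = {n:4d}  power = {power:.2f}  type II error = {1 - power:.2f}")
```

Under these assumed proportions, the estimated power with 50 subjects per group is well below one half, so a true doubling of risk would more often than not go undetected; with 500 per group the estimated power exceeds 0.9. Splitting a small sample into subgroups reduces the effective group size further, which is why subgroup analyses in small studies compound the problem.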
The committee focused on studies that enrolled people who had sustained a TBI, followed them to determine long-term sequelae, and generally asked whether a specific outcome was more likely in people with TBI than in controls without TBI. The committee discussed characteristics of the optimal control group for such studies because the type of controls could influence inferences drawn from the studies examined. When the outcome was a medical condition or a social outcome, the committee considered the best comparison group to be controls with other traumatic injuries (such as fractures) but without TBI who were treated in the same facility as the subjects with TBI, because such controls permit examination of the effect of TBI on outcome independently of the general effects of trauma and of the common risk factors that lead to traumatic injury. When the outcome studied was death, the committee agreed that age- and sex-specific mortality in the general population provided the best comparison.

The committee found many studies that met its criteria for inclusion. However, many excellent studies were excluded from the review because they were not designed to answer the question posed to the committee: What are the long-term outcomes associated with sustaining a TBI?

¹ The Glasgow Coma Scale (Teasdale and Jennett, 1974) is a widely used scale to assess acute injury severity.
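When death is the outcome, the comparison with age- and sex-specific general-population mortality described above is commonly summarized as a standardized mortality ratio (SMR): the deaths observed in the cohort divided by the deaths expected if the cohort had experienced the reference population's stratum-specific rates. The chapter does not spell out this calculation, so the sketch below is an illustration under that assumption. The function name, the strata, the person-years, and the reference rates are all invented for the example.

```python
def standardized_mortality_ratio(strata):
    """Return (smr, observed, expected) for a list of age/sex strata.

    Each stratum is a dict with:
      observed_deaths - deaths seen in the cohort for that stratum
      person_years    - follow-up time contributed by the cohort
      reference_rate  - general-population deaths per person-year
    """
    observed = sum(s["observed_deaths"] for s in strata)
    expected = sum(s["person_years"] * s["reference_rate"] for s in strata)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed / expected, observed, expected


# Invented cohort of TBI survivors, stratified by age and sex.
example_strata = [
    {"label": "men 20-39", "observed_deaths": 3,
     "person_years": 1000, "reference_rate": 0.002},
    {"label": "men 40-59", "observed_deaths": 11,
     "person_years": 500, "reference_rate": 0.01},
]

smr, observed, expected = standardized_mortality_ratio(example_strata)
print(f"observed = {observed}, expected = {expected:.1f}, SMR = {smr:.2f}")
```

In this invented example, 14 deaths are observed against 7 expected, giving an SMR of 2.0. A value above 1 indicates more deaths than the general population would predict for people of the same age and sex; interpreting it still requires a confidence interval and attention to the limitations discussed earlier in this chapter.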
REFERENCES

Alpert, J. S., and R. J. Goldberg. 2007. Dear patient: Association is not synonymous with causality. American Journal of Medicine 120(8):649–650.
Ellwood, J. M. 1998. Critical Appraisal of Epidemiological Studies and Clinical Trials, 2nd ed. Oxford: Oxford University Press.
Evans, A. S. 1976. Causation and disease: The Henle–Koch postulates revisited. Yale Journal of Biology and Medicine 49(2):175–195.
Hill, A. B. 1965. The environment and disease: Association or causation? Proceedings of the Royal Society of Medicine 58:295–300.
IOM (Institute of Medicine). 1994. Veterans and Agent Orange: Health Effects of Herbicides Used in Vietnam. Washington, DC: National Academy Press.
IOM. 2000. Gulf War and Health, Volume 1: Depleted Uranium, Pyridostigmine Bromide, Sarin, Vaccines. Washington, DC: National Academy Press.
IOM. 2003. Gulf War and Health, Volume 2: Insecticides and Solvents. Washington, DC: The National Academies Press.
IOM. 2005. Gulf War and Health, Volume 3: Fuels, Combustion Products, and Propellants. Washington, DC: The National Academies Press.
IOM. 2006. Gulf War and Health, Volume 4: Health Effects of Serving in the Gulf War. Washington, DC: The National Academies Press.
IOM. 2007. Gulf War and Health, Volume 6: Health Effects of Deployment-Related Stress. Washington, DC: The National Academies Press.
Monson, R. 1990. Occupational Epidemiology, 2nd ed. Boca Raton, FL: CRC Press.
NRC (National Research Council). 1991. Animals as Sentinels of Environmental Health Hazards. Washington, DC: National Academy Press.
Office of the Surgeon General-HHS. 2004. The Health Consequences of Smoking: A Report of the Surgeon General. http://www.surgeongeneral.gov/library/smokingconsequences (accessed July 31, 2008).
Susser, M. 1973. Causal Thinking in the Health Sciences: Concepts and Strategies of Epidemiology. New York: Oxford University Press.
Susser, M. 1977. Judgment and causal inference: Criteria in epidemiologic studies. American Journal of Epidemiology 105(1):1–15.
Susser, M. 1988. Falsification, verification, and causal inference in epidemiology: Reconsideration in the light of Sir Karl Popper's philosophy. In Causal Inference. Edited by K. J. Rothman. Chestnut Hill, MA: Epidemiology Resources. Pp. 33–58.
Susser, M. 1991. What is a cause and how do we know one? A grammar for pragmatic epidemiology. American Journal of Epidemiology 133(7):635–648.
Teasdale, G., and B. Jennett. 1974. Assessment of coma and impaired consciousness. A practical scale. Lancet 2(7872):81–84.
Wegman, D. H., N. F. Woods, and J. C. Bailar. 1997. Invited commentary: How would we know a Gulf War syndrome if we saw one? American Journal of Epidemiology 146(9):704–711.