In this chapter the committee describes its two-phase approach for identifying and screening the literature and other existing evidence addressing potential long-term adverse health effects of the antimalarial drugs of interest. The process that the committee used to assess individual studies, including considerations concerning specific methodologic factors (such as study design, exposure assessment, outcomes assessment, and potential biases), is presented along with the types of studies identified and considered. How these methodologic considerations were applied to interpret the evidence is presented in the specific antimalarial drug chapters. The chapter concludes with a discussion on the process and classification system used to draw conclusions regarding the strength of evidence concerning the long-term health effects associated with the drugs of interest.
The committee was tasked with comprehensively reviewing, evaluating, and summarizing the scientific literature related to long-term health effects that might be related to the use of currently available drugs for the prophylaxis of malaria in adults. Because some terms are used interchangeably in the literature, the committee endeavored to be as precise as possible in its terminology, and thus it adopted the definitions in Box 3-1 and uses them throughout the report. A conservative cutoff time of 28 days (which was considered equivalent to expressions of 4 weeks or 1 month) post-cessation of drug use was used to distinguish between events that are of short-term duration (and thus considered to be outside of the committee’s scope) and those that are persistent or of long-term duration. The 28-day cutoff was chosen because it allowed for a sufficient washout period for the drugs of interest (the longest half-lives are approximately 14 days for both mefloquine and tafenoquine).
“Long term” has been used in the literature with different interpretations. Given that prophylactic drugs for malaria should be used for the duration of a stay in a malaria-endemic area (as well as for multiple days or weeks after leaving the endemic area, depending on the antimalarial used), “long term” may refer to the timing of the drug use rather than to the timing of events that persist after drug use has been terminated. Therefore, the committee preferentially uses persistent to describe those adverse events that began during the period of drug use and that continued after drug cessation and beyond the period that the drug would still be present, which is defined as ≥28 days post-cessation. Adverse events that occur or change in their severity with prolonged use of an antimalarial drug are considered to be acute events because they occur while the drug is in use; if they do not persist once the use of the drug has ceased, they are outside the committee’s charge of examining the evidence related to persistent health effects. Events that occur during drug use or that continue for a period extending from a few hours to less than 28 days after drug cessation have been referred to in the literature as acute or short-term events, but the committee uses the term concurrent events. The committee uses concurrent to identify events that begin with the use of a drug, not outcomes that may be present before use is begun (e.g., an individual starts a drug and then displays symptoms of hypertension rather than has hypertension and then starts a drug). Latent events refer to those adverse events that were not apparent during the period that the drug was in use but that were present at any time after the cessation of malaria prophylaxis.
The focus of the assessment was on research that examined persistent or latent adverse events, both of which indicate the presence of adverse health outcomes that extend beyond the period during which the user was taking the drug.
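As an illustration only, the definitions of concurrent, persistent, and latent events given above can be written as a small decision rule. The sketch below is not part of the committee's method: the function name, the day-numbering convention (day 0 is the final dose, negative days fall before it), and the "pre-existing" label are assumptions introduced here for clarity.

```python
CUTOFF_DAYS = 28  # post-cessation cutoff used in the report (4 weeks or 1 month)


def classify_event(onset_day: int, last_present_day: int, first_dose_day: int) -> str:
    """Classify an adverse event using the report's terminology.

    All days are counted relative to the final dose (day 0), so days during
    drug use are zero or negative. These parameter names are hypothetical.
    """
    if onset_day < first_dose_day:
        # Present before drug use began, so not attributed to the drug.
        return "pre-existing"
    if onset_day > 0:
        # Not apparent during use, first present after cessation.
        return "latent"
    if last_present_day >= CUTOFF_DAYS:
        # Began during use and continued 28 days or more after cessation.
        return "persistent"
    # Began during use and ended during use or within 28 days of cessation.
    return "concurrent"
```

For example, an event that begins 10 days before the final dose and is last documented 40 days after it would be labeled persistent, whereas the same event resolving 5 days after the final dose would be labeled concurrent.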
To begin, the committee oversaw extensive searches of the scientific and medical literature using a comprehensive strategy. Although antimalarial drugs used by the U.S. military currently or within the past 25 years were the primary focus, the committee’s review also included studies of antimalarials used for prophylaxis in populations other than U.S. service members or veterans.
Literature Search Strategy
Under the direction of the committee, a National Academies of Sciences, Engineering, and Medicine staff research librarian conducted comprehensive electronic searches of the medical and scientific literature using three primary databases: TOXLINE, Index Medicus, and Embase. These three searchable databases index biologic, chemical, medical, and toxicologic publications. If any of the search terms were included in the title, abstract, or key words of the article (or the full text if available for search), the article was included in the results of the search. Search terms included full and abbreviated chemical names, common and manufacturer trade names, and the Chemical Abstracts Service numbers for each of the antimalarial drugs of interest. A multi-purpose field code was included in the search parameters to ensure that all of the synonyms for the drugs of interest were retrieved in the searches. The search strategy was designed to ensure that all potentially relevant articles were captured, and it was not restricted by specific dates, publication types, populations, or species (experimental animal studies were included). The language was restricted to English.
For those drugs of interest that are indicated for uses other than malaria prophylaxis, additional terms and MeSH¹ descriptors were added. For example, doxycycline is approved for many uses, and more than 5,500 titles and abstracts were initially captured, so the search was revised to include additional terms related to “prophylaxis” and “malaria.” As a result, the identified list was reduced to a more manageable 2,200 publications that were more likely to be relevant, while avoiding concerns about excluding any potentially relevant articles. Any adaptations made to the search strategy or screening criteria for a drug are discussed in the drug-specific chapters that follow.
Using the search terms in Box 3-2, the databases were searched twice. The first search of the literature included the earliest date of the database up to December 2018. A subsequent search was conducted in August 2019 to capture any relevant articles published or indexed after the initial search through July 31, 2019.
¹ MeSH descriptors are sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. For example, MeSH terms for “malaria” include nine terms such as “falciparum,” “vivax,” and “Blackwater fever,” without those terms having to be specified individually.
TOXLINE (1840s–present) is a bibliographic database published by the National Library of Medicine that contains more than 4 million records, with new records added weekly. The database contains an assortment of citations from specialized journals and other sources, including PubMed citations. It provides references covering the biochemical, pharmacologic, physiologic, and toxicologic effects of drugs and other chemicals. Most of TOXLINE’s bibliographic citations contain abstracts or indexing terms and Chemical Abstracts Service registry numbers.
Index Medicus, a second database produced by the National Library of Medicine, covers citations indexed in PubMed and Medline. Citations in PubMed are fully indexed from 1966 to the present and selectively from 1809 to 1966, with a total of more than 25 million records. Index Medicus covers scientific literature in the areas of medical, biomedical, and life sciences and provides automatic mapping of search terms with MeSH terms. The focus of citations found in PubMed includes “in process” or “before print” citations as well as some citations from non-medical journals (particularly in public health, social science, psychology, and sociology) and ebooks (including several reports from the National Academies of Sciences, Engineering, and Medicine [the National Academies]). Medline (1946–present) contains more than 22 million records on medical and biomedical sciences from approximately 5,600 journals (most of which are published in the United States).
Embase is an Elsevier database that contains more than 30 million records from more than 8,500 journals from at least 90 countries and is available by subscription through a number of interfaces, including the OVID interface that was used for the committee’s searches. Citations cover all those indexed in Medline as well as more than 2,000 additional drug and pharmacy journals, which include journals published outside the United States, and 260,000 conference abstracts. The citations are fully indexed from 1947 to the present and selectively from 1902 to 1947. This database is considered one of the most important databases for identifying studies typically associated with evidence-based practice, including meta-analyses, systematic reviews (such as those reviews by Cochrane), randomized controlled trials, cohort studies, case–control studies, case series, and other epidemiologic publications. Embase is also an extremely important database for identifying grey literature, such as reports from the Food and Drug Administration (FDA) and the National Institutes of Health.
Two supplemental databases of malaria-specific literature (the Malaria in Pregnancy Consortium library² and WWARN.org) were also searched using the generic and trade names for each antimalarial of interest. WWARN.org maintains a clinical trials publication library and a pharmacology publication database. Potentially relevant articles that were not captured by the search were also identified by searching the reference lists of relevant review and research articles.
Several types of publications were captured: epidemiologic studies, case reports and case series, clinical trials, laboratory animal studies, in vitro studies, reviews, meta-analyses, summaries of expert meetings, clinical and travel-based guidelines, conference abstracts, commentaries, and letters to the editor. Exact duplicate articles were deleted. An individual EndNote library was set up for each of the six drugs of interest. If an article examined multiple drugs, the article was placed into the library of each drug examined. For example, if a study examined mefloquine and atovaquone/proguanil, it was placed into both the mefloquine and atovaquone/proguanil libraries for further review. A study that reported on multiple drugs of interest was assessed for relevance in each of those chapters.
Use of Other Sources
In addition to carrying out the comprehensive literature search for studies that contained original data collection and analysis, the committee considered other sources of information in its deliberations, including review articles, national and foreign government reports, responses to committee-generated information requests, and information submitted by the public through invited presentations, comments, and data submissions.
Reviews and Other Non-Original Data Collection
Peer-reviewed studies with original data collection and analyses were preferred over studies that were re-analyses of a population (without the incorporation of additional information), pooled analyses or meta-analyses, reviews, and so on. Studies with original data were preferentially considered by the committee when assessing the strength of the association between an antimalarial of interest and a persistent or latent health outcome to draw its conclusions. These other types of studies and publications may be informative and may be discussed in conjunction with primary results or in synthesis sections on a given drug or health outcome.
Systematic reviews, such as those published by the Cochrane Collaboration, on topics of interest were also considered part of the evidentiary base. Although the committee did not assess review articles exhaustively, it did consider them for specific topics, such as the known biologic mechanisms of action and the pharmacokinetic and pharmacodynamic properties of the antimalarials of interest and their concurrent adverse events. Commentaries, opinions, letters to the editor, and author responses that referred to an included article were captured and considered along with the original article. National and global recommendations on malaria prophylaxis (by the Centers for Disease Control and Prevention [CDC], World Health Organization, European Union, etc.) were reviewed when they specifically reported on the rationale for changes to the recommendations for antimalarial prophylaxis. Data presented only in abstract form, such as from conferences, or in other unpublished formats were not included.
Formal government reports on the drugs of interest from U.S. agencies or foreign governments were reviewed. Individual reports on adverse events from the FDA Adverse Event Reporting System (FAERS) were not requested or reviewed. However, if a publication used FAERS reports or the equivalent from other countries as part of its analysis, the committee considered it. The committee downloaded available drug labels and package inserts from FDA’s website for each of the drugs of interest. These were used to provide information concerning specific changes and updates to the use of the drug or the warnings and contraindications associated with it. Package inserts are listed on the webpage with an action date, but the date provided in the downloaded package-insert document may occasionally differ from the action date posted on the webpage (e.g., a downloaded document listed as the 1989 mefloquine package insert was a July 2002 revision). Occasionally a downloaded document contained no date (e.g., the template’s “month/year” placeholder is not filled). Additional requests for information were made to FDA, the Department of Defense (DoD), and the Department of Veterans Affairs (VA). Those requests and the received responses are part of the committee’s public access file. The received information was integrated with the other evidence for drugs of interest.
As part of fulfilling its Statement of Task, the committee held two open sessions to assist in information gathering which served to inform the discussions throughout this report. The first presentation was made by representatives of VA to formally charge the committee with their Statement of Task and to answer clarifying questions related to the charge. The committee heard from presenters from DoD, the Department of State, and the Peace Corps with knowledge of malaria prophylaxis policies. In addition to presentations focused on the malaria prophylaxis policies of different government agencies, representatives from FDA gave an overview of the FDA’s postmarketing pharmacovigilance system of adverse events and of how that information is used to monitor for signs of safety issues. A representative of CDC explained how the agency assembles and weighs data for making country-specific recommendations for malaria prophylaxis for U.S. travelers. Since those recommendations are based largely on the published literature, the second part of the CDC presentation reviewed some of the common strengths and limitations of pertinent literature. The committee heard from an advocacy organization that presented a hypothesis for the existence of a neuropsychiatric disease that the organization believes to be associated with the use of mefloquine prophylaxis in U.S. military service members. Finally, the committee heard a detailed presentation on the neurotoxic mechanisms of some antimalarials, particularly artemisinins. A more detailed summary of each invited presentation is found in Appendix B.
Each open session included time for attendees to make statements for the committee’s consideration. Additionally, for the duration of the deliberation process, members of the public were encouraged to submit data and testimonials to the committee through the study email. Many of the public comments received and the in-person statements given described personal experiences of persistent effects following the use of mefloquine for malaria prophylaxis while the individual was serving in the military, the Department of State, or the Peace Corps or during personal travel. Several of those who testified on their experiences with mefloquine asked the committee to clearly communicate any limitations of the data on which it based its conclusions and to convey its thinking on research that may still be needed.
During the course of its work, the committee read and heard many moving personal accounts of individuals suffering from debilitating symptoms after using certain antimalarial drugs. The committee appreciated the opportunity to hear these accounts firsthand and understood the tremendous effort and strength that was required to speak publicly about these very personal experiences. Although the committee was not tasked with making judgments regarding specific cases in which individuals have claimed injury from use of an antimalarial drug, the reports from these individuals were welcomed, and the committee appreciated their desire to contribute in a positive way to the information gathering of the committee.
Submissions to the committee also included information on two planned postmarketing safety studies of tafenoquine (Arakoda™); statements that veterans’ medical records submitted to FDA via MedWatch played a role in FDA’s issuing of a boxed warning for mefloquine and that neurovestibular and neuro-ocular symptoms associated with mefloquine are not found in the published mefloquine literature; calls for examining the interactions of malaria-prophylactic drugs with other drugs when considering adverse effects; and requests that all sources of information be considered, including information from clinicians who diagnosed mefloquine-related disorders and medical records from the War-Related Illness and Injury clinics.
This section details the methods and two-step process used by the committee for screening the results of its searches to identify potentially relevant literature for full-text review. The first step involved screening for relevance by title and abstract, as available. The second step entailed a full-text review to determine the final set of studies that the committee considered, assessed, and synthesized. It was this final set of studies that provided the basis for the committee’s conclusions on the relationships between the use of an antimalarial drug and specific categories of adverse health effects. The quantitative and qualitative procedures underlying the committee’s literature evaluation have been made as explicit as possible, but ultimately the conclusions about associations expressed in this report are based on the committee’s collective judgment. The committee has endeavored to express its judgments as clearly and precisely as the data allow.
A total of approximately 11,700 titles and abstracts were captured in the literature search, covering all six drugs of interest. In step 1 of the process, article titles and abstracts were screened for relevance by the National Academies’ Health and Medicine Division staff under the committee’s direction to determine which articles should be considered for full-text retrieval. The screening criteria are outlined below. For each drug, two reviewers performed the initial screening. Titles and abstracts, where available, were reviewed to screen out articles that did not meet the committee’s inclusion criteria. When the two primary reviewers were not in agreement, a third reviewer made the determination whether to include an article. Articles that did not have abstracts were generally passed to the full-text review stage unless the information included in the title clearly excluded the article. Staff reviewed reference lists of reviews and original articles for relevant articles or other information not picked up in the database searches and added these for consideration during full-text review. Another approximately 300 articles were identified in this way.
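The two-reviewer rule with third-reviewer adjudication described above can be sketched in a few lines. This is a hypothetical illustration rather than the software or workflow the staff actually used; the function and parameter names are assumptions.

```python
from typing import Callable


def screening_decision(
    reviewer_a_includes: bool,
    reviewer_b_includes: bool,
    third_reviewer: Callable[[], bool],
) -> bool:
    """Return True if the article moves on to full-text review.

    The third reviewer (a hypothetical callable) is consulted only when
    the two primary reviewers disagree.
    """
    if reviewer_a_includes == reviewer_b_includes:
        return reviewer_a_includes
    return third_reviewer()
```

Passing the third reviewer as a callable keeps that judgment from being requested unless the two primary reviewers actually disagree.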
Because the committee’s Statement of Task specified that persistent adverse events resulting from the prophylactic use of the antimalarials of interest in adults were of central concern, all publications that reported on a drug of interest used prophylactically were initially considered relevant when screening the literature. However, the committee also set additional criteria for final inclusion. Each article included in the final set must
- report an adverse event or effect (or state that none were observed) or other health outcome when the drug was used for prophylaxis, regardless of the timing of that event;
- have a comparison group;
- follow a population for at least 28 days (including follow-up reported as 4 weeks or 1 month); and
- in studies of humans, have study populations constrained to people 16 years or older. Studies of populations with mixed age groups, in which some of the individuals were less than 16 years old, were also included.
If any of this information was not clear from the title or abstract, the article was kept for review at the full-text stage.
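The inclusion criteria listed above, together with the rule that unclear items are deferred to full-text review, amount to a three-valued check: a criterion can be met, clearly not met, or unknown. The sketch below illustrates that logic under stated assumptions; the record fields and function name are hypothetical, and follow-up reported as 4 weeks or 1 month is treated as 28 days.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AbstractRecord:
    """What a reviewer could determine from a title and abstract.

    None means the information was not clear. All field names are
    hypothetical and introduced only for this illustration.
    """
    reports_adverse_events: Optional[bool]     # includes "none observed"
    has_comparison_group: Optional[bool]
    follow_up_days: Optional[int]              # 4 weeks or 1 month entered as 28
    all_participants_under_16: Optional[bool]  # mixed-age studies are False


def screen_step1(record: AbstractRecord) -> str:
    """Exclude only when a criterion is clearly failed; otherwise keep."""
    checks = [
        record.reports_adverse_events,
        record.has_comparison_group,
        None if record.follow_up_days is None else record.follow_up_days >= 28,
        None
        if record.all_participants_under_16 is None
        else not record.all_participants_under_16,
    ]
    if any(check is False for check in checks):
        return "exclude"
    # Criteria that are met or unknown both lead to full-text review.
    return "full-text review"
```

Under this rule, an abstract that states a 2-week follow-up would be excluded, while one that omits the follow-up length entirely would be kept for full-text review.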
Other areas were explored, although not exhaustively, using the human and animal literature. These areas included case reports of adverse events; studies of adherence to a drug of interest when used for malaria prophylaxis; the co-administration of an antimalarial for prophylaxis with sporozoite immunization; the co-administration of an antimalarial for prophylaxis with medications for other common conditions (e.g., antimalarial with warfarin, antihypertensives, insulin, etc.) that report on side effects or adverse events (or state that none were observed); and interactions with nonmalarial drugs, supplements, and substances (e.g., food, alcohol, or nicotine). Studies of pharmacokinetics and pharmacodynamics, metabolism, and biologic mechanisms of action (e.g., system pathways, cell signaling, other biologic markers) were also included for drugs of interest or their metabolites. Articles that examined the drugs of interest for the treatment of malaria were considered only for tafenoquine because it was so recently approved by FDA for use, and such articles were considered only if the reported adverse events were not listed in the FDA package insert. For the other drugs of interest, the discussion of adverse events when the drugs were used for treatment was limited to review articles and discussed as background where relevant. Studies of pregnant women were limited to those who were taking antimalarial prophylaxis or intermittent preventive treatment in which adverse events are specified (either to the mother directly or to the fetus or newborn) or other reproductive outcomes were reported. The committee recognized that the risks of adverse events of the drugs under consideration can be influenced by a host of factors even if the specific mechanisms are not fully understood. Where the committee thought the evidence regarding risks to adult subpopulations with comorbid conditions (e.g., renal failure, cardiovascular conditions, immunosuppression, human immunodeficiency virus positive status/acquired immunodeficiency syndrome [AIDS]) or having specific demographic features (such as women, older or younger age groups, race or ethnic background, etc.) was informative, these studies are briefly mentioned. However, most of the adverse events observed in these subpopulations are based on studies that reported concurrent use of the drug of interest and thus did not meet the inclusion criteria (described in the next section) to be considered a primary epidemiologic study.
Several types of articles were considered to be outside the committee’s scope of work and were specifically excluded from consideration. These included studies of populations administered antimalarial drugs for a use other than malaria prophylaxis (e.g., for treatment of leishmaniasis, flukes, pneumonia, lupus, rheumatoid arthritis, cancer, or sexually transmitted infections) because studies of populations that use the antimalarial drugs of interest for reasons other than malaria prophylaxis were determined not to be comparable to or representative of the populations using the drugs for malaria prophylaxis; studies that exclusively examined antimalarial efficacy, effectiveness or sensitivity, or drug resistance without mentioning adverse effects (or the lack of them); trends of antimalarial prescriptions (no adverse events reported); studies that focused solely on the effects that an antimalarial of interest had on the malaria parasites or on the use of an antimalarial for the purpose of reducing transmission; and studies that focused on derivatives of the drugs of interest (such as for drug discovery) or drug-delivery systems (e.g., carriers, encapsulations). Additionally, studies that examined the simultaneous administration of an antimalarial drug of interest in combination with any other antimalarial drug that is not an FDA-approved combination (e.g., an artemisinin and mefloquine given at the same time or as a combination pill) were excluded.
In general, studies of recrudescence or relapse of malaria were excluded because they were focused on efficacy. An exception to this was for studies of primaquine and tafenoquine when they are used as presumptive anti-relapse therapy. For these two drugs, studies of malaria relapse were included and reviewed for other adverse events. Additionally, for these two drugs, combinations of prophylactic drugs were included (e.g., chloroquine followed by primaquine).
Approach to Evaluating and Assessing Individual Studies
In step 2 of the literature screening process, full text was obtained for any articles that were considered potentially relevant after applying the step 1 screening criteria for inclusion and exclusion. The committee began its assessment of the literature without regard to whether an association between prophylactic use of an antimalarial and any particular health outcome was suggested in the studies, focusing solely on its relevance to addressing that question. Similarly, because of the variability in the descriptions and diagnoses of the health conditions considered in this report, the committee made no a priori assumptions about the usefulness of any article or report, relying solely on the methods presented to assess the contribution of each study. Each study that met the inclusion criteria was reviewed and objectively evaluated for each health outcome it presented. If a study examined more than one drug or health outcome, it was considered separately for each drug and for each of those outcomes. After a review of more than 3,500 full-text articles, studies that were considered relevant were grouped and evaluated thoroughly. Full-text articles were grouped into categories of primary or supplemental evidence. Epidemiologic studies that presented original information in human populations were considered primary evidence. Supplemental or supporting literature included FDA labels and package inserts, reviews and meta-analyses, considerations of selected populations (such as pregnant women), case reports, additional information from the committee’s information requests, and animal and mechanistic studies. The articles were then distributed among the committee members according to their areas of expertise, with at least two committee members reviewing each paper. All adverse events were considered regardless of severity.
Spontaneous reports of adverse events and case studies provide the least rigorous evidence of an effect. MedWatch, FDA’s program for postmarketing surveillance, collects clinical information involving drugs from health care professionals and consumers through a variety of outlets, including mail, internet, and telephone, but the largest source of postmarketing information on adverse events is the drug companies themselves (IOM, 2007). Often reports of an adverse event lack important details such as the duration of the event or its effects, the tests performed, and whether there was any follow-up. Moreover, the adverse event reported in case reports is associated with use of the drug; the drug has not necessarily been proven to be the cause of the adverse event.
Case reports and case series were considered when there was follow-up that lasted at least 28 days after drug cessation, but because these reports lack control groups, they contribute no meaningful information about the degree of risk in a population or even to other individuals who have the same underlying characteristics, and thus their contribution to the weight of the evidence was considered supportive rather than primary. Case reports of adverse events determined to be attributed to the use of a drug of interest were captured and are presented as supplemental information to the epidemiologic studies, specifically when evidence of a clinician-diagnosed outcome was presented. When case studies were reviewed, the EQUATOR consensus criteria for case studies aided in evaluating the strength of the evidence presented (Gagnier et al., n.d.; Rodgers et al., 2016). These criteria outline the elements that a high-quality case report should include. Reporting of de-identified patient-specific information, primary clinical concerns, and relevant history and previous treatments must be included, for example. Reported diagnostic information encompasses diagnostic methods, challenges, and reasoning. Detailed information about the intervention, follow-up, and outcomes, including adherence and tolerability, is required. Finally, an evaluation of the strengths and limitations, relevant medical literature, and rationale for conclusions are necessary.
Toxicologic studies in animal models and of in vitro cell cultures are included where appropriate to inform the understanding of pharmacokinetics and biologic plausibility through the toxicology of the drugs and their exposure pathways. Throughout the drug-specific chapters, pharmacokinetics refers to how the organism (human or experimental animal model) affects the drug, including via processes of absorption, metabolism, and excretion. Pharmacodynamic mechanisms are covered under the heading of “Biologic Plausibility” in that pharmacodynamics refers to how a drug affects an organism with particular emphasis on dose–response relationships. Because these studies were considered to provide supportive evidence, their results would not be enough to change the level of evidence for an association.
Studies that compared different groups of human populations based on the exposure to antimalarial drugs can be broadly classified as either observational studies or trials. The committee refers to both types of these comparative studies as “epidemiologic studies” throughout the report. The focus of the committee’s assessment is on epidemiologic studies because epidemiology deals with the determinants, frequency, and distribution of disease in human populations rather than in individuals or in animal models, which have several limitations, as discussed below. Several types of epidemiologic studies were evaluated, including randomized controlled trials, cohort studies, case–control studies, and cross-sectional studies. Formal, well-designed, and well-conducted epidemiologic studies can serve to produce evidence of associations between an exposure and health outcomes.
For each full-text epidemiologic article that met the committee’s screening inclusion criteria, an additional criterion question was applied:
Does the study provide any empirical information about adverse effects that begin or persist, or indicate the lack of such events, following at least 28 days after cessation (final dose) of the drug of interest?
Although for step 1 of the screening process the population had to be followed for at least 28 days, during step 2 of the full-text review the inclusion was strengthened to require a follow-up of at least 28 days post-drug-cessation. As long as a study met the criteria, it was included, even if it had severe methodologic limitations. Studies that did not follow their populations for at least 28 days after the final dose of a drug of interest was administered or that did not distinguish the timing of the adverse events (e.g., the follow-up time was more than 28 days after drug cessation but the authors did not distinguish which adverse events occurred inside and outside of the 28-day window) are briefly mentioned but are not evaluated in depth. It is important to note that a study could be well designed and well conducted but have serious limitations in its ability to provide information that had direct bearing on the committee’s work, such as by not distinguishing the timing of adverse events. The committee did not contact study authors for clarifications or additional data. For example, several studies included only a brief statement that “no serious adverse events were reported” without further explanation of what adverse events were examined, how “serious” was defined, or what the timing of those events was.
A total of 21 epidemiologic studies that reported on adverse events that were captured or persisted for more than 28 days are included in this report: Ackert et al., 2019; Andersen et al., 1998; DeSouza, 1983; Eick-Cost et al., 2017; Green et al., 2014; Laothavorn et al., 1992; Leary et al., 2009; Lee et al., 2013; Lege-Oguntoye et al., 1990; Meier et al., 2004; Miller et al., 2013; Nasveld et al., 2010; Rueangweerayut et al., 2017; Schlagenhauf et al., 1996; Schneider et al., 2013, 2014; Schneiderman et al., 2018; Schwartz and Regev-Yochay, 1999; Tan et al., 2017; Walsh et al., 2004; and Wells et al., 2006. A table that gives a high-level comparison (study design, population, exposure groups, and outcomes examined by body system) of each of these epidemiologic studies is presented in Appendix C. Although the committee considered using published tools to conduct risk-of-bias assessments for the studies, ultimately it was unable to identify an approach that addressed all of the committee’s needs. Instead, the committee adopted selected components of these tools, primarily the Newcastle–Ottawa Scale (Wells et al., 2019), and applied them in its assessment of individual studies. The PICO (Participants, Interventions, Comparisons, and Outcomes) model is commonly used for characterizing clinical studies for formal systematic reviews and meta-analyses. As this assessment was neither a strict systematic review nor a meta-analysis, the committee used a modified PICO that characterized included studies according to their study design, population, study groups, and body systems examined (see next section on Methodologic Considerations). Based on the details of the study, the description of how adverse events were assessed or measured, and whether it distinguished between adverse events that began or persisted 28 days after cessation of the drug of interest, an epidemiologic study was classified either as a primary article, in which case it met the inclusion criteria and was thoroughly assessed, or as a secondary supporting article, in which case it did not meet inclusion criteria and was reviewed and more briefly described under the heading Other Identified Studies in Human Populations. Primary articles were assessed for quality based on the methods provided (e.g., adequate control for confounding variables, use of adequate diagnostic instruments, use of appropriate statistical tests; see next section, Methodologic Considerations) and the precision of the reported results. Effect estimates, data, and units of measure are presented as reported in the cited studies, except where otherwise noted. The responsible committee members then presented the information from each relevant study to the full committee for discussion, including the methods used for selecting the study populations and conducting the research (e.g., design, population, length of follow-up, sources of measurement for exposure and adverse events or health outcomes [such as self-reported information, medical records, claims data, and validated tests and tools], the statistical analyses used, and adjustment factors), the results, and a thorough assessment of the strengths, limitations, and potential biases and their implications.
The committee defined a health outcome as any recognized symptom, condition, or diagnosis. As the committee’s Statement of Task specified that neurologic and psychiatric outcomes were to be addressed, and because these outcomes were not assessed consistently across studies, the committee adopted a rubric for categorizing the different outcomes. First, the committee considers all neurologic and psychiatric symptoms and disorders to be brain based. The committee recognizes that some of these experiences may not yet have empirically based neuroanatomical correlates, and it acknowledges that psychosocial factors play an etiologic role in psychiatric symptoms and disorders, but there is generally some functional overlap between “neurologic” and “psychiatric” symptoms and disorders. These categories were evaluated separately, rather than as a general “neuropsychiatric” category, because of the specific charge in the Statement of Task. In that vein, some studies reported specific ICD-9-CM (International Classification of Diseases, 9th Revision, Clinical Modification) diagnoses (e.g., Anxiety Disorders 300.0X, 300.2X, 300.3X) or broad categories of ICD-9-CM disorders (e.g., Mental Disorders 290-319), diagnosed by clinicians and coded in medical records. Outcomes in other studies were self-reported diagnoses or symptoms of constructs such as “anxiety,” “depression,” or “dizziness” that were not necessarily based on standardized self-report measures of symptoms. In studies that categorized and reported symptoms as “neuropsychiatric,” the outcomes were separated into psychiatric or neurologic categories of disorders to the extent possible. Central and peripheral nervous system symptoms and disorders such as headaches, confusion, dizziness, vertigo, convulsions, and cognitive impairment were designated as neurologic symptoms. Symptoms, disorders, and diagnoses of depression, anxiety, posttraumatic stress disorder (PTSD), psychosis, and insomnia were considered to be psychiatric outcomes.
Those epidemiologic studies that measured nonspecific outcomes, such as biologic markers of effect (e.g., changes in pathophysiology, cell signaling, or hormone levels and blood counts), are considered but are given less weight than studies of a recognized condition or disease because of the uncertainty of their relevance to persistent adverse events. Several of the included studies assessed multiple outcomes, whereas others focused on a specific system (e.g., cardiovascular outcomes) or event (e.g., methemoglobin levels).
The human population studies that have been conducted on the persistent adverse effects of antimalarial drugs are quite diverse in both their methods and their quality. To assess their contribution to the overall weight of evidence concerning a given drug and health outcome, it is essential to consider the quality of the particular methods used to investigate the association because there is substantial unevenness in the rigor and informativeness of the specific studies. While there are textbooks that give general guidelines for epidemiologic study methods and randomized trials (Friedman, 2015; Gordis, 2004; Rothman et al., 2012), including those that address the interpretation of findings specifically (Savitz and Wellenius, 2016), the committee did not review these concepts in general but rather as they applied specifically to the question at hand, that is, the persistent or latent adverse events of antimalarial drugs.
In bringing in methodologic principles to appropriately weigh the evidence, the committee’s intention was to do so objectively, based solely on the quality of the methods and not on the nature or implications of the findings. Some studies that met the inclusion criteria and are summarized in the following chapters had a rather high level of credibility based on the quality of the work, whereas others were virtually non-contributory based on their methods, and the committee provides the rationale by which such judgments were made. The committee sought to be as transparent as possible in indicating the underlying bases for its judgments. Before considering what substantive conclusions were justified based on the research for a given antimalarial drug and health outcome, the committee considered the overall quality of the body of available research.
In addition to the quality of individual studies, it is important to consider the number of such studies, which also tends to be quite limited, especially for certain antimalarial drugs. The need for replication is quite clear, and the evidence base should ideally consist of many studies with varying strengths and limitations to identify a pattern that can be discerned in a series of imperfect studies. To supplement the information provided by epidemiologic studies, the committee drew on knowledge of the biologic underpinnings of the phenomenon of interest, evaluating the degree to which the association of a specific drug and a specific adverse event is plausible based on the known biologic pathways by which such an impact could occur. This is another aspect of the search for convergent evidence, in this case not just across studies but across disciplines.
Given the small volume of distinct types of studies of markedly varying quality, the committee chose to summarize them by discussing each of the pertinent studies and integrating that assessment without formal weighting by quality or precision. Given how heterogeneous they are, the studies did not lend themselves to pooling, and there were too few of them for more formal methods of assessing the quality of information. Instead, for each study that met the inclusion criteria, the study methods are described, the implications of those methods on the results and inferences that can be made are discussed, and an assessment is presented of the contribution that the study makes individually and in the aggregate to the evidence base. The committee recognizes the challenges in traditional hypothesis testing and over-reliance on “statistically significant” p values that rely on arbitrary cutoffs. Throughout the report the findings and results of studies are reported as they appear in the published papers, but in drawing conclusions the committee weighted consistency of direction of associations over specific statistically significant findings, and the body of evidence was considered as a whole. In its examination and assessment of the available evidence, the committee was looking for signals of associations, and it endeavored to be sensitive rather than specific, so that even isolated findings that may well reflect random error from making multiple comparisons, or those that have not been corroborated, are reported. Ultimately, replications of results were considered indications of stronger evidence for an association, and the committee took them into account in its weighing; however, given the rather limited literature, some of these indications may not be confirmed with further research.
The committee notes that although most of the studies reported the results of two-sided tests, which formally assess only whether there is a difference between two groups (which could be in either direction), for simplicity and readability the committee generally discusses the results as “increased” or “decreased” based on the magnitude and precision of the point estimate; in doing so it does not mean to imply that a formal one-sided hypothesis test was done (which was never or rarely the case).
Randomized controlled studies are considered the “gold standard” for evaluating the efficacy of drugs and other therapeutic interventions. With few exceptions, FDA requires this type of evidence demonstrating both efficacy and safety before it approves a new drug for licensure. Typically licensing a new drug requires randomized controlled clinical trials in which there is a comparison with placebo. Such trials are often limited to healthy populations, may be too small to detect uncommon adverse events, and may be too short to detect delayed adverse events. In addition, clinical trials enroll volunteers who are often healthier than the populations that will eventually be exposed to the drugs; however, this selection may help to enhance generalizability to the population of interest, since military populations are also composed of selectively healthy individuals. Clinical trials also often exclude individuals with specific comorbidities or other exposures that could affect the response to the drug. Thus, large observational studies are important complements to trials, especially when assessing drug safety.
Most drug approvals require trials with placebo comparators and the masking of exposures to ensure unbiased reporting and an accurate assessment of symptoms specific to the drug compared with no drug. However, important adverse events may be missed in such placebo-controlled trials for a variety of reasons, including the presence of symptoms that are uncommon, that are more likely in volunteers excluded from participating (e.g., those with a history of mental illness), or that were not specifically assessed (such as many neuropsychiatric symptoms). When there is a specific indication for a drug, as exists for malaria prophylaxis, patients and prescribers find it useful to make comparisons between alternative drugs to help make the best choice of agent for individual patients. Observational studies have the advantage of using “real-world” populations, and often include larger numbers of exposed persons than clinical trials, but most lack a comparable non-exposed group. Observational studies of adverse events to a drug often compare users of one drug to those of another drug used for the same indication to help control for factors associated with receiving care for the specific indication and for being prescribed or filling a prescription for that indication. As such, the comparison is limited to relative rather than absolute risks of adverse events. The committee did not prioritize one type of exposure comparison over another (i.e., placebo versus another drug); instead, in its assessment, the committee used comparison groups as one factor to identify studies that were methodologically strong. The synthesis was based on the strength of the evidence including consistency between studies.
Although observational studies (cohort and case–control studies, among others) have the advantage of evaluating people who are using the drug of interest in real-world settings, a major challenge is identifying an appropriate comparison group. Ideally, the comparison group should consist of individuals who are similar to those taking the drug in both their eligibility to take the drug of interest and in their baseline risk of developing the outcomes of interest. To assess this, it is important to have information about both groups so that the baseline characteristics can be compared and important differences can be controlled for when assessing adverse events following exposure. This is a challenge since some factors associated with developing adverse events are unknown or known but not ascertained and, if they are distributed differently in exposed and comparison groups, can result in biased estimates of association.
Observational studies are also at risk for channeling bias. Channeling bias can occur when different drugs with similar indications are prescribed to individuals with different risks for potential adverse outcomes (independent of the drug). For example, those with a personal or family history of mental illness may avoid or not be prescribed mefloquine, and those who want to avoid gastrointestinal distress may avoid or not be prescribed doxycycline. There are analytic methods to help address such imbalances, but the reasons why people receive a specific drug are not always documented and may be difficult to account for.
Case reports and case series provide valuable information about the possibility of an adverse outcome due to a drug, but they rarely suffice to prove a causal association. Case reports may also be helpful in defining a new syndrome (e.g., eosinophilia–myalgia syndrome and AIDS) (Vandenbroucke, 1999). Developing a specific case definition based on case reports may help investigators design studies that can address the specific drug–disease associations of interest. However, it is important to note that serious adverse events can also occur by chance following the introduction of any new drug or vaccine. A temporal relationship between exposure and outcome is necessary for making a causal inference, but given the lack of comparison to individuals without exposure to the drug, it is not sufficient. FDA may require drug labeling changes to include information from case reports if the outcomes reported are serious or if they are frequently reported following that drug exposure. However, further evidence, such as from randomized trials or rigorous non-experimental studies with carefully selected comparison groups, is usually needed to determine whether the drug is causally associated with a higher risk of experiencing the adverse event.
Thus, there are a number of potentially informative research strategies, such as large randomized trials with sufficiently long-term follow-up or observational studies that have comparison groups that are not strongly affected by bias or other insurmountable sources of likely confounding, with case reports supporting the findings of more rigorous designs.
In addition to the systematic biases and errors that may arise, random error and uncertainty in estimates are also important considerations. Data are rarely available on all of the possible people and outcomes for a given population, so statistical approaches are used to appropriately represent that uncertainty. The statistical power of a particular study is also an important consideration, especially when examining (sometimes rare) adverse events. Formally, statistical power refers to the probability that a particular statistical test (e.g., a test comparing outcomes between treatment and control groups in a randomized trial) will “reject” the null hypothesis (e.g., that there is no treatment effect) if in fact a specific alternative hypothesis (e.g., that there is an effect) is true. In lay terms, the statistical power refers to the ability of a study to detect a “true” effect when such an effect exists. A particularly relevant concern for the studies examined in this report is that if the statistical power is not sufficiently high, an apparent lack of association between some exposure (such as the use of a particular antimalarial drug) and an outcome could be the result of a sample size that is not large enough to allow the detection of an effect. Books such as Cohen (1988) and Kraemer and Blasey (2015) provide additional details on power analysis calculations.
Statistical power depends on many things, including the study design, the statistical analysis conducted, and how common the outcome of interest is. This is of particular relevance (and concern) when trying to study adverse events, which are often rare. As noted above, randomized controlled trials are considered the gold standard for internal validity due to their ability to provide unbiased effect estimates for the sample at hand. Many of the strongest study designs found in the reviewed literature involved the randomization of antimalarial drugs. However, those studies are generally designed—and powered—to provide sufficient sample size to detect a difference in efficacy of the drugs, which means that many do not have sufficient statistical power to detect rare safety-related outcomes related to taking the drug.
For example, consider a situation in which a malaria infection rate is 200 out of every 1,000 people (20%) and an antimalarial drug reduces risk of malaria by 50% (so that the resultant infection rate is 10%). A study that enrolls 200 individuals and randomly assigns each to receive the antimalarial drug or placebo would have about 80% power to detect that effect. However, if the outcome of interest was a rare adverse event, such as one experienced by only 1 in every 10,000 people taking the antimalarial (versus 1 in 100,000 people not exposed to the drug), the study would need to enroll approximately 200,000 people in order to have 80% power to detect that difference in outcome rates. (Note, too, that rare outcomes—such as one occurring in just 1 of every 100,000 people—may be particularly uncommon in the samples enrolled in typical randomized trials to establish efficacy, as those individuals are often healthier than the general population.) Thus, even randomized trials that are sufficiently powered to detect their primary outcomes of interest may have limited power to detect differences in rare adverse events unless that was part of the original design of the study, with large numbers of individuals randomized. This also implies that for studying rare adverse events, large non-experimental studies may be more useful in terms of statistical power, although confounding and other biases then become a concern.
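The sample-size reasoning in the example above can be sketched with a short calculation. This is an illustration only, under stated assumptions: the report does not say which test or approximation underlies its figures, so the sketch assumes a two-sided test at the 0.05 level, two equal-sized arms, and Cohen's arcsine effect size with a normal approximation. The function names are hypothetical.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def effect_size_h(p1, p2):
    """Cohen's arcsine effect size for the difference between two proportions."""
    return abs(2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2)))


def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test with n_per_arm people in each group."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # The (negligible) probability of rejecting in the wrong direction is ignored.
    return _STD_NORMAL.cdf(effect_size_h(p1, p2) * sqrt(n_per_arm / 2) - z_crit)


def n_per_arm_needed(p1, p2, power=0.80, alpha=0.05):
    """Approximate number of people needed in each group to reach the target power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size_h(p1, p2)) ** 2)


# Efficacy scenario from the text: infection risk of 20% versus 10%.
efficacy_n = n_per_arm_needed(0.20, 0.10)
# Rare adverse event scenario from the text: 1 in 100,000 versus 1 in 10,000.
rare_event_n = n_per_arm_needed(1 / 100_000, 1 / 10_000)

print("Efficacy outcome, people per arm:", efficacy_n)
print("Rare adverse event, people per arm:", rare_event_n)
print("Power for the rare event with 200 per arm:",
      round(power_two_proportions(1 / 100_000, 1 / 10_000, 200), 3))
```

Under these assumptions the efficacy comparison needs roughly 200 people per arm, whereas the rare adverse event needs on the order of 100,000 people in total, the same order of magnitude as the figures quoted above. Exact numbers vary with the test and the approximation chosen.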
Given the impact of power considerations, it was critical for the committee to distinguish between studies that were small and did not detect differences in adverse events between treatment arms and studies that appeared to have had sufficient power to detect differences in outcomes if such differences did exist. In other words, a lack of observed association does not necessarily imply a lack of true association, especially if the studies were small and not designed to examine the outcomes under consideration.
In summary, in evaluating the weight and quality of evidence, especially when null findings are reported, it is important to consider whether a study was sufficiently powered to detect the associations of interest. While the statistical power is a function of multiple features of the study, notably study size and the frequency of the outcome of interest, studies that have sufficient power to detect only very large effects (e.g., relative risks greater than 3) are of limited value, given that relative risks of smaller magnitude may have important implications.
Whenever individuals’ exposures to medications are measured in a study, there is the possibility of misclassification. To illustrate, people who experience an adverse health event may provide a more complete report of their current and past exposures to medicines. Similarly, people who receive a particular antimalarial believed to be associated with specific adverse events may be more likely to seek medical care for a given condition. There may also be important differences in the completeness and accuracy of the exposure data between various sources of information. Using only pharmacy claims or only dispensing records for determining exposure to a drug used to prevent a disease may lead to an overestimation of people’s exposure to a given drug, particularly if there is reason to believe that the drug is associated with acute adverse events. Moreover, prescription and dispensing data are not surrogates for actual use or adherence to the approved regimen. These are examples of differential misclassification of exposure that can lead to an overestimation or an underestimation of effects. Misclassification can also be nondifferential, as would be the case when the degree of misclassification is similar for all exposure groups and outcomes. An example would be a situation in which all study participants have similar difficulties completing questionnaires or remembering past exposures. Nondifferential misclassification of exposure tends to bias the study results toward the null (i.e., attenuating the strength of an association between a drug and outcome). Obtaining data from more than one source or verifying data by examining pre-existing records (e.g., medical records or pharmacy records) may help to reduce the misclassification of exposures.
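The attenuation produced by nondifferential exposure misclassification can be illustrated with expected counts. The sketch below is hypothetical: the cohort sizes, risks, sensitivity, and specificity are invented for illustration and are not drawn from any study in this report. It assumes a binary exposure and classification errors that are unrelated to the outcome.

```python
def observed_risk_ratio(n_exposed, n_unexposed, risk_exposed, risk_unexposed,
                        sensitivity, specificity):
    """Expected risk ratio when exposure is measured with error that does not
    depend on the outcome (nondifferential misclassification)."""
    # People classified as exposed: correctly classified exposed people plus
    # unexposed people wrongly recorded as exposed.
    cls_exp_n = sensitivity * n_exposed + (1 - specificity) * n_unexposed
    cls_exp_cases = (sensitivity * n_exposed * risk_exposed
                     + (1 - specificity) * n_unexposed * risk_unexposed)
    # People classified as unexposed: missed exposed people plus correctly
    # classified unexposed people.
    cls_unexp_n = (1 - sensitivity) * n_exposed + specificity * n_unexposed
    cls_unexp_cases = ((1 - sensitivity) * n_exposed * risk_exposed
                       + specificity * n_unexposed * risk_unexposed)
    return (cls_exp_cases / cls_exp_n) / (cls_unexp_cases / cls_unexp_n)


# Hypothetical cohort: 10,000 exposed (4% risk), 10,000 unexposed (2% risk).
true_rr = observed_risk_ratio(10_000, 10_000, 0.04, 0.02,
                              sensitivity=1.0, specificity=1.0)
biased_rr = observed_risk_ratio(10_000, 10_000, 0.04, 0.02,
                                sensitivity=0.8, specificity=0.9)

print("Risk ratio with perfect exposure data:", round(true_rr, 2))
print("Risk ratio with misclassified exposure:", round(biased_rr, 2))
```

With these invented inputs the true risk ratio of 2.0 shrinks to about 1.6, moving toward the null value of 1 without reaching it.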
If studies of antimalarial drugs are to make meaningful contributions, there should be either documentation of drug prescriptions with a high likelihood—if not certainty—of adherence or else self-report based on carefully designed questionnaires. Even these methods are fallible, but in most cases they provide sufficient quality to be considered contributory evidence.
Outcome misclassification occurs when individuals are placed into an incorrect category with respect to the outcome of interest. If the misclassification occurs differently for people with and without exposure to a drug, it is said to be differential misclassification, which may lead to an association between exposure to a drug of interest and an outcome being either exaggerated or underestimated. In nondifferential outcome misclassification, the misclassification is not related to exposure status (i.e., the use of a specific drug), and the effect estimates tend to underestimate the true effect. The outcomes reported in the epidemiologic literature for the antimalarial drugs of interest generally fall into six categories: neurologic, psychiatric, gastrointestinal, ocular, cardiovascular, and other (depending on the drug, this category may include such things as dermatologic or biochemical outcomes). The assessment and diagnosis of conditions in each of these categories is dependent on different criteria, measures, and tests, some more objective than others. For example, whereas electrocardiograms are tests based on objective biologic indicators that can be used to diagnose certain cardiovascular conditions, structured clinical interviews are needed to diagnose psychiatric conditions.
In part because some of these health outcomes do not have biologically based diagnostic tests, such as mental health diagnoses and symptoms and some neurologic symptoms such as cognitive impairment (e.g., problems with memory, attention, or concentration) and headaches, the committee discussed the strength and validity of these outcomes as reported in the included studies. PTSD, an outcome specified in the committee’s Statement of Task, is a challenging condition to assess and report in epidemiologic studies. Clinically recorded diagnoses should be based on criteria from the Diagnostic and Statistical Manual of Mental Disorders (DSM) or the ICD, which require that a diagnosis be made only when trauma exposure is reported and the symptoms are related to a specific trauma. Self-reported diagnoses or symptom measurements do not usually have this requirement, making self-reported symptoms a less reliable measurement of PTSD. Generally, studies based on self-report measures fail to specifically connect PTSD symptoms to a specific traumatic event, as required by the DSM diagnostic formulation: Criterion A requires exposure to an event that was life-threatening or violent. Each of the subsequent symptom clusters (i.e., intrusion, avoidance, cognitive or emotional disturbance, or hyperarousal) must be experienced in relation to the traumatic event, and an exclusionary criterion is that the symptoms may not be due to medication. Because many studies do not link symptoms to an identified traumatic event, it is often difficult, if not impossible, to ascertain whether symptoms that are reported in the evaluated literature are the result of a medication-related experience, some other trauma, both, or neither, which lends uncertainty to the meaning of these outcomes when associations are found in populations of interest.
Furthermore, because these symptoms and diagnoses are not linked directly to the experience of a specific traumatic event, it is unclear whether these symptoms or diagnoses are experienced in a timeframe that would make them likely to be related to the use of a particular medication.
An association between drug administration and other psychiatric outcomes, such as depression, suicidality, or psychotic experiences (e.g., hallucinations, delusions), is even harder to establish, for several different reasons. First, in the population of most relevance, service members, the age of exposure to antimalarials overlaps with the age of onset of many of the psychiatric symptoms of interest. Depression and symptoms of psychosis develop within the age window of the young adult population who are recruited to the military. The onset of psychiatric symptoms may be coincident with exposure to medication, but a causal relationship would be difficult to establish. Second, military-related confounders introduce powerful effects on the adverse health outcomes of interest (see Confounding section below). Additionally, the lack of understanding of the biologic mechanisms of risk and resilience in these psychiatric experiences presents multiple challenges to establishing causal relationships between most risk factors and psychiatric outcomes. Furthermore, because many of these psychiatric symptoms have variable courses, from presenting and remitting quickly to multiple episodes of relapse and remission to consistent persistence, it is unclear how any intervening risk factor would affect the natural course of these symptoms.
The committee defined “persistent” outcomes as those present at least 28 days following cessation of a drug, which is appropriate for the case of PTSD, as PTSD is not diagnosed until at least 1 month following a Criterion A traumatic event. However, if the symptoms of PTSD are assessed years after cessation of a drug, yet they are reported in the absence of a direct connection to the experience of taking the drug or any other traumatic event, it is difficult to determine the etiology of those symptoms. For other conditions reported in the literature, onset may be acute, but the condition may persist for more than 28 days post-drug-cessation and may not resolve without treatment. This would pertain, for example, to certain ophthalmic conditions, such as cataract.
The committee recognizes that it is difficult to achieve an optimal assessment of neuropsychiatric endpoints in this literature. Psychiatric and neurologic symptoms should be assessed and documented by a trained assessor, using structured and psychometrically sound assessment tools. For example, an optimal method is to include lifetime psychiatric diagnoses using structured clinical interviews based on DSM-5 criteria (e.g., Structured Clinical Interview for DSM-5, or SCID), administered by a trained assessor, with special attention to and documentation of symptom onset and remission and their relationship to medication exposure. A SCID would make it possible to connect the PTSD symptoms to a particular potentially traumatic event. Because previous diagnoses of PTSD significantly raise the risk for subsequent diagnoses, determining the lifetime diagnoses that occurred prior to medication exposure, rather than just the current diagnoses, would allow for a more reliable control for this variable.
An important aspect of individual studies that must be considered when evaluating the quality of their methods is the attention paid to the potential for the results to reflect confounding bias rather than a true association. Part of an assessment of the potential for confounding is to examine any steps taken by the investigators to mitigate the impact of potential confounders. Confounding could occur, for example, if the use of antimalarial drugs for prophylaxis is associated with personal or situational attributes that may also predict the adverse outcome under study. These personal or situational attributes are said to “confound” the association between the antimalarial drug use and the adverse outcome of interest. For example, a history of psychiatric problems is a contraindication to the use of some of the antimalarials of interest. This personal characteristic (the presence or absence of psychiatric problems) is also likely to be a predictor of future adverse psychiatric outcomes. If the investigators do not take this into account, then the results of the study may suggest that individuals taking a particular antimalarial are less likely to develop adverse psychiatric outcomes than a comparison group whose members have not taken the drug because those with a history of psychiatric illness will have been excluded from the antimalarial group but not from the comparison group. Furthermore, as contraindications are introduced over time, studies will differ in their susceptibility to this bias in relation to the altered prescribing practices. This example highlights the importance of careful consideration of the comparison group, as discussed above. If, in this example, individuals with a history of psychiatric problems were excluded from the comparison group, then the potential for confounding by a history of psychiatric problems would be removed.
Another illustration relates specifically to use of antimalarials in the military. Service-related characteristics may act as confounders when assessing the association between antimalarial use and psychiatric outcomes. Specifically, a confounding factor could be whether individuals were deployed or assigned to duties outside of the United States. The stressors associated with living and working outside of the country may themselves increase risk for adverse health outcomes, especially psychiatric outcomes. Exposure to combat areas is also likely to increase the risk of negative health outcomes. Service members most likely to be prescribed antimalarials are those who are assigned to duty outside of the United States, and possibly even in combat areas, and these confounders can exert strong effects on the risk for negative health outcomes before considering antimalarial exposure. The potential for confounding in this hypothetical example could be addressed by adjusting for deployment location and combat exposure in the statistical analysis. As noted with regard to study design, one way in which studies can limit confounding is to choose a comparison group whose members have levels of strong influences on the outcome, such as contraindications (e.g., psychiatric history), combat exposure, or selection for favorable health status, that are roughly similar to those of the group taking the drug. It is also possible to control for confounding to some extent by measuring the characteristics that may differ between the exposed and comparison groups and making statistical adjustments.
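The deployment example above can be made concrete with a small numerical sketch. The counts below are entirely hypothetical and were chosen so that the drug has no effect within either stratum. The Mantel-Haenszel summary risk ratio is used here as one common form of statistical adjustment; it is an illustrative choice, not a method attributed to any study reviewed in this report, and the names `Stratum`, `crude_risk_ratio`, and `mantel_haenszel_risk_ratio` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stratum:
    """Counts for one level of a potential confounder (hypothetical data)."""
    name: str
    exposed_cases: int
    exposed_total: int
    unexposed_cases: int
    unexposed_total: int


def crude_risk_ratio(strata):
    """Risk ratio after pooling all strata, ignoring the confounder."""
    a = sum(s.exposed_cases for s in strata)
    n1 = sum(s.exposed_total for s in strata)
    c = sum(s.unexposed_cases for s in strata)
    n0 = sum(s.unexposed_total for s in strata)
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio, adjusted for the stratifying variable."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n = s.exposed_total + s.unexposed_total
        numerator += s.exposed_cases * s.unexposed_total / n
        denominator += s.unexposed_cases * s.exposed_total / n
    return numerator / denominator


# Hypothetical cohort: antimalarial users are concentrated in combat deployments,
# and combat raises the outcome risk (20% vs. 5%) whether or not the drug is taken.
strata = [
    Stratum("combat deployment", 160, 800, 40, 200),
    Stratum("no combat deployment", 10, 200, 40, 800),
]

crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude RR = {crude:.3f}, combat-adjusted RR = {adjusted:.3f}")
```

In this constructed example the crude risk ratio is about 2.1, while the adjusted risk ratio is 1.0, so the apparent association is produced entirely by the uneven distribution of combat exposure between drug users and nonusers.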
Effect modification, stemming from the potential presence of variables (known or unknown to the researcher) affecting the association between an exposure (e.g., drug) and an outcome (e.g., PTSD), is highly prevalent in epidemiologic studies. Effect modification occurs when an exposure has different effects among different subgroups or levels of the effect modifier. Consequently, the magnitude of the association may vary across studies, based on the level or presence of such variables. A common solution to addressing effect modification is to examine the association separately for each level of a third variable (e.g., the level of education of the subjects). While helpful, this solution depends on whether data on such variables were collected (e.g., genetic markers are rarely examined in epidemiologic studies) and on whether the statistical power (i.e., the number of subjects at each level) is sufficient for such an examination. Such factors as a previous history of malaria treatment, mental health problems, exposure to concurrent drugs, adherence to drug dosing and schedule, and previous concurrent stressors may contribute to effect modification.
At present, there is not sufficiently compelling information to make the consideration of effect modifiers essential to having a meaningful study (i.e., it has not been established with any certainty that subgroups in the population are more or less vulnerable to any persistent adverse effects associated with antimalarial use). Where information on effect modification is provided, the results may suggest considering that possibility and therefore be of some value.
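The stratified examination described above can be sketched as follows. The counts are hypothetical and the stratifying variable (a prior mental health history) is used only for illustration; the function name `stratum_risk_ratios` is likewise illustrative. The sketch reports the number of cases in each stratum alongside the risk ratio because, as noted above, sparse strata limit what such an analysis can show.

```python
def stratum_risk_ratios(strata):
    """Return {stratum name: (risk ratio, total cases)} for each level of a third variable.

    Each stratum is a tuple:
    (name, exposed_cases, exposed_total, unexposed_cases, unexposed_total).
    """
    results = {}
    for name, a, n1, c, n0 in strata:
        # (a / n1) / (c / n0), written as a single ratio of products.
        risk_ratio = (a * n0) / (c * n1)
        results[name] = (risk_ratio, a + c)
    return results


# Hypothetical counts in which the drug-outcome association differs by subgroup.
hypothetical_strata = [
    ("prior mental health history", 30, 200, 10, 200),
    ("no prior mental health history", 20, 400, 20, 400),
]

by_stratum = stratum_risk_ratios(hypothetical_strata)
for name, (risk_ratio, cases) in by_stratum.items():
    print(f"{name}: RR = {risk_ratio:.2f} (based on {cases} cases)")
```

Here the risk ratio is 3.0 in one subgroup and 1.0 in the other, which is the pattern that effect modification refers to. A single pooled estimate would describe neither subgroup well.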
In assessing biologic plausibility—defined by the committee as the existence of mechanisms observed in studies of experimental animals, cell cultures, or pathophysiology assessments that could account for the various adverse events observed in humans using the various antimalarial drugs of interest for prophylaxis—the committee required that published articles include objective tests of the impact of these drugs on endpoints relevant to potential pathologic processes. Outcomes were not limited to any specific organ or system, and reviewed studies included the exposure of experimental animals, cell lines, and, in some cases, human tissue or blood samples to antimalarial drugs. In assessing biologic plausibility for a particular outcome, the number of papers describing the same mechanistic endpoints associated with drug exposure was considered as an indicator of the validity of findings. Although various biochemical and pathologic endpoints and outcomes were considered (and they are discussed in the individual antimalarial drug chapters as appropriate), the committee also notes any limitations of these types of studies with regard to applicability to prophylaxis, how analogous the models and time courses of observation used are to humans, and how closely the drug dosing and concentrations correspond to those experienced in humans using these drugs for malaria prophylaxis.
Types of Populations Considered
The studies evaluated for this report were conducted in different populations. Although U.S. service members and veterans are the target population of interest, studies of other populations were also considered as contributing to the evidence base for associations between antimalarial use and persistent adverse events.
Military and Veteran Populations
Because people who are currently serving or who have served in the U.S. military are the target population of the charge to this committee, studies of these populations were accorded considerable weight in the committee’s deliberations and are presented first in the summaries of the identified literature for each drug. The committee reviewed all identified studies of U.S. and international service members and veterans that used any of the antimalarials of interest. In general, few studies included objective measures of drug chemical concentrations in the blood or tissue; those that are available were performed in small studies, usually to examine the pharmacokinetic and pharmacodynamic properties of a drug. Instead, information on the use of a particular antimalarial and its dosage for prophylaxis is based on self-report or, when drug administration is part of the study design, on observation by researchers or clinicians. Often, full adherence to the drug regimen is assumed in estimating and quantifying the risk of specific adverse events and health outcomes related to the use of a particular drug, even though many studies have shown that individuals often fail to fully adhere to the regimen, especially when the drug is to be taken for long periods of time, introducing the potential for misclassification bias (Brisson and Brisson, 2012; Cunningham et al., 2014; Landman et al., 2014; Saunders et al., 2015). Consistent with other studies of health outcomes in military populations, when there are no actual measures of exposure to a specific chemical or group of toxicants, comparisons between deployed and nondeployed veterans are considered to be the next most relevant comparison. Since sending troops to known malaria-endemic areas without prevention measures when they are available would be unethical, several studies of military populations compare the effects of two or more different antimalarials. Because of the many other factors and stresses associated with deployed environments, including combat, specific effects attributable to the use of an antimalarial drug may be difficult to tease out.
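The consequence of assuming full adherence can be illustrated with a short expected-value calculation. This is a minimal sketch under simplifying assumptions that are not drawn from any reviewed study: everyone classified as exposed was prescribed the drug, only a fraction actually took it, non-adherent individuals have the same risk as the unexposed comparison group, and adherence is unrelated to the outcome. The risks and the function name `observed_risk_ratio` are hypothetical.

```python
def observed_risk_ratio(risk_if_taking_drug, risk_if_not_taking_drug, adherence):
    """Expected risk ratio when 'exposed' means prescribed rather than actually taking.

    adherence is the fraction (0 to 1) of the prescribed group that took the drug.
    """
    if not 0.0 <= adherence <= 1.0:
        raise ValueError("adherence must be between 0 and 1")
    # The prescribed group is a mixture of true users and non-users.
    risk_in_prescribed_group = (
        adherence * risk_if_taking_drug
        + (1.0 - adherence) * risk_if_not_taking_drug
    )
    return risk_in_prescribed_group / risk_if_not_taking_drug


# Hypothetical risks: 10% among true users, 5% among non-users (true RR = 2.0).
full_adherence = observed_risk_ratio(0.10, 0.05, adherence=1.0)
partial_adherence = observed_risk_ratio(0.10, 0.05, adherence=0.6)
print(f"RR with full adherence: {full_adherence:.2f}")
print(f"RR with 60% adherence:  {partial_adherence:.2f}")
```

Under these assumptions a true risk ratio of 2.0 is observed as 1.6 when 60 percent of the prescribed group adheres, so this form of exposure misclassification pulls the estimate toward no association.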
Human Studies Among Non-Military or Veteran Populations
Although U.S. service members and veterans constitute the source population of interest, the committee has taken into account the potential for obtaining a more precise quantification and evaluation of the risks of adverse events and health outcomes associated with the antimalarial drugs of interest in better characterized cohorts. Such cohorts include occupationally exposed workers (such as Peace Corps volunteers and Department of State officials), travelers and expats, research volunteers, people with adverse events reported to national or manufacturer registries, and people living in malaria-endemic areas. These populations use antimalarial drugs but do not have some of the same potentially confounding stressors such as combat. Studies of short-term travelers who were followed for at least 28 days post-drug-cessation and of long-term travelers and expats who visited or moved to malaria-endemic areas and used antimalarial drugs for prophylaxis provide additional evidence of health outcomes following exposure to the antimalarial drugs of interest that can supplement the studies of service members and veterans. In addition, safety and tolerance studies performed in healthy residents of non-endemic areas who were followed for at least 28 days post-drug-cessation were reviewed. Finally, studies of adverse events associated with the prophylactic use of a drug in a population with a specific underlying condition (such as pregnancy or comorbid conditions) or demographic trait are described as appropriate.
Animal and Mechanistic Studies
The committee used animal and mechanistic studies to determine whether there is evidence of a pathophysiologic process or biologic mechanism that could provide reasonable evidence to support a relationship between exposure to an antimalarial drug and a persistent health effect, as seen in studies of humans using the antimalarial drugs for prophylaxis. A positive statistical association between an exposure and an outcome does not necessarily mean that the exposure is the cause of that outcome. Data from toxicology studies may support or conflict with a hypothesis that a specific drug or chemical can contribute to the occurrence of a particular condition or disease. Insights about biologic processes inform whether an observed pattern of statistical association might be interpreted as the product of more than error, bias, confounding, or chance. Discussions on biologic plausibility are presented after the evidence in humans is presented and before the synthesis of all the evidence. The degree of biologic plausibility itself influences whether the committee perceives positive findings in human studies to be indicative of a pattern or the product of bias or chance statistical associations. Ultimately, the results of the toxicology studies should be consistent with what is known about the human disease process if they are to support a conclusion that the development or persistence of a condition or disease was influenced by an exposure.
Studies of laboratory animals and other systems (such as studies using cell lines or in vitro human or other mammalian cell cultures) are essential to understanding possible health effects when experimental research in humans is not ethically or practically possible (NRC, 1991). These types of studies form the basis for much of what is known about the mechanisms behind the recognized biologic actions and effects of the drugs of interest. Studies in animal models can be used to characterize absorption, distribution, metabolism, elimination, and excretion of chemicals, and they may examine short-term or long-term exposures. Such studies permit a potentially toxic agent to be introduced under controlled conditions (with respect to dose, duration, and route of exposure) to probe the agent’s physiologic and psychologic effects on various body systems and potentially to identify the mechanisms by which the effects are produced.
To be considered an acceptable surrogate for the study of human physiology, an animal model must reproduce, with some degree of fidelity, the physiologic manifestations observed in humans. While most drug actions are similar across mammals, a given effect of an exposure in one animal species does not necessarily establish its occurrence in humans, nor does the apparent absence of a particular effect in a model animal mean that the effect could not occur in humans. But while animal models are not always ideal replicates of human conditions, there are enough similarities between human and animal responses to many toxicants that animal models can be used to examine mechanism-of-action hypotheses. There are numerous examples of the effective use of animal models to predict drug toxicity and efficacy and ample evidence that critical physiologic and psychological processes are conserved across mammalian evolution (Olson et al., 2000; Uhl and Warner, 2015). Animal studies are a valuable complement to human studies of genetic susceptibility or other biomarkers, and they can facilitate the study of chemical mixtures and their potential interactions. The most commonly used experimental animal models for testing the potential toxicity of antimalarial drugs are mice, rats, dogs, and rhesus monkeys.
Although animal and cell-culture studies provide important information for understanding the biochemical and molecular mechanisms associated with the toxicity induced by drugs and chemicals, many factors may lead to differences between the results of controlled animal studies and the effects observed in humans. These factors, which must be considered when extrapolating their results to human disease and disease progression, include the magnitude and duration of exposure, namely to prophylaxis in humans; the timing of exposure during development or differentiation; the route of exposure (e.g., injections in model organisms versus oral administration in humans); model-specific factors (such as sex, genetic background, and stress); and differences in pharmacokinetics and pharmacodynamics across species as well as different formulations of the drug being administered (e.g., pure compounds versus additives in tablets and pills). Another challenge of using animal data to study the persistent effects of antimalarial drugs in humans is that certain symptoms, such as headache, nausea, and muscle and joint pain—which have been reported by some people who have used particular antimalarial drugs—are difficult to study with standard tests in animals (OTA, 1990).
In Vitro Studies
Defined broadly, in vitro studies are tests or assessments of toxicologic phenomena in tissue slices, isolated organs, isolated primary cell cultures, cell lines, and subcellular fractions such as those of mitochondria, microsomes, and even membranes (Srivastava et al., 2018). In vitro methods are routinely used because correlating the findings with in vivo studies can help in understanding a specific in vivo response in a given species. Studies that use in vitro methods may be informative, but such data must be viewed with caution regarding their relationship to the human experience because in vitro test systems are an extremely simplified form of very complex in vivo systems. In addition, in vitro systems generally lack the drug-metabolizing mechanisms that are present in the whole organism. Therefore, the ability to extrapolate in vitro data to in vivo results is limited.
To assess the assembled evidence, committee members first reviewed and discussed draft text on group calls and at in-person meetings until they reached a consensus on the description and assessment of the studies. Then, using all of the available information, the full committee came to a consensus regarding the conclusion and, based on the strength of the evidence, assigned a category of association (discussed below) between prophylactic use of an antimalarial of interest and persistent or latent health effects. The committee adopted a policy of giving the most evidentiary weight to inform its conclusions to peer-reviewed, published literature. Although the process of peer review by fellow professionals ensures high standards of quality, it does not guarantee the validity of a study or the generalizability of its results. Accordingly, committee members read each study critically and considered its relevance and quality.
When drafting language for a conclusion, the committee considered the timing and duration of the exposures, the nature of the specific adverse events or health outcomes, the populations exposed, and the quality, precision, and consistency of the evidence examined. The conclusion does not take into account any information regarding the benefit of the antimalarial to either population or individual health. Although both primary and supporting studies contributed to the committee’s conclusion regarding whether the evidence indicates that the prophylactic use of an antimalarial is associated with a particular health condition or outcome, primary studies were given more weight. The committee did not use a formulaic approach to determining the number of primary or supporting studies that would be necessary to assign a specific category of association. Rather, the committee’s review required a thoughtful and nuanced consideration of all the studies as well as expert judgment, as provided by the complement of expertise represented on the committee, and this could not be accomplished by adherence to a narrowly prescribed formula of what data would be required for each category of association or for a particular health outcome. The committee reviewed the data and made conclusions independently of other reports or author conclusions.
Categories of Association
A system of four categories of association to rate health outcomes according to the strength of the scientific evidence, which was adapted from those categories used by the International Agency for Research on Cancer, has gained wide acceptance by Congress, VA, researchers, and veterans groups and has been used in report series, including Veterans and Agent Orange (a 12-volume series) and Gulf War and Health (an 11-volume series), as well as several stand-alone reports on such topics as evaluations of vaccine safety and the adverse health outcomes of vaccines (IOM, 1991, 1994). The criteria for each of the four categories of association express a degree of confidence based on the extent to which bias and other sources of error could be reduced, and thus the quality of the evidence. The coherence of the full body of epidemiologic information, including supplemental evidence and biologic plausibility, was considered when the committee reached a judgment about association for a given outcome. As was the case with several committees that chose to use these categories of association, the Bradford Hill criteria for causality (Hill, 1965) were not applied as a checklist for strength-of-association assessments because those nine factors are not a definitive set of elements for assessing causality and they vary in the importance or weight that might be assigned to each. The committee discussed the evidence and reached consensus on the categorization of the evidence for persistent or latent health effects for each drug of interest, and these conclusions appear in the Synthesis and Conclusions section for each drug-specific chapter. If the evidence permitted, more specific conclusions were made regarding the use of an antimalarial and a particular outcome or group of outcomes.
Implicit in these categories is that “the absence of evidence is not evidence of absence.” That is, based on the currently available literature that met the committee’s criteria for inclusion, a lack of informative data does not mean that there is no increased risk of a specific adverse event, only that the available evidence does not support claims of an increased risk. As the adverse events generally fall into six categories (neurologic, psychiatric, gastrointestinal, eye, cardiovascular, and other disorders), a conclusion is made for each category as appropriate. The four categories of association and the criteria for each follow. Each conclusion consists of two parts: the first sentence provides the category of association, and the second sentence offers a conclusion regarding whether further research in a particular area is merited based on any signals from all the currently available evidence reviewed for that outcome (assessed epidemiologic studies that reported outcomes at least 28 days post-drug-cessation, studies of concurrent adverse events, case reports, data from selected subpopulations, FDA labels, and biologic plausibility). For those health outcomes for which the committee concluded there is not a clear justification for additional research, the intention was to distinguish those issues for which there is presently an empirical basis for looking more closely and those for which such a basis is not present. As more research accumulates, the outcomes that warrant further research may change.
Sufficient Evidence of an Association
For effects to be classified as having “sufficient evidence of an association,” a positive association between the prophylactic use of an antimalarial drug and the outcome must be observed in studies in which chance, bias, and confounding can be ruled out with reasonable confidence. For example, the committee might regard evidence from several small studies without known bias and confounding and that show an association that is consistent in magnitude and direction to be sufficient evidence of an association. Experimental data supporting the biologic plausibility of an association strengthen the likelihood of an association but are not a prerequisite and are not enough to establish an association without corresponding epidemiologic findings.
Limited or Suggestive Evidence of an Association
For health outcomes in the category of “limited or suggestive evidence of an association,” the evidence must suggest an association between the prophylactic use of an antimalarial drug of interest and the outcome in studies of humans, but the evidence can be limited by an inability to confidently rule out chance, bias, or confounding. Typically, at least one high-quality study indicates a positive association, but the results of other studies could be inconsistent. Because there are a number of agents of concern whose toxicity profiles are not expected to be uniform—specifically, the antimalarial drugs of interest—apparent inconsistencies can be expected among study populations that have experienced different exposures. Even for a single exposure, a spectrum of results would be expected, depending on the power of the studies, the inherent biologic relationships, and other study design factors.
Inadequate or Insufficient Evidence to Determine an Association
By default, any health outcome is placed in the category of “inadequate or insufficient evidence to determine an association” before enough reliable scientific data have accumulated to promote it to the category of sufficient evidence or limited or suggestive evidence of an association or to move it to the category of limited or suggestive evidence of no association. In this category, the available human studies may have inconsistent findings or be of insufficient quality, validity, consistency, or statistical power to support a conclusion regarding the presence of an association. Such studies might have failed to control for confounding factors or might have had inadequate assessment of exposure. Because the committee could not possibly address every rare condition or disease, it does not draw explicit conclusions about outcomes that are not discussed, and thus, this category is the default or starting point for any health outcome. If a condition or outcome is not addressed specifically, then it will be in this category.
Limited or Suggestive Evidence of No Association
The category of “limited or suggestive evidence of no association” was originally defined for health outcomes for which several adequate studies covering the “full range of human exposure” were consistent in showing no association or reduced risk (not distinguished for the purposes of this evaluation, which was focused on the potential for adverse effects) with an exposure of interest at any concentration, with the studies having relatively narrow confidence intervals. A conclusion of “no association” is inevitably limited to the conditions, exposures, and observation periods covered by the available studies, and the possibility of a small increase in risk related to the magnitude of exposure studied can never be excluded. However, a change in classification from inadequate or insufficient evidence of an association to limited or suggestive evidence of no association would require new studies that correct for the methodologic problems of previous studies and that have samples large enough to limit the possible study results attributable to chance.
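The role of sample size in producing "relatively narrow confidence intervals" can be illustrated with a brief calculation. The sketch below uses a Wald-type interval on the log risk ratio, a common large-sample approximation; this choice, the hypothetical counts, and the function name `risk_ratio_ci` are assumptions made for illustration and do not describe how any reviewed study computed its intervals.

```python
import math


def risk_ratio_ci(exposed_cases, exposed_total, unexposed_cases, unexposed_total, z=1.96):
    """Risk ratio with an approximate 95% Wald interval computed on the log scale."""
    risk_ratio = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    # Large-sample standard error of log(risk ratio).
    standard_error = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(risk_ratio) - z * standard_error)
    upper = math.exp(math.log(risk_ratio) + z * standard_error)
    return risk_ratio, lower, upper


# Two hypothetical studies with the same 1% risk in both groups (RR = 1.0);
# the second has ten times as many participants.
small_study = risk_ratio_ci(10, 1000, 10, 1000)
large_study = risk_ratio_ci(100, 10000, 100, 10000)

for label, (rr, lower, upper) in (("small", small_study), ("large", large_study)):
    print(f"{label} study: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Both hypothetical studies estimate a risk ratio of 1.0, but the smaller study's interval is compatible with more than a doubling of risk while the larger study's is not. This is why a finding of no association from a small study carries limited weight.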
Ackert, J., K. Mohamed, J. S. Slakter, S. El-Harazi, A. Berni, H. Gevorkyan, E. Hardaker, A. Hussaini, S. W. Jones, G. C. K. W. Koh, J. Patel, S. Rasmussen, D. S. Kelly, D. E. Baranano, J. T. Thompson, K. A. Warren, R. C. Sergott, J. Tonkyn, A. Wolstenholme, H. Coleman, A. Yuan, S. Duparc, and J. A. Green. 2019. Randomized placebo-controlled trial evaluating the ophthalmic safety of single-dose tafenoquine in healthy volunteers. Drug Saf 42(9):1103-1114.
Andersen, S. L., A. J. Oloo, D. M. Gordon, O. B. Ragama, G. M. Aleman, J. D. Berman, D. B. Tang, M. W. Dunne, and G. D. Shanks. 1998. Successful double-blinded, randomized, placebo-controlled field trial of azithromycin and doxycycline as prophylaxis for malaria in western Kenya. Clin Infect Dis 26(1):146-150.
Brisson, M., and P. Brisson. 2012. Compliance with antimalaria chemoprophylaxis in a combat zone. Am J Trop Med Hyg 86(4):587-590.
Cohen, J. 1988. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates.
Cunningham, J., J. Horsley, D. Patel, A. Tunbridge, and D. G. Lalloo. 2014. Compliance with long-term malaria prophylaxis in British expatriates. Travel Med Infect Dis 12(4):341-348.
DeSouza, J. M. 1983. Phase I clinical trial of mefloquine in Brazilian male subjects. Bull World Health Organ 61(5):809-814.
Eick-Cost, A. A., Z. Hu, P. Rohrbeck, and L. L. Clark. 2017. Neuropsychiatric outcomes after mefloquine exposure among U.S. military service members. Am J Trop Med Hyg 96(1):159-166.
Friedman, G. D. 2015. Primer of epidemiology. New York: McGraw-Hill.
Gagnier, J. J., G. Kienle, D. G. Altman, D. Moher, H. Sox, D. Riley, and the CARE Group. n.d. The CARE guidelines: Consensus-based clinical case reporting guideline development. Glob Advs Health Med 2(5):38-43.
Gordis, L. 2004. Epidemiology. Philadelphia, PA: Elsevier Saunders.
Green, J. A., A. K. Patel, B. R. Patel, A. Hussaini, E. J. Harrell, M. J. McDonald, N. Carter, K. Mohamed, S. Duparc, and A. K. Miller. 2014. Tafenoquine at therapeutic concentrations does not prolong Fridericia-corrected QT interval in healthy subjects. J Clin Pharmacol 54:995-1005.
Hill, A. B. 1965. The environment and disease: Association or causation? Proc R Soc Med 58:295-300.
IOM (Institute of Medicine). 1991. Adverse effects of pertussis and rubella vaccines. Washington, DC: National Academy Press.
IOM. 1994. Adverse events associated with childhood vaccines: Evidence bearing on causality. Washington, DC: National Academy Press.
IOM. 2007. Adverse drug event reporting: The roles of consumers and health-care professionals: Workshop summary. Washington, DC: The National Academies Press.
Kraemer, H. C., and C. Blasey. 2015. How many subjects? Statistical power analysis in research. Thousand Oaks, CA: SAGE Publications.
Landman, K. Z., K. R. Tan, P. M. Arguin, and Centers for Disease Control and Prevention. 2014. Knowledge, attitudes, and practices regarding antimalarial chemoprophylaxis in U.S. Peace Corps volunteers—Africa, 2013. MMWR 63(23):516-517.
Laothavorn, P., J. Karbwang, K. Na Bangchang, D. Bunnag, and T. Harinasuta. 1992. Effect of mefloquine on electrocardiographic changes in uncomplicated falciparum malaria patients. Southeast Asian J Trop Med Public Health 23(1):51-54.
Leary, K. J., M. A. Riel, M. J. Roy, L. R. Cantilena, D. Bi, D. C. Brater, C. van de Pol, K. Pruett, C. Kerr, J. M. Veazey, Jr., R. Beboso, and C. Ohrt. 2009. A randomized, double-blind, safety and tolerability study to assess the ophthalmic and renal effects of tafenoquine 200 mg weekly versus placebo for 6 months in healthy volunteers. Am J Trop Med Hyg 81:356-362.
Lee, T. W., L. Russell, M. Deng, and P. R. Gibson. 2013. Association of doxycycline use with the development of gastroenteritis, irritable bowel syndrome and inflammatory bowel disease in Australians deployed abroad. Intern Med J 43(8):919-926.
Lege-Oguntoye, L., G. C. Onyemelukwe, B. B. Maiha, E. O. Udezue, and S. Eckerbom. 1990. The effect of short-term malaria chemoprophylaxis on the immune response of semi-immune adult volunteers. East Afr Med J 67(11):770-778.
Meier, C. R., K. Wilcock, and S. S. Jick. 2004. The risk of severe depression, psychosis or panic attacks with prophylactic antimalarials. Drug Saf 27(3):203-213.
Miller, A. K., E. Harrell, L. Ye, S. Baptiste-Brown, J. P. Kleim, C. Ohrt, S. Duparc, J. J. Möhrle, A. Webster, S. Stinnett, A. Hughes, S. Griffith, and A. P. Beelen. 2013. Pharmacokinetic interactions and safety evaluations of coadministered tafenoquine and chloroquine in healthy subjects. Br J Clin Pharmacol 76:858-867.
Nasveld, P. E., M. D. Edstein, M. Reid, L. Brennan, I. E. Harris, S. J. Kitchener, P. A. Leggat, P. Pickford, C. Kerr, C. Ohrt, W. Prescott, and the Tafenoquine Study Team. 2010. Randomized, double-blind study of the safety, tolerability, and efficacy of tafenoquine versus mefloquine for malaria prophylaxis in nonimmune subjects. Antimicrob Agents Chemother 54:792-798.
NRC (National Research Council). 1991. Animals as sentinels of environmental health hazards. Washington, DC: National Academy Press.
Olson, H., G. Betton, D. Robinson, K. Thomas, A. Monro, G. Kolaja, P. Lilly, J. Sanders, G. Sipes, W. Bracken, M. Dorato, K. Van Deun, P. Smith, B. Berger, and A. Heller. 2000. Concordance of the toxicity of pharmaceuticals in humans and in animals. Regul Toxicol Pharmacol 32(1):56-67.
OTA (Office of Technology Assessment). 1990. Neurotoxicity: Identifying and controlling poisons of the nervous system. Washington, DC: U.S. Government Printing Office.
Rodgers, M., S. Thomas, M. Harden, G. Parker, A. Street, and A. Eastwood. 2016. Developing a methodological framework for organisational case studies: A rapid review and consensus development process. HS&DR 4(1).
Rothman, K. J., S. Greenland, and T. L. Lash. 2012. Modern epidemiology. 3rd ed. Philadelphia, PA: Lippincott Williams & Wilkins.
Rueangweerayut, R., G. Bancone, E. J. Harrell, A. P. Beelen, S. Kongpatanakul, J. J. Möhrle, V. Rousell, K. Mohamed, A. Qureshi, S. Narayan, N. Yubon, A. Miller, F. H. Nosten, L. Luzzatto, S. Duparc, J.-P. Kleim, and J. A. Green. 2017. Hemolytic potential of tafenoquine in female volunteers heterozygous for glucose-6-phosphate dehydrogenase (G6PD) deficiency (G6PD Mahidol variant) versus G6PD normal volunteers. Am J Trop Med Hyg 97(3):702-711.
Saunders, D. L., E. Garges, J. E. Manning, K. Bennett, S. Schaffer, A. J. Kosmowski, and A. J. Magill. 2015. Safety, tolerability, and compliance with long-term antimalarial chemoprophylaxis in American soldiers in Afghanistan. Am J Trop Med Hyg 93(3):584-590.
Savitz, D. A., and G. A. Wellenius. 2016. Interpreting epidemiologic evidence. New York: Oxford University Press.
Schlagenhauf, P., R. Steffen, H. Lobel, R. Johnson, R. Letz, A. Tschopp, N. Vranjes, Y. Bergqvist, O. Ericsson, U. Hellgren, L. Rombo, S. Mannino, J. Handschin, and D. Sturchler. 1996. Mefloquine tolerability during chemoprophylaxis: Focus on adverse event assessments, stereochemistry and compliance. Trop Med Int Health 1(4):485-494.
Schneider, C., M. Adamcova, S. S. Jick, P. Schlagenhauf, M. K. Miller, H. G. Rhein, and C. R. Meier. 2013. Antimalarial chemoprophylaxis and the risk of neuropsychiatric disorders. Travel Med Infect Dis 11(2):71-80.
Schneider, C., M. Adamcova, S. S. Jick, P. Schlagenhauf, M. K. Miller, H. G. Rhein, and C. R. Meier. 2014. Use of anti-malarial drugs and the risk of developing eye disorders. Travel Med Infect Dis 12(1):40-47.
Schneiderman, A. I., Y. S. Cypel, E. K. Dursa, and R. Bossarte. 2018. Associations between use of antimalarial medications and health among U.S. veterans of the wars in Iraq and Afghanistan. Am J Trop Med Hyg 99(3):638-648.
Schwartz, E., and G. Regev-Yochay. 1999. Primaquine as prophylaxis for malaria for nonimmune travelers: A comparison with mefloquine and doxycycline. Clin Infect Dis 29(6):1502-1506.
Srivastava, S., S. Mishra, J. Dewangan, A. Divakar, P. K. Pandey, and S. K. Rath. 2018. Chapter 2: Principles for in vitro toxicology. In A. Dhawan and S. Kwon (eds.), In vitro toxicology. New York: Academic Press. Pp. 21-44.
Tan, K. R., S. J. Henderson, J. Williamson, R. W. Ferguson, T. M. Wilkinson, P. Jung, and P. M. Arguin. 2017. Long term health outcomes among returned Peace Corps volunteers after malaria prophylaxis, 1995-2014. Travel Med Infect Dis 17:50-55.
Uhl, E. W., and N. J. Warner. 2015. Mouse models as predictors of human responses: Evolutionary medicine. Curr Pathobiol Rep 3(3):219-223.
Vandenbroucke, J. P. 1999. Case reports in an evidence-based world. J R Soc Med 92(4):159-163.
Walsh, D. S., C. Eamsila, T. Sasiprapha, S. Sangkharomya, P. Khaewsathien, P. Supakalin, D. B. Tang, P. Jarasrumgsichol, C. Cherdchu, M. D. Edstein, K. H. Rieckmann, and T. G. Brewer. 2004. Efficacy of monthly tafenoquine for prophylaxis of Plasmodium vivax and multidrug-resistant P. falciparum malaria. J Infect Dis 190(8):1456-1463.
Wells, G. A., B. Shea, D. O’Connell, J. Peterson, V. Welch, M. Losos, and P. Tugwell. 2019. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomized studies in meta-analyses. http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp (accessed October 29, 2019).
Wells, T. S., T. C. Smith, B. Smith, L. Z. Wang, C. J. Hansen, R. J. Reed, W. E. Goldfinger, T. E. Corbeil, C. N. Spooner, and M. A. Ryan. 2006. Mefloquine use and hospitalizations among US service members, 2002-2004. Am J Trop Med Hyg 74(5):744-749.