As part of its mission to provide military forces, the US Department of Defense (DOD) must anticipate chemical threats and defend and safeguard its personnel against them. Many factors can determine whether a chemical agent could pose a threat, and toxicity clearly is one of them. To assess toxicity, DOD has relied primarily on traditional toxicity testing in which adverse biological responses are measured in laboratory animals that are exposed to high doses of a test agent. The traditional approaches, however, are expensive and time-intensive, raise questions about the applicability of results to human populations, raise concerns about animal welfare, and are impractical for quickly evaluating large numbers of chemicals that could be used against deployed forces. In recent years, various agencies and organizations have attempted to incorporate advances in systems biology, toxicogenomics, bioinformatics, and computational toxicology to develop cost-effective approaches for predicting chemical toxicity. Given the recent advances and developments in toxicity-testing methods and approaches, DOD asked the National Research Council (NRC) to determine the feasibility of developing a toxicity-testing program that uses modern approaches to rapidly identify acutely toxic agents that are relevant to DOD.1 In response to that request, the NRC convened the Committee on Predictive-Toxicology Approaches for Military Assessments of Acute Exposure, which prepared the present report.
CONCEPTUAL FRAMEWORK AND STRATEGY
As requested by DOD, the committee developed an overall conceptual approach that uses modern methods for predicting acute, debilitating chemical toxicity. Its approach consisted of three components: (1) a conceptual framework that links chemical structure, physicochemical properties, biochemical properties, and biological activity to acute toxicity; (2) a suite of databases, assays, models, and tools that are based on modern in vitro, nonmammalian in vivo, and in silico approaches that are applicable for prediction of acute toxicity; and (3) a tiered prioritization strategy for using databases, assays, models, and tools to predict acute toxicity in a manner that balances the need for accuracy and timeliness. The committee based its conceptual framework (Figure S-1) on the premise that whole-animal toxicity can be predicted by using information about lower levels of complexity, even down to the level of chemical structure. Specifically, it is hypothesized that chemical structure, physicochemical properties, biochemical properties, or biological activity in isolated cells and tissues or in nonmammalian organisms can be used to predict acute mammalian toxicity.
1The verbatim statement of task is provided in Chapter 1 of this report.
The prioritization strategy was formulated on the basis of DOD’s stated need to understand the relative threat of the growing list of registered chemical substances. Although the committee cannot prescribe exactly how to manage various policy tradeoffs, such as the tolerance for false negatives and the timeframe required for identifying important hazards, it recommends a tiered prioritization strategy (Figure S-2) that applies increasingly complex approaches to place chemicals into three categories: high confidence of low toxicity, high confidence of high toxicity, and uncertain toxicity because of data inadequacy. The first category allows some chemicals to be deselected on the basis of low acute toxicity, and the emphasis on high confidence indicates a low tolerance for false negatives. The second category allows chemicals to be “selected” on the basis of high acute toxicity, and the emphasis on high confidence indicates the need to focus rapidly on chemicals that might pose a risk. The third category represents chemicals that would move to the next tier. Chemicals could be deselected at any stage by considering other factors, such as chemical availability and weaponizability, that could eliminate them from further consideration. As illustrated in the figure and discussed further in the sections that follow, the testing strategy proceeds through a number of tiers that are successively more predictive and resource-intensive, from initial characterization (Tier 0) to nontesting approaches (Tier 1) to high-throughput and medium-throughput assays (Tier 2) and ultimately to traditional animal testing (Tier 3). Progression through the tiers requires intermediate integration steps that consider the diversity of data both within a tier and across tiers. At each tier, DOD will need to develop policies that are relevant to its mission on how to assign chemicals to various categories and to determine the extent of end-point coverage that is adequate for it to make reliable decisions. The committee notes that an end point could be a clinical outcome or a molecular initiating event. If science advances in such a way that adverse outcome pathways of interest to DOD are known, the strategy shown in Figure S-2 could rely on nontesting and biological assay-based approaches that evaluate molecular initiating events or measurable key events in the pathways.
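The tier progression described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Category` enum, the `TIERS` labels, and the `triage` function are hypothetical names, and each per-tier evaluator stands in for whatever policy DOD would adopt for assigning a chemical to a category at that tier.

```python
from enum import Enum
from typing import Callable, Sequence, Tuple


class Category(Enum):
    LOW = "high confidence of low toxicity"
    HIGH = "high confidence of high toxicity"
    UNCERTAIN = "uncertain toxicity (inadequate data)"


# Tier labels taken from the text; the tuple itself is an illustrative construct.
TIERS = (
    "Tier 0: initial characterization",
    "Tier 1: nontesting approaches",
    "Tier 2: high- and medium-throughput assays",
    "Tier 3: traditional animal testing",
)

# Hypothetical signature: an evaluator maps a chemical identifier to a category.
Evaluator = Callable[[str], Category]


def triage(chemical: str, evaluators: Sequence[Evaluator]) -> Tuple[str, Category]:
    """Walk the tiers in order and stop at the first high-confidence call."""
    if len(evaluators) != len(TIERS):
        raise ValueError("one evaluator is required per tier")
    tier, result = TIERS[0], Category.UNCERTAIN
    for tier, evaluate in zip(TIERS, evaluators):
        result = evaluate(chemical)
        if result is not Category.UNCERTAIN:
            break  # confident LOW or HIGH: no need for a costlier tier
    return tier, result
```

A chemical that remains uncertain at every tier is returned with the last tier's label and `Category.UNCERTAIN`. Deselection on other grounds, such as availability or weaponizability, is deliberately left out of the sketch.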
NONTESTING APPROACHES FOR PREDICTING ACUTE TOXICITY
The committee envisions that nontesting approaches will be an important component of its conceptual framework. Nontesting approaches range from grouping chemicals that are structurally similar to developing quantitative structure–activity relationship (QSAR) models. The underlying assumption of nontesting approaches is that chemical properties that determine how a chemical will interact with a defined biological system are inherent in its molecular structure and thus that structurally similar chemicals should have similar biological activity. The starting point in the application of any nontesting approach is to search for and evaluate information on the chemical of interest. That step constitutes Tier 0 in the committee’s proposed strategy (Figure S-2).
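As a rough illustration of grouping structurally similar chemicals, the sketch below uses the Tanimoto (Jaccard) coefficient on sets of structural features. The choice of this similarity measure, the feature-set representation, the `threshold` value, and the function names are all assumptions made for illustration; the committee's report does not prescribe a particular measure.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto coefficient: shared features divided by total distinct features."""
    union = a | b
    if not union:
        return 0.0  # two empty feature sets carry no evidence of similarity
    return len(a & b) / len(union)


def structural_analogs(
    query: FrozenSet[str],
    library: Dict[str, FrozenSet[str]],
    threshold: float = 0.7,  # hypothetical cutoff, to be set by the user
) -> List[Tuple[str, float]]:
    """Return library chemicals whose similarity to the query meets the cutoff."""
    hits = [(name, tanimoto(query, feats)) for name, feats in library.items()]
    hits = [(name, score) for name, score in hits if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Known toxicity data for the returned analogs could then inform a prediction for the query chemical, subject to the applicability limits noted in the text.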
As would be expected, information on physical properties, solvation properties, and molecular attributes (physicochemical data) is critical. Physicochemical data can be used to predict a chemical’s physical hazard, its reactivity, and its pharmacokinetics, including absorption by different exposure routes, distribution in the body, and likely metabolites. Physicochemical data can be obtained from the literature, derived experimentally, or predicted with various in silico techniques. However, many tools that can be used to predict physicochemical properties have limited chemical applicability; that is, they are most applicable for small organic chemicals.
Nontesting approaches have been used to predict acute toxicity. Specifically, a few (Q)SAR models have been developed for predicting in vivo acute toxicity.2 Most have focused on the prediction of acute rodent oral toxicity, such as estimation of oral LD50 values;3 few attempts have been made to derive models for acute toxicity via other exposure routes, such as inhalation and dermal exposure. Nontesting approaches also have been used to predict toxicity end points, such as neurotoxicity or cytotoxicity. More recent efforts have investigated the integration of in vitro assay data with nontesting approaches to strengthen predictions. Key issues with nontesting (and all other) approaches are their relevance and applicability for the broad array of chemicals of interest to DOD and the reliability and validity of the data used to develop the models. Furthermore, the exposure routes of interest to DOD are most likely inhalation and dermal exposure, and few nontesting approaches address these exposure routes.
2The committee uses the shorthand notation (Q)SAR to indicate both SAR and QSAR.
3An LD50 value is the dose at which 50% of the population dies.
BIOLOGICAL ASSAYS FOR PREDICTING ACUTE TOXICITY
In vitro assays and nonmammalian in vivo assays are important components of the committee’s conceptual framework. Numerous screening assays have been developed to measure specific biological activities. The various assay types are described below with some key limitations noted for DOD’s purposes.
- Specific-Protein Assays. Many enzyme and receptor-binding assays have been developed to examine specific mechanisms of action at the molecular level. Some—such as ones that measure chemical-induced inhibition of acetylcholinesterase activity, altered electron transport in mitochondria, and modulation of ion-channel activity—might be relevant for predicting acute toxicity. Although the protein assays hold some promise, a key limitation is that acute toxicity that is not mediated by chemical action on specific enzymes or receptors will go undetected in these types of assays.
- Cell-Based Phenotypic Assays. These assays typically use cultured cells and measure some overall phenotypic output relevant to predicting acute toxicity, such as cellular proliferation, plasma membrane permeability, and adenosine triphosphate content. There is a growing literature on their application as toxicity screens, especially in drug development. Cell-based assays, particularly ones for evaluating cytotoxicity, have demonstrated success in predictive toxicology. A key limitation of cytotoxicity assays is that they do not provide data on some of the most important toxic mechanisms, specifically ones that involve organ-specific or cell-type–specific physiology. Another limitation of many existing cell-based assays is that they rely on immortalized cell lines that have little metabolic capability.
- Organotypic Models. Organotypic models more closely mimic the anatomy of organs and have been developed for the skin, eye, lung, liver, and central nervous system. They are especially attractive given their theoretical potential to model metabolism, biodistribution, and biological activity of a chemical in an in vitro system. However, the science of modeling human organs in a culture dish accurately, especially in formats suitable for high-throughput testing, and its application to toxicology are still in their infancy.
- Nonmammalian in vivo Assays. In addition to in vitro assays, the committee envisions nonmammalian animal models as a potentially important component of its conceptual framework. Traditional whole-animal assays have been crucial in understanding how chemicals affect metabolism and exhibit pathology at the cell and organ level. However, traditional assays are often expensive, require large amounts of chemicals, and cannot be adapted to even a medium-throughput format. For those and other reasons, alternative animal models have been developed. Ones that are potentially valuable for adapting to high-throughput screening rely on the fruit fly (Drosophila melanogaster), a nematode (Caenorhabditis elegans), and the zebrafish (Danio rerio). One particular advantage of the alternative models is the ability to identify whole-organism or organ-level responses. However, as with all animal models, a key limitation is related to species differences and use of resulting data to extrapolate to human responses. Furthermore, measuring some end points with alternative animal models has lower throughput than many in vitro assays, and little is known about their applicability to the assessment of acute toxicity of chemicals that are relevant to DOD.
In vitro assays, alternative animal models, and other emerging technologies described here and in more detail later in the committee’s report hold promise, but some important limitations or considerations should be noted. First, in vitro assays for predicting acute toxicity have focused primarily on nonmechanistic indicators of toxicity, such as cytotoxicity; they were not developed with a quantitative linkage to any phenotype (acute or chronic). Second, existing assays focus on oral exposure; there has been little consideration of dermal or inhalation exposure. Third, most current in vitro assays do not account for important pharmacokinetic characteristics, such as metabolism, that can influence in vivo toxicity. Fourth, the nominal chemical concentration used in the assays is not necessarily representative of the concentration at which chemical bioactivity is observed. Fifth, cellular systems commonly use immortalized cancer cell lines, which might fail to detect chemical activity or effects that might occur in normal (nontumor) differentiated cells. Sixth, cells can have different levels of activity or responsiveness, depending on whether they are primary cells, differentiated cells, or immortalized cells and on how many times they have been cultured, so assay reproducibility can be a problem. Seventh, interpreting activity or effective concentrations that result from a high-throughput screening assay can be difficult because activity at high concentrations could represent nonspecific effects and offer little information about specific bioactivity. Conversely, the absence of activity could mean that the tested concentration is below the in vitro effective concentration, that the assay does not represent the biological target, or that there are problems with assay reliability. Current efforts in high-throughput screening support the observations noted here, and the committee emphasizes that DOD should use the experience from current high-throughput screening programs to design its screening program to predict acute, debilitating toxicity.
INTEGRATION AND DECISION-MAKING FOR PREDICTIVE TOXICOLOGY
A robust integration and decision-making strategy is needed as part of the committee’s suggested tiered prioritization strategy (shown in Figure S-2). As noted, the goal of each tier is to place a chemical into one of three categories: high confidence of high toxicity, high confidence of low toxicity, or inadequate data. That activity will require integrating various data streams and predictions that inform a single acute-toxicity end point (“within–end-point” integration and decision-making) and integrating predictions from several acute-toxicity end points (“cross–end-point” integration and decision-making). The committee’s report discusses various methods for integrating data and predictions. Key tasks for DOD will be to define the most informative end points for its purpose (for example, neurotoxicity vs seizures), to set boundaries or toxicity thresholds for what is considered “high” or “low” toxicity for each end point, and to specify the level of confidence needed to make determinations.
One simple approach for integrating multiple end points is to summarize the categorization results for each end point in a “scorecard.” Each end point would be evaluated as to whether the chemical exhibited “high toxicity,” “low toxicity,” or “inadequate data.” A chemical would then be assigned to a “high toxicity overall” bin if at least one of the end points scored as “high toxicity,” a “low toxicity overall” bin only if all the end points scored as “low toxicity,” and an “inadequate data overall” bin if neither of the first two conditions is met. That simple approach has the advantage of retaining the end-point–specific information to inform future data generation. It is also consistent with a low tolerance for false negatives in that each end point serves as sufficient evidence to assign a chemical to a “high toxicity overall” bin.
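The scorecard rule described above is simple enough to state directly in code. The following is a minimal sketch; the function name `overall_bin`, the string labels, and the treatment of an empty scorecard as inadequate data are assumptions made for illustration.

```python
from typing import Mapping

HIGH = "high toxicity"
LOW = "low toxicity"
INADEQUATE = "inadequate data"


def overall_bin(scorecard: Mapping[str, str]) -> str:
    """Combine per-end-point calls into one overall bin.

    scorecard maps an end-point name to HIGH, LOW, or INADEQUATE.
    """
    calls = list(scorecard.values())
    if any(call == HIGH for call in calls):
        # A single high-toxicity end point is sufficient evidence.
        return "high toxicity overall"
    if calls and all(call == LOW for call in calls):
        # Low overall only when every end point is confidently low.
        return "low toxicity overall"
    # Assumption: an empty or mixed low/inadequate scorecard is inadequate.
    return "inadequate data overall"
```

Because the input mapping is kept alongside the overall bin, the end-point-specific calls remain available to guide further data generation.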
It is possible to use more complex recombination approaches that would not depend strictly on a simple decision rule related to the categories for each end point. For example, one approach would be to provide a summary measure that consisted of a weighted sum of individual toxicity end points. Even if each individual end point is rated as “inadequate data,” it is conceivable that the presence of multiple end points close to their corresponding toxicity thresholds would permit a chemical to be categorized as “high” or “low” on the basis of the summary measure. Setting up appropriate decision rules would be a key policy question for DOD if it chose to go forward with implementing the committee’s suggested approach for predicting acute, debilitating toxicity.
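A weighted-sum summary measure of the kind mentioned above might look like the sketch below. Everything specific here is hypothetical: the end-point scores are assumed to be on a common scale, and the weights and the two cutoffs are placeholders for values that DOD policy would have to set.

```python
from typing import Mapping


def weighted_summary(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of per-end-point toxicity scores (hypothetical common scale)."""
    missing = set(scores) - set(weights)
    if missing:
        raise KeyError(f"no weight defined for end points: {sorted(missing)}")
    return sum(weights[name] * score for name, score in scores.items())


def categorize(summary: float, low_cutoff: float, high_cutoff: float) -> str:
    """Map the summary measure to a category by using two policy-set cutoffs."""
    if low_cutoff >= high_cutoff:
        raise ValueError("low_cutoff must be below high_cutoff")
    if summary >= high_cutoff:
        return "high"
    if summary <= low_cutoff:
        return "low"
    return "inadequate data"
```

For example, three end points that each score 0.5 with unit weights give a summary of 1.5, which `categorize(1.5, low_cutoff=0.5, high_cutoff=1.5)` places in the “high” category even if no single end point would qualify on its own.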
LESSONS LEARNED AND NEXT STEPS
Several large-scale initiatives have been evaluating in vitro testing methods for their ability to predict human toxicity, and the committee considered them as it debated the feasibility of a predictive testing program for DOD. The US Environmental Protection Agency (EPA) ToxCast program and the European ACuteTox program demonstrate that in vitro assays have some value for predicting acute toxicity and provide evidence that an in vitro screening approach is feasible for evaluating the relative threat of a chemical as an acute hazard. However, most of the assays developed and validated for high-throughput screening programs were not developed specifically for acute-toxicity testing and so might be of little use for identifying chemicals that have the potential to cause acute, debilitating injuries in deployed military personnel. Nevertheless, lessons learned from those programs could provide a great deal of guidance to DOD in designing a system that uses high-throughput screening and predictive models to evaluate acute toxicity.
On the basis of its review, the committee notes several initial steps that DOD could take to implement the tiered prioritization strategy. First, an investment by DOD in computational and high-throughput screening could yield benefits in characterizing the toxicity of chemicals on which there are few or no toxicity data. Computational methods for predicting acute toxicity are seeing steady growth, and high-throughput screening might prove useful in excluding chemicals that have low toxic potential and in identifying toxic chemicals of greater concern for further testing. Second, there are data to suggest that DOD could use simple cytotoxicity assays to identify chemicals that have low acute-toxicity potential and focus its attention on chemicals that are more toxic. Additional investment would be required to determine whether the assays are relevant for identifying highly toxic chemicals that could be used against deployed troops. Third, the development of targeted mechanistically based assays could provide DOD with a useful resource for understanding and predicting potential toxicity of chemicals; specifically, having explicit knowledge of the mechanisms of action that lead to acute systemic toxicity would be valuable in the design and validation of integrated prediction methods. Completing the steps described here might require DOD to use a variety of reference chemicals, including chemicals of concern, to benchmark the results. Moreover, completing these steps will be facilitated by selecting well-characterized chemicals that can be used to evaluate the predictiveness of DOD’s in vitro assays and approaches against in vivo experimental results.
The committee anticipates that in the next 3–10 years no tiered testing approach will be able to fully replace the need for targeted mammalian in vivo studies to confirm the toxicity of a chemical of interest. Indeed, the state of the science suggests that development of a predictive acute-toxicity program will require extensive DOD investment in computational modeling approaches, assay development, methods for extrapolation of in vitro results to in vivo conditions, and data-integration methods. To begin the investment, the committee recommends that DOD initiate pilot studies that evaluate chemical classes of highest concern with well-characterized reference chemicals. The pilot studies would allow DOD to develop the novel assays and tools needed to predict acute chemical toxicity efficiently and accurately and to evaluate the rate of false negatives and false positives. The pilot studies could also examine how generalizable the results of various assays and tools are from one chemical class4 to another. That research would allow DOD to begin to address the size of the chemical space needed to make predictions about unknown chemicals. The committee emphasizes that DOD could benefit from leveraging its efforts with other federal activities, such as EPA’s ToxCast program. Such collaboration would allow DOD to complete pilot studies more rapidly and maximize the return on its investment.
4In this context, chemical class is used broadly to include structurally related chemicals, chemicals that have different mechanisms of action, and chemicals that have different toxic end points, such as hepatotoxicity and neurotoxicity.
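The false-negative and false-positive rates that the pilot studies would evaluate can be computed by comparing predictions with in vivo results for the reference chemicals. The sketch below assumes binary calls (True meaning highly toxic); the function name and the choice to reject inputs that lack either class are illustrative assumptions.

```python
from typing import Sequence, Tuple


def error_rates(predicted: Sequence[bool], observed: Sequence[bool]) -> Tuple[float, float]:
    """Return (false-negative rate, false-positive rate).

    predicted: model or assay calls for each reference chemical.
    observed:  in vivo results for the same chemicals, in the same order.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    positives = sum(1 for o in observed if o)
    negatives = len(observed) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("reference set needs both toxic and nontoxic chemicals")
    # False negative: observed toxic but predicted nontoxic.
    false_neg = sum(1 for p, o in zip(predicted, observed) if o and not p)
    # False positive: observed nontoxic but predicted toxic.
    false_pos = sum(1 for p, o in zip(predicted, observed) if p and not o)
    return false_neg / positives, false_pos / negatives
```

Computing the two rates separately for each chemical class would give one simple view of how well results generalize from one class to another.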