The need to confront uncertainty in risk assessment has changed little since the 1983 NRC report Risk Assessment in the Federal Government. That report found that:
The dominant analytic difficulty [in decision-making based on risk assessments] is pervasive uncertainty. … there is often great uncertainty in estimates of the types, probability, and magnitude of health effects associated with a chemical agent, of the economic effects of a proposed regulatory action, and of the extent of current and possible future human exposures. These problems have no immediate solutions, given the many gaps in our understanding of the causal mechanisms of carcinogenesis and other health effects and in our ability to ascertain the nature or extent of the effects associated with specific exposures.
Those gaps in our knowledge remain, and yield only with difficulty to new scientific findings. But a powerful solution exists to some of the difficulties caused by the gaps: the systematic analysis of the sources, nature, and implications of the uncertainties they create.
Context Of Uncertainty Analysis
EPA decision-makers have long recognized the usefulness of uncertainty analysis. As indicated by former EPA Administrator William Ruckelshaus (1984):
First, we must insist on risk calculations being expressed as distributions of estimates and not as magic numbers that can be manipulated without regard to
what they really mean. We must try to display more realistic estimates of risk to show a range of probabilities. To help do this, we need new tools for quantifying and ordering sources of uncertainty and for putting them into perspective.
Ten years later, however, EPA has made little headway in replacing a risk-assessment "culture" based on "magic numbers" with one based on information about the range of risk values consistent with our current knowledge and lack thereof.
As we discuss in more depth in Chapter 5, EPA has been skeptical about the usefulness of uncertainty analysis. For example, in its guidance to those conducting risk assessments for Superfund sites (EPA, 1991f), the agency concludes that quantitative uncertainty assessment is usually not practical or necessary for site risk assessments. The same guidance questions the value and accuracy of uncertainty assessments, suggesting that such analyses are too data-intensive and "can lead one into a false sense of certainty."
In direct contrast, the committee believes that uncertainty analysis is the only way to combat the "false sense of certainty," which is caused by a refusal to acknowledge and (attempt to) quantify the uncertainty in risk predictions.
This chapter first discusses some of the tools that can be used to quantify uncertainty. The remaining sections discuss specific concerns about EPA's current practices, suggest alternatives, and present the committee's recommendations about how EPA should handle uncertainty analysis in the future.
Nature Of Uncertainty
Uncertainty can be defined as a lack of precise knowledge as to what the truth is, whether qualitative or quantitative. That lack of knowledge creates an intellectual problem (that we do not know what the "scientific truth" is) and a practical problem (we need to determine how to assess and deal with risk in light of that uncertainty). This chapter focuses on the practical problem, which the 1983 report did not shed much light on and which EPA has only recently begun to address in any specific way. This chapter takes the view that uncertainty is always with us and that it is crucial to learn how to conduct risk assessment in the face of it. Scientific truth is always somewhat uncertain and is subject to revision as new understanding develops, but the uncertainty in quantitative health risk assessment might be uniquely large, relative to other science-policy areas, and it requires special attention by risk analysts. These analysts need to ask questions such as: What should we do in the face of uncertainty? How should it be identified and managed in a risk assessment? How should an understanding of uncertainty be conveyed to risk managers and to the public? EPA has recognized the need for more and better uncertainty assessment (see EPA memorandum in Appendix B), and other investigators have begun to make substantial progress with the difficult computations that are often required (Monte Carlo
methods, etc.). However, it appears that these changes have not yet affected the day-to-day work of EPA.
Some scientists, mirroring the concerns expressed by EPA, are reluctant to quantify uncertainty. There is concern that uncertainty analysis could reduce confidence in a risk assessment. However, that attitude toward uncertainty may be misguided. The very heart of risk assessment is the responsibility to use whatever information is at hand or can be generated to produce a number, a range, a probability distribution: whatever best expresses the present state of knowledge about the effects of some hazard in some specified setting. Simply to ignore the uncertainty in any process is almost sure to leave critical parts of the process incompletely examined, and hence to increase the probability of generating a risk estimate that is incorrect, incomplete, or misleading.
For example, past analyses of the uncertainty about the carcinogenic potency of saccharin showed that potency estimates could vary by a factor as large as 10¹⁰. However, this example is not representative of the ranges in potency estimates when appropriate models are compared. Potency estimates can vary by a factor of 10¹⁰ only if one allows the choice of some models that are generally recognized as having no biological plausibility and only if one uses those models for a very large extrapolation from high to low doses. The judicious application of concepts of plausibility and parsimony can eliminate some clearly inappropriate models and leave a large but perhaps less daunting range of uncertainties. What is important, in this context of enormous uncertainty, is not the best estimate or even the ends of this 10¹⁰-fold range, but the best-informed estimate of the likelihood that the true value is in a region where one rather than another remedial action (or none) is appropriate. Is there a small chance that the true risk is as large as 10⁻², and what would be the risk-management implications of this very small probability of very large harm? Questions such as these are what uncertainty analysis is largely about. Improvements in the understanding of methods for uncertainty analysis, as well as advances in toxicology, pharmacokinetics, and exposure assessment, now allow uncertainty analysis to provide a much more accurate, and perhaps less daunting, picture of what we know and do not know than in the past.
Before discussing the practical applications of uncertainty analysis, it may be best to step back and discuss it as an intellectual endeavor. The problem of uncertainty in risk assessment is large, complex, and nearly intractable, unless it is divided into smaller and more manageable topics. One way to do so, as seen in Table 9-1 (Bogen, 1990a), is to classify sources of uncertainty according to the step in the risk assessment process in which they occur. A more abstract and generalized approach preferred by some scientists is to partition all uncertainties into the three categories of bias, randomness, and true variability. This method
of classifying uncertainty is used by some research methodologists, because it provides a complete partition of types of uncertainty, and it might be more productive intellectually: bias is almost entirely a product of study design and performance; randomness a problem of sample size and measurement imprecision; and variability a matter for study by risk assessors but for resolution in risk management (see Chapter 10).
However, a third approach to categorizing uncertainty may be more practical than this scheme, and yet less peculiar to environmental risk assessment than the taxonomy in Table 9-1.
This third approach, a version of which can be found in EPA's new exposure guidelines (EPA, 1992a) and in the general literature on risk assessment uncertainty (Finkel, 1990; Morgan and Henrion, 1990), is adopted here to facilitate communication and understanding in light of present EPA practice. Although the committee makes no formal recommendation on which taxonomy to use, EPA staff might want to consider the alternative classification above (bias,
randomness, and variability) to supplement their current approach in future documents. Our preferred taxonomy consists of:
Problems With EPA's Current Approach To Uncertainty
EPA's current practice on uncertainty is described elsewhere in this report, especially in Chapter 5, as part of the risk-characterization process. Overall, EPA tends at best to take a qualitative approach to uncertainty analysis, and one that emphasizes model uncertainty rather than parameter uncertainties. The uncertainties in the models and the assumptions made are listed (or perhaps described in a narrative way) in each step of the process; these are then presented in a nonquantitative statement to the decision-maker.
Quantitative uncertainty analysis is not well explored at EPA. There is little internal guidance for EPA staff about how to evaluate and express uncertainty. One useful exception is the analysis conducted for the National Emission Standards for Hazardous Air Pollutants (NESHAPS) radionuclides document (described in Chapter 5), which provides a good initial example of how uncertainty analysis could be conducted for the exposure portion of risk assessment. Other EPA efforts, however, have been primarily qualitative, rather than quantitative. When uncertainty is analyzed at EPA, the analysis tends to be piecemeal and highly focused on the sensitivity of the assessment to the accuracy of a few specified assumptions, rather than a full exploration of the process from data collection to final risk assessment, and the results are not used in a systematic fashion to help decision-makers.
The major difficulty with EPA's current approach is that it does not supplant or supplement artificially precise single estimates of risk ("point estimates") with ranges of values or quantitative descriptions of uncertainty, and that it often lacks even qualitative statements of uncertainty. This obscures the uncertainties inherent in risk estimation (Paustenbach, 1989; Finkel, 1990), although the uncertainties themselves do not go away. Risk assessments that do not include sufficient attention to uncertainty are vulnerable to four common and potentially serious pitfalls (adapted from Finkel, 1990):
Perhaps most fundamentally, without uncertainty analysis it can be quite difficult to determine the conservatism of an estimate. In an ideal risk assessment, a complete uncertainty analysis would provide a risk manager with the ability to estimate risk for each person in a given population in both actual and projected scenarios of exposures; it would also estimate the uncertainty in each prediction in quantitative, probabilistic terms. But even a less exhaustive treatment of uncertainty will serve a very important purpose: it can reveal whether the point estimate used to summarize the uncertain risk is "conservative," and if so, to what extent. Although the choice of the "level of conservatism" is a risk-management prerogative, managers might be operating in the dark about how "conservative" these choices are if the uncertainty (and hence the degree to which the risk estimate used may fall above or below the true value) is ignored or assumed, rather than calculated.
Some Alternatives To EPA's Approach
A useful alternative to EPA's current approach is to set as a goal a quantitative assessment of uncertainty. Table 9-2, from Resources for the Future's Center for Risk Management, suggests a sequence of steps that the agency could follow to generate a quantitative uncertainty estimate. Determining the uncertainty in the estimate of risk associated with a source probably requires an understanding of the uncertainty in each of the elements shown in Table 9-3. The following pages describe more fully the development of probabilities and the method of using probabilities as inputs into uncertainty analysis models.
A probability density function (PDF) describes the uncertainty, encompassing objective or subjective probability, or both, over all possible values of risk. When the PDF is presented as a smooth curve, the area under the curve between any two points is the probability that the true value lies between the two points. A cumulative distribution function (CDF), which is the integral or sum of the PDF up to each point, shows the probability that a variable is equal to or less than each of the possible values it can take on. These distributions can sometimes be estimated empirically with statistical techniques that can analyze large sets of data adequately. Sometimes, especially when data are sparse, a normal or lognormal distribution is assumed and its mean and variance (or standard deviation) are estimated from available data. When data are in fact normally distributed over the whole range of possible values, the mean and variance completely characterize the distribution, including the PDF and CDF. Thus, with certain assumptions (such as normality), only a few points might be needed to estimate the whole distribution for a given variable, although more points will both improve the representation of the uncertainty and allow examination of the normality assumption. However, the problem remains that apparently minor deviations in the extreme tails may have major implications for risk assessment (Finkel, 1990). Furthermore, the assumption of normality itself may be inappropriate.
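The lognormal case just described can be sketched in a few lines of Python. This is an illustrative sketch only: the sample values are invented, and the fit simply treats the logarithms of the data as normally distributed.

```python
import math
from statistics import NormalDist, fmean, stdev

# Hypothetical sparse measurements of an uncertain quantity
# (e.g., a daily intake rate); the values are illustrative only.
samples = [1.2, 0.8, 2.5, 1.6, 0.9, 3.1]

# Lognormal assumption: the logs are treated as normal, so the sample
# mean and standard deviation of the logs characterize the whole
# distribution, including its PDF and CDF.
logs = [math.log(x) for x in samples]
fitted = NormalDist(mu=fmean(logs), sigma=stdev(logs))

def cdf(x):
    """P(X <= x) under the fitted lognormal."""
    return fitted.cdf(math.log(x))

def pdf(x):
    """Density of the fitted lognormal at x."""
    return fitted.pdf(math.log(x)) / x

# The area under the PDF between any two points equals the CDF difference.
p_between = cdf(2.0) - cdf(1.0)
```

With only six hypothetical data points, the two fitted parameters fully determine the PDF and CDF, which is exactly why the normality (or lognormality) assumption deserves scrutiny when data are sparse.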
When data are flawed or not available or when the scientific base is not understood well enough to quantify the probability distributions of all input variables, a surrogate estimate of one or more distributions can be based on analysis of the uncertainty in similar variables in similar situations. For example, one can approximate the uncertainty in the carcinogenic potency of an untested chemical by using the existing frequency distribution of potencies for chemicals already tested (Fiering et al., 1984).
Subjective Probability Distributions
A different method of probability assessment is based on expert opinion. In this method, the beliefs of selected experts are elicited and combined to provide a subjective probability distribution. This procedure can be used to estimate the uncertainty in a parameter (cf., the subjective assessment of the slope of the dose-response relationship for lead in Whitfield and Wallsten, 1989). However, subjective assessments are more often used for a risk assessment component for which the available inference options are logically or reasonably limited to a finite set of identifiable, plausible, and often mutually exclusive alternatives (i.e., for model uncertainty). In such an analysis, alternative scenarios or models are assigned subjective probability weights according to the best available data and scientific judgment; equal weights might be used in the absence of reliable data or theoretical justifications supporting any option over any other. For example, this approach could be used to determine how much the risk assessor should rely on relative surface area vs. body weight in conducting a dose-response assessment. The application of particular sets of subjective probability weights in particular inference contexts could be standardized, codified, and updated as part of EPA's implementation of uncertainty analysis guidelines (see below).
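As an illustration of the surface-area vs. body-weight question, the following sketch contrasts the two common interspecies scaling conventions (direct body-weight scaling, exponent 1, and approximate surface-area scaling, exponent 2/3). The doses, body weights, and equal subjective model weights here are invented for illustration and are not EPA values; note that each model's result is reported separately alongside its weight rather than averaged.

```python
# Hypothetical illustration: subjective probability weights over two
# interspecies scaling models (body weight vs. surface area).
# All numbers below are assumptions chosen for illustration.

MOUSE_BW_KG = 0.03
HUMAN_BW_KG = 70.0

def human_equivalent_dose(mouse_dose, exponent):
    """Scale a mouse dose (mg/kg/day) to a human-equivalent dose.
    exponent = 1.0 reproduces direct body-weight scaling;
    exponent = 2/3 approximates surface-area scaling."""
    return mouse_dose * (MOUSE_BW_KG / HUMAN_BW_KG) ** (1.0 - exponent)

models = {
    "body weight":  {"exponent": 1.0,     "weight": 0.5},
    "surface area": {"exponent": 2.0 / 3, "weight": 0.5},
}

mouse_dose = 10.0  # mg/kg/day, hypothetical
for name, m in models.items():
    hed = human_equivalent_dose(mouse_dose, m["exponent"])
    print(f"{name}: human-equivalent dose = {hed:.2f} mg/kg/day "
          f"(subjective weight {m['weight']})")
```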
Objective probabilities might seem inherently more accurate than subjective probabilities, but this is not always true. Formal methods (Bayesian statistics)² exist to incorporate objective information into a subjective probability distribution that reflects other matters that might be relevant but difficult to quantify, such as knowledge about chemical structure, expectations of the effects of concurrent exposure (synergy), or the scope of plausible variations in exposure. The chief advantage of an objective probability distribution is, of course, its objectivity: right or wrong, it is less likely to be susceptible to major and perhaps undetectable bias on the part of the analyst; this has palpable benefits in defending a risk assessment and the decisions that follow. A second advantage is that objective probability distributions are usually far easier to determine. However, there can be no rule that objective probability estimates are always preferred to subjective estimates, or vice versa.
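A minimal sketch of the Bayesian idea, using the conjugate beta-binomial model: a subjective Beta prior on a tumor-response probability is updated with objective bioassay counts. The prior parameters and counts are assumptions chosen for illustration.

```python
# Conjugate beta-binomial updating: a subjective Beta(a, b) prior on a
# tumor-response probability is combined with objective bioassay counts.
# Prior parameters and counts are illustrative assumptions only.

prior_a, prior_b = 1.0, 19.0          # subjective prior, mean 0.05
tumors, animals = 3, 30               # objective bioassay result

# Posterior: add observed successes to a, observed failures to b.
post_a = prior_a + tumors
post_b = prior_b + (animals - tumors)

prior_mean = prior_a / (prior_a + prior_b)
posterior_mean = post_a / (post_a + post_b)
data_only = tumors / animals

print(f"prior mean      {prior_mean:.3f}")
print(f"data alone      {data_only:.3f}")
print(f"posterior mean  {posterior_mean:.3f}")
```

The posterior mean falls between the subjective prior mean and the objective data estimate, with the balance set by the relative weight of prior and data; a stronger prior (larger a + b) would pull the result further from the bioassay alone.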
Model Uncertainty: "Unconditional" Versus "Conditional" PDFs
Regardless of whether objective or subjective methods are used to assess them, the distinction between parameter uncertainty and model uncertainty remains pivotal and has implications for implementing improved risk assessments that acknowledge uncertainty. The most important difference between parameter uncertainty and model uncertainty, especially in the context of risk assessment, concerns how to interpret the output of an objective or subjective probability assessment for each.
One can readily construct a probability distribution for risk, exposure, potency, or some other quantity that reflects the probabilities that various values, corresponding to fundamentally different scientific models, represent the true state. Such a depiction, which we will call an "unconditional PDF" because it tries to represent all the uncertainty surrounding the quantity, can be useful for some decisions that agencies must make. In particular, EPA's research offices might be able to make more efficient decisions about where resources should be channeled to study particular risks, if the uncertainty in each risk were presented unconditionally. For example, an unconditional distribution might be reported in this way: "the potency of chemical X is 10⁻² per part per million of air (with an uncertainty of a factor of 5 due to parameter uncertainty surrounding this value), but only if the linearized multistage (LMS) model is correct; if instead the chemical has a threshold, the potency at any ambient concentration is effectively zero." It might even help to assign subjective weights to the current thinking about the probability that each model is correct, especially if research decisions have to be made for many risks.
In addition, some specified regulatory decisions (those involving the ranking of different risks for the purpose of allowing "tradeoffs" or "offsets") can also suffer if model uncertainty is not quantified. For example, two chemicals (Y and Z) with the same potency, assuming that the LMS model is correct, might involve different degrees of confidence in the veracity of that model assumption. If we judged that chemical Y had a 90%, or even a 20%, chance of acting in a threshold fashion, it might be a mistake to treat it as having the same potency as a chemical Z that is virtually certain to have no threshold and then to allow increased emissions of Z in exchange for greater reductions in Y.
However, unconditional statements of uncertainty can be misleading if managers use them for standard-setting, residual-risk decisions, or risk communication, and especially if others then misinterpret these statements. Consider two situations, involving the same hypothetical chemical, in which the same amount of uncertainty can have different implications, depending on whether it stems
from parameter uncertainty (Situation A) or ignorance about model choice (Situation B). In Situation A, suppose that the uncertainty is due entirely to parameter sampling error in a single available bioassay involving few test animals. If 3 of 30 mice tested in that bioassay developed tumors, then a reasonable central-tendency estimate of the risk to mice at the dose used would be 0.1 (3/30). However, because of sampling error, there is approximately a 5% probability that the true number of tumors might be as low as zero (leading to zero as the lower confidence limit, LCL, of risk) and about a 5% probability that the true number of tumors is 6 or higher (leading to 0.2 (6/30) as the upper confidence limit, UCL, of risk).
In Situation B, suppose instead that the uncertainty is due entirely to ambiguity over which model of biological effect is correct. In this hypothetical situation, there was one bioassay in which 200 of 1,000 mice developed tumors; the risk to mice at the dose would be 0.2 (with essentially no parameter uncertainty due to the very large sample size). But suppose scientists disagree about whether the effect in mice is at all relevant to humans, because of profound metabolic or other differences between the two species, but can agree to assign equal probabilities of 50% to each eventuality. In this case as well, the LCL of the risk to humans would be zero (if the "nonrelevance" theory were correct), and the UCL would be 0.2 (if the "relevance" theory were correct), and it would be tempting to report a "central estimate" of 0.1, corresponding to the expected value of the two possible outcomes, weighted by their assigned probabilities. In either Situation A or B, it would be mathematically correct to say the following: "The expected value of the estimate of the number of annual excess cancer deaths nationwide caused by exposure to this substance is 1,000; the LCL of this estimate is zero deaths, and the UCL is 2,000 deaths."³
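The two hypothetical situations can be sketched numerically. The tail probabilities below are computed at the point estimate under a simple binomial model, a rough stand-in for the confidence-limit construction described above (which is why they come out near, but not exactly at, 5%).

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k responders among n animals at risk p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Situation A: 3 tumors in 30 animals; central-tendency estimate 0.1.
n, p_hat = 30, 3 / 30
p_zero = binom_pmf(0, n, p_hat)                       # chance of 0 tumors
p_six_plus = 1 - sum(binom_pmf(k, n, p_hat) for k in range(6))

# Situation B: pure model uncertainty, equal 50/50 weights between
# "not relevant to humans" (risk 0) and "relevant" (risk 0.2).
expected_b = 0.5 * 0.0 + 0.5 * 0.2                    # hybrid "central" value
```

Both situations can yield the same summary statistics (a central value of 0.1 with limits of 0 and 0.2), even though one reflects sampling error and the other a dichotomous scientific dispute.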
We contend that in such cases, which typify the two kinds of uncertainties that risk managers must deal with, it would be a mistake simply to report the confidence limits and expected value in Situation B as one might do more routinely in Situation A, especially if one then used these summary statistics to make a regulatory decision. The risk-communication problem in treating this dichotomous model uncertainty (Situation B) as though it were a continuous probability distribution is that it obscures important information about the scientific controversy that must be resolved. Risk managers and the public should be given the opportunity to understand the sources of the controversy, to appreciate why the subjective weights assigned to each model are at their given values, and to judge for themselves what action is appropriate when the two theories, at least one of which must be incorrect, predict such disparate outcomes.
More critically, the expected value in Situation B might have dramatically different properties as an estimate for decision-making from the one in Situation A. The estimate of 1,000 deaths in Situation B is a contrivance of multiplying subjective weights that corresponds to no possible true value of risk, although this is not itself a fatal flaw; indeed, it is possible that a strategy of deliberately
inviting errors of both overprotection and underprotection at each decision will turn out to be optimal over a long-run set of similar decisions. The more fundamental problem is that any estimate of central tendency does not necessarily lead to optimal decision-making. This would be true even if society had no desire to make conservative risk management decisions.
Simply put, although classical decision theory does encourage the use of expected values that take account of all sources of uncertainty, it is not in the decision-maker's or society's best interest to treat fundamentally different predictions as quantities that can be "averaged" without considering the effects of each prediction on the decision that it leads to. It is possible that a coin-toss gamble between zero deaths and 2,000 deaths would lead a regulator rationally to act as though 1,000 deaths were the certain outcome. But this is only a shorthand description of the actual process of expected-value decision-making, which asks how the decisions that correspond to estimates of zero deaths, 1,000 deaths, and 2,000 deaths perform relative to each other, in light of the possibility that each estimate (and hence each decision) is wrong. In other words, the choice to use an unconditional PDF when there is the kind of model uncertainty shown in Situation B is a choice between the possibility of overprotecting or underprotecting (if one model is accepted and the other rejected) and the certainty of erring in one direction or the other if the hybrid estimate of 1,000 is constructed. Because in this example the outcomes are numbers that can be manipulated mathematically, it is tempting to report the average, but this would surely be nonsensical if the outcomes were not numerical. If, for example, there were model uncertainty about where on the Gulf Coast a hurricane would hit, it would be sensible to elicit subjective judgment about the probability that a model predicting that the storm would hit in New Orleans was correct, versus the probability that an alternative model (say, one that predicted that the storm would hit in Tampa) was correct. It would also be sensible to assess the expected losses of lives and property if relief workers were irrevocably deployed in one location and the storm hit the other ("expected" losses in the sense of probability times magnitude).
It would be foolish, however, to deploy workers irrevocably in Alabama on the grounds that it was the "expected value" of halfway between New Orleans and Tampa under the model uncertaintyand yet this is just the kind of reasoning invited by indiscriminate use of averages and percentiles from distributions dominated by model uncertainty.
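The hurricane example can be restated as a small expected-loss table. The loss figures below are invented; the point is only that expected-value reasoning should range over the available decisions, not average the predicted locations themselves.

```python
# Hypothetical expected-loss table for the hurricane example.
# Loss numbers are invented for illustration only.

p_nola, p_tampa = 0.5, 0.5   # subjective model weights

# losses[deployment site][actual landfall]
losses = {
    "New Orleans": {"New Orleans": 10, "Tampa": 100},
    "Tampa":       {"New Orleans": 100, "Tampa": 10},
    "Alabama":     {"New Orleans": 80, "Tampa": 80},  # the "average" location
}

# Expected loss of each DECISION, weighted by the model probabilities.
expected = {
    site: p_nola * row["New Orleans"] + p_tampa * row["Tampa"]
    for site, row in losses.items()
}
best = min(expected, key=expected.get)
```

Under these assumed numbers, deploying at either plausible landfall has a lower expected loss than deploying at the geographic "average," which loses badly no matter where the storm hits.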
Therefore, we recommend that analysts present separate assessments of the parameter uncertainty that remains for each independent choice of the underlying model(s) involved. This admonition is not inconsistent with our view that model uncertainty is important and that the ideal uncertainty analysis should consider and report all important uncertainties; we simply suspect that comprehension and decision-making might suffer if all uncertainties are lumped together indiscriminately. The subjective likelihood that each model (and hence each parameter uncertainty distribution) might be correct should still be elicited and
reported, but primarily to help the decision-maker gauge which depiction of risk and its associated parameter uncertainty is the correct one, and not to construct a single hybrid distribution (except for particular purposes involving priority-setting, resource allocation, etc.). In the hypothetical Situation B, this would mean presenting both models, their predictions, and their subjective weights, rather than simple summary statistics, such as the unconditional mean and UCL.
The existence of default options for model uncertainty (as discussed in the introduction to Part II and in Chapter 6) also places an important curb on the need for and use of unconditional depictions of uncertainty. If, as we recommend, EPA develops explicit principles for choosing and modifying its default models, it will further codify the practice that for every risk assessment, a sequence of "preferred" model choices will exist, with only one model being the prevailing choice at each inference point where scientific controversy exists. Therefore, the "default risk characterization," including uncertainty, will be the uncertainty distribution (embodying the various sources of parameter and scenario uncertainty) that is conditional on the approved choices for dose-response, exposure, uptake, and other models made under EPA's guidelines and principles. For each risk assessment, this PDF, rather than the single point estimate currently in force, should serve as the quantitative-risk input to standard-setting and residual-risk decisions that EPA will make under the act.
Thus, given the current state of the art and the realities of decision-making, model uncertainty should play only a subsidiary role in risk assessment and characterization, although it might be important when decision-makers integrate all the information necessary to make regulatory decisions. We recognize the intellectual and practical reasons for presenting alternative risk estimates and PDFs corresponding to alternative models that are scientifically plausible, but that have not supplanted a default model chosen by EPA. However, we suggest that to create a single risk estimate or PDF out of various different models not only could undermine the entire notion of having default models that can be set aside for sufficient reason, but could lead to misleading and perhaps meaningless hybrid risk estimates. We have presented this discussion of the pitfalls of combining the results of incompatible models to support our view that caution is needed in applying these techniques in EPA's risk assessments. Such techniques should not be used for calculating unit risk estimates, because of the potential for misinterpretation of the quantitative risk characterization.⁴ However, we encourage risk assessors and risk managers to work closely together to explore the implications of model uncertainty for risk management, and in this context explicit characterization of model uncertainty may be helpful. The characterization of model uncertainty may also be appropriate and useful for risk communication and for setting research priorities.
Finally, an uncertainty analysis that carefully keeps separate the influence of fundamental model uncertainties versus other types of uncertainty can reveal which controversies over model choice are actually important to risk management and which are "tempests in teapots." If, as might often be the case, the effect of all parameter uncertainties (and variabilities) is as large as or larger than that contributed by the controversy over model choice, then resolving that controversy would not be a high priority. In other words, if the "signal" to be discerned by a final answer as to which model or inference option is correct is not larger than the "noise" caused by parameter uncertainty in either (or all) model(s), then effort should be focused on data collection to reduce the parameter uncertainties, rather than on basic research to resolve the modeling controversies.
Specific Guidance On Uncertainty Analysis
Generating Probability Distributions
The following examples indicate how probability distributions might be developed in practice and illustrate many of the principles and recommended procedures discussed earlier in the chapter.
A second opportunity, which allows the analyst to draw out some of the model uncertainty in dose-response relationships, stems from the flexibility of the LMS model. Even though this model is often viewed as unduly restrictive (e.g., it does not allow for thresholds or for "superlinear" dose-response relations at low doses), it is inherently flexible enough to account for sublinear dose-response relations (e.g., a quadratic function) at low doses. EPA's point-estimation procedure forces the q1* value to be associated with a linear low-dose model, but there is no reason why EPA could not fit an unrestricted model through all the values on the binomial uncertainty distribution of tumor response, thereby generating a distribution for potency that might include some probability that the true dose-response function is of quadratic or higher order (see, for example, Guess et al., 1977; Finkel, 1988).
Finally, EPA could account for another source of parameter uncertainty if it made use of more than one data set for each carcinogen. Techniques of meta-analysis, more and more frequently used to generate composite point estimates by averaging together the results of different studies (e.g., a second mouse study that might have found 20 leukemic animals out of 50 at the same dose), can perhaps more profitably be used to generate a composite uncertainty distribution. This distribution could be broader than the binomial distribution that would arise from considering the sampling uncertainty in a single study, if the new study contradicted the first, or it could be narrower, if the results of each study were reinforcing (i.e., each result was well within the uncertainty range of the other).
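One simple meta-analytic pooling scheme can be sketched as follows (fixed-effect, inverse-variance weighting, a common textbook choice rather than an EPA procedure). The counts are hypothetical; the 20-out-of-50 study echoes the hypothetical second mouse study mentioned above.

```python
from math import sqrt

# Two hypothetical bioassays at the same dose (counts are illustrative).
studies = [(3, 30), (20, 50)]   # (tumor-bearing animals, animals tested)

def estimate(k, n):
    """Point estimate and binomial standard error for one study."""
    p = k / n
    se = sqrt(p * (1 - p) / n)
    return p, se

# Fixed-effect pooling: inverse-variance weighted average of the studies.
ests = [estimate(k, n) for k, n in studies]
weights = [1 / se**2 for _, se in ests]
pooled = sum(w * p for (p, _), w in zip(ests, weights)) / sum(weights)

# If the studies disagree badly, the between-study spread signals that a
# composite uncertainty distribution should be broader than either
# study's sampling error alone; if they agree, it can be narrower.
between_study_spread = max(p for p, _ in ests) - min(p for p, _ in ests)
```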
Statistical Analysis of Generated Probabilities
Once the needed subjective and objective probability distributions are estimated for each variable in the risk assessment, the estimates can be combined to determine their impact on the ultimate risk characterization. Joint distributions of input variables are often mathematically intractable, so an analyst must use approximating methods, such as numerical integration or Monte Carlo simulation. Such approximating methods can be made arbitrarily precise with sufficient computational effort. Numerical integration replaces the familiar operations of integral calculus by summing the values of the dependent variable(s) on a very fine (multivariate) grid of the independent variables. Monte Carlo methods are similar, but sum the values calculated at random points on the grid; this is especially advantageous when the number or complexity of the input variables is so large that the cost of evaluating all points on a sufficiently fine grid would be prohibitive. (For example, if each of three variables is examined at 100 points in all possible combinations, the grid would require evaluation at 100^3 = 1,000,000 points, whereas a Monte Carlo simulation might provide results that are almost as accurate with only 1,000-10,000 randomly selected points.)
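A minimal Monte Carlo sketch of the approach just described, using a hypothetical multiplicative risk model (concentration × intake × unit potency); the input distributions are illustrative assumptions, not values drawn from any actual assessment:

```python
import random

random.seed(1)

def risk_sample():
    # Hypothetical input distributions for a multiplicative risk model:
    # air concentration x daily intake x unit potency.
    conc = random.lognormvariate(0.0, 0.5)       # mg/m^3
    intake = random.uniform(10.0, 30.0)          # m^3/day
    potency = random.lognormvariate(-11.0, 1.0)  # risk per (mg/day)
    return conc * intake * potency

# 10,000 random draws stand in for the 100^3 = 1,000,000 evaluations that
# a 100-point-per-variable grid would require.
draws = sorted(risk_sample() for _ in range(10_000))
mean = sum(draws) / len(draws)
median = draws[len(draws) // 2]
p95 = draws[int(0.95 * len(draws))]
print(f"mean={mean:.2e}  median={median:.2e}  95th percentile={p95:.2e}")
```

The output distribution of `draws` can then be summarized at whatever fractiles the risk manager needs, rather than collapsed to a single point estimate.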
Barriers to Quantitative Uncertainty Analysis
The primary barriers to determining objective probabilities are lack of adequate scientific understanding and lack of needed data. Subjective probabilities are also not always available. For example, if the fundamental molecular-biologic bases of some hazards are not well understood, the associated scientific
uncertainties cannot be reasonably characterized. In such a situation, it would be prudent public-health policy to adopt inference options from the conservative end of the spectrum of scientifically plausible available options. Quantitative dose-response assessment, with characterization of the uncertainty in the assessment, could then be conducted conditional on this set of inference options. Such a "conditional risk assessment" could then routinely be combined with an uncertainty analysis for exposure (which might not be subject to fundamental model uncertainty) to yield an estimate of risk and its associated uncertainty.
The committee recognizes the difficulties of using subjective probabilities in regulation. One is that someone would have to provide the probabilities to be used in a regulatory context. A "neutral" expert from within EPA or at a university or research center might not have the knowledge needed to provide a well-informed subjective probability distribution, whereas those who might have the most expertise might have or be perceived to have a conflict of interest, such as persons who work for the regulated source or for a public-interest group that has taken a stand on the matter. Allegations of conflict of interest or lack of knowledge regarding a chemical or issue might damage the credibility of the ultimate product of a subjective assessment. We note, however, that most of the same problems of real or perceived bias pervade EPA's current point-estimation approach.
At bottom, what matters is how risk managers and other end-users of risk assessments interpret the uncertainty in risk analysis. Correct interpretation is often difficult. For example, risks expressed on a logarithmic scale are commonly misinterpreted by assuming that an error of, say, a factor of 10 in one direction balances an error of a factor of 10 in the other. In fact, if a risk is expressed as 10^-5 within a factor of 100 uncertainty in either direction, the average risk is approximately 1/2,000, rather than 1/100,000. In some senses, this is a problem of risk communication within the risk-assessment profession, rather than with the public.
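The arithmetic behind the 1/2,000 figure can be checked directly, under the simplest reading that the true risk is equally likely to be a factor of 100 above or below the stated value:

```python
# Stated risk: 1e-5, uncertain "within a factor of 100 in either direction."
# Simplest reading: the two extremes are equally likely.
stated = 1e-5
low, high = stated / 100, stated * 100

avg = (low + high) / 2  # errors of a factor of 100 do NOT cancel on average
print(f"average risk = {avg:.2e}, i.e., about 1/{round(1 / avg)}")
# -> average risk = 5.00e-04, i.e., about 1/2000
```

The upper extreme dominates the average: multiplicative errors that look symmetric on a log scale are highly asymmetric on the arithmetic scale on which expected harm is measured.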
Contrary to EPA's statement that the quantitative techniques suggested in this chapter "require definition of the distribution of all input parameters and knowledge of the degree of dependence (e.g., covariance) among parameters" (EPA, 1991f), complete knowledge is not necessary for a Monte Carlo or similar approach to uncertainty analysis. In fact, such a statement is a tautology: it is the uncertainty analysis that tells scientists how their lack of "complete knowledge" affects the confidence they can have in their estimates. Although it is always better to be precise about how uncertain one is, an imprecise statement of uncertainty reflects how uncertain the situation is; it is far better to acknowledge this than to respond to the "lack of complete knowledge" by holding fast to a "magic number" that one knows to be wildly overconfident. Uncertainty analysis simply estimates the logical implications of the assumed model and whatever assumed or empirical inputs the analyst chooses to use.
The difficulty in documenting uncertainty can be reduced by the use of uncertainty guidelines that will provide a structure for how to determine uncertainty for each parameter and for each plausible model. In some cases, objective probabilities are available for use. In others, a subjective consensus about the uncertainty may be based on whatever data are available. Once these decisions are documented, many of the difficulties in determining uncertainty can be alleviated. However, it is important to note that consensus might not be achieved. If a "first-cut" characterization of uncertainty in a specific case is deemed to be inappropriate or superseded by new information, it can be changed by means of such procedures as those outlined in Chapter 12.
The development of uncertainty guidelines is important, because a lack of clear statements as to how to address uncertainty in risk assessment might otherwise lead to continuing inconsistency in the extent to which uncertainty is explicitly considered in assessments done by EPA and other parties, as well as to inconsistencies in how uncertainty is quantified. Developing guidelines to promote consistency in efforts to understand the uncertainty in risk assessment should improve regulatory and public confidence in risk assessment, because guidelines would reduce inappropriate inconsistencies in approach, and where inconsistencies remain, they could help to explain why different federal or state agencies come to different conclusions when they analyze the same data.
Risk Management And Uncertainty Analysis
The most important goal of uncertainty analysis is to improve risk management. Although the process of characterizing the uncertainty in a risk analysis is also subject to debate, it can at a minimum make clear to decision-makers and the public the ramifications of the risk analysis in the context of other public decisions. Uncertainty analysis also allows society to evaluate judgments made by experts when they disagree, an especially important attribute in a democratic society. Furthermore, because problems are not always resolved and analyses often need to be repeated, identification and characterization of the uncertainties can make the repetition easier.
Single Estimates of Risk
Once EPA succeeds in supplanting single point estimates with quantitative descriptions of uncertainty, its risk assessors will still need to summarize these distributions for risk managers (who will continue to use numerical estimates of risk as inputs to decision-making and risk communication). It is therefore crucial to understand that uncertainty analysis is not about replacing "risk numbers" with risk distributions or any other less transparent method; it is about consciously selecting the appropriate numerical estimate(s) out of an understanding of the uncertainty.
Regardless of whether the applicable statute requires the manager to balance uncertain benefits and costs or to determine what level of risk is "acceptable," a bottom-line summary of the risk is a very important input, as it is critical to judging how confident the decision-maker can be that benefits exceed costs, that the residual risk is indeed "acceptable," or whatever other judgments must be made. Such summaries should include at least three types of information: (1) a fractile-based summary statistic, such as the median (the 50th percentile) or a 95th-percentile upper confidence limit, which denotes the probability that the uncertain quantity will fall an unspecified distance above or below some associated value; (2) an estimate of the mean and variance of the distribution, which along with the fractile-based statistic provides crucial information about how the probabilities and the absolute magnitudes of errors interrelate; and (3) a statement of the potential for errors and biases in these estimates of fractiles, mean, and variance, which can stem from ambiguity about the underlying models, approximations introduced to fit the distribution to a standard mathematical form, or both.
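The first two types of summary information can be computed from any uncertainty distribution; the sketch below assumes a hypothetical lognormal risk distribution (median 10^-5, geometric standard deviation 10), chosen purely for illustration:

```python
import math, random

random.seed(7)

# Hypothetical uncertain risk: lognormal, median 1e-5, geometric
# standard deviation 10 (sigma = ln 10 on the log scale).
sample = sorted(random.lognormvariate(math.log(1e-5), math.log(10))
                for _ in range(50_000))

# (1) Fractile-based summaries.
median = sample[len(sample) // 2]
p95 = sample[int(0.95 * len(sample))]

# (2) Mean and variance, which weight the large-magnitude tail heavily.
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)

print(f"median={median:.2e}  95th percentile={p95:.2e}")
print(f"mean={mean:.2e}  variance={var:.2e}")
# (3) Errors and biases in these summaries (model ambiguity, fitting
# approximations) must still be reported alongside them.
```

Note how far the mean sits above the median for a right-skewed distribution like this one: reporting fractiles alone, or the mean alone, conveys only part of the picture.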
One important issue related to uncertainty is the extent to which a risk assessment that generates a point estimate, rather than a range of plausible values, is likely to be too "conservative" (that is, to excessively exaggerate the plausible magnitude of harm that might result from specified environmental exposures). As the two case studies that include uncertainty analysis (Appendixes F and G) illustrate, these investigations can show whether "conservatism" is in fact a problem, and if so, to what extent. Interestingly, the two studies reach opposite conclusions about "conservatism" in their specific risk-assessment situations; perhaps this suggests that facile conclusions about the "conservatism" of risk assessment in general might be off the mark. On the one hand, the study in Appendix G claims that EPA's estimate of MEI risk (approximately 10^-1) is in fact quite "conservative," given that the study calculates a "reasonable worst-case risk" to be only about 0.0015.6 However, we note that this study essentially compared different and incompatible models for the cancer potency of butadiene, so it is impossible to discern what percentile of this unconditional uncertainty distribution any estimate might be assigned (see the discussion of model uncertainty above). On the other hand, the Monte Carlo analysis of parameter uncertainty in exposure and potency in Appendix F claims that EPA's point estimate of risk from the coal-fired power plant was only at the 83rd percentile of the relevant uncertainty distribution. In other words, a standard "conservative" estimate of risk (the 95th percentile) exceeds EPA's value, in this case by a factor of 2.5. It also appears from Figure 5-7 in Appendix F that there is about a 1% chance that EPA's estimate is too low by more than a factor of 10.
Note that both case studies (Appendixes F and G) fail to distinguish sources of uncertainty from sources of interindividual variability, so the corresponding "uncertainty" distributions obtained cannot be used to properly characterize uncertainty either
in predicted incidence or in predicted risk to some particular (e.g., average, highly exposed, or high-risk) individual (see Chapter 11 and Appendix I-3).
As discussed above, access to the entire PDF allows the decision-maker to assess the amount of "conservatism" implicit in any estimate chosen from the distribution. In cases where the risk manager asks the analyst to summarize the PDF via one or more summary statistics, the committee suggests that EPA might consider a particular kind of point estimate to summarize uncertain risks, in light of the two distinct kinds of "conservatism" discussed in Appendix N-1 (the "level of conservatism," the relative percentile at which the point estimate of risk is located, and the "amount of conservatism," the absolute difference between the point estimate and the mean). Although the specific choice of this estimate should be left to EPA risk managers, and might also need to be flexible enough to accommodate case-specific circumstances, estimates do exist that can account for both the percentile and the relationship to the mean in a single number. For example, EPA could choose to summarize uncertain risks by reporting the mean of the upper 5 percent of the distribution. It is a mathematical truism that (for the right-skewed distributions commonly encountered in risk assessment) the larger the uncertainty, the greater the chance that the mean will exceed any arbitrary percentile of the distribution (see Table 9-4). Thus, the mean of the upper 5 percent is by definition "conservative" with respect both to the overall mean of the distribution and to its 95th percentile, whereas the 95th percentile might not be a "conservative" estimate of the mean. In most situations, the amount of "conservatism" inherent in this new estimator will not be as extreme as it would be if a very high percentile (e.g., the 99.9th) were chosen without reference to the mean.
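A sketch of the proposed estimator, using a hypothetical lognormal risk distribution with median 10^-5; the geometric standard deviations are illustrative:

```python
import math, random

random.seed(11)

def tail_mean_summary(gsd, n=100_000):
    """Compare the 95th percentile with the mean of the upper 5% for a
    lognormal risk distribution with median 1e-5 and the given GSD."""
    sigma = math.log(gsd)
    xs = sorted(random.lognormvariate(math.log(1e-5), sigma)
                for _ in range(n))
    cut = int(0.95 * n)
    p95 = xs[cut]
    mean = sum(xs) / n
    tail_mean = sum(xs[cut:]) / (n - cut)
    return mean, p95, tail_mean

for gsd in (3, 10, 30):
    mean, p95, tail_mean = tail_mean_summary(gsd)
    print(f"GSD={gsd:>2}: mean={mean:.2e}  p95={p95:.2e}  "
          f"upper-5% mean={tail_mean:.2e}")
# The upper-5% mean is "conservative" relative to both the overall mean
# and the 95th percentile, whichever of the two happens to be larger.
```

By construction the mean of the upper 5 percent exceeds both the 95th percentile and the overall mean, which is the property the committee's suggested estimator is meant to guarantee.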
Thus, the issue of uncertainty subsumes the issue of conservatism in point estimates. Point estimates chosen without regard to uncertainty provide only the barest beginnings of the story in risk assessment. Excessive or insufficient conservatism can arise out of inattention to uncertainty, rather than out of a particular way of responding to uncertainty. Actions taken solely to reduce or eliminate potential conservatism will not reduce and might increase the problem of excessive reliance on point estimates.
In summary, EPA's position on the issue of uncertainty analysis (as represented in the Superfund document) seems plausible at first glance, but it might be somewhat muddled. If we know that "all risk numbers are only good to within a factor of 10," why do any analyses? The reason is that both the variance and the conservatism (if any) are case-specific and can rarely be estimated with adequate precision until an honest attempt at uncertainty analysis is made.
Inadequate scientific and technical communication about risk is sometimes a source of error and uncertainty, and guidance to risk assessors about what to
include in a risk analysis should include guidance about how to present it. The risk assessor must strive to be understood (as well as to be accurate and complete), just as risk managers and other users must make themselves understood when they apply concepts that are sometimes difficult. This source of uncertainty in interprofessional communication seems to be almost untouched by EPA or any other official body (AIHC, 1992).
Comparison, Ranking, And Harmonization Of Risk Assessments
As discussed in Chapter 6, EPA makes no attempt to apply a single set of methods to assess and compare default and alternative risk estimates with respect to parameter uncertainty. The same deficiency occurs in the comparison of risk estimates. When EPA ranks risks, it usually compares point estimates without considering the different uncertainties in each estimate. Even for less important regulatory decisions (when the financial and public-health impacts are deemed to be small), EPA should at least make sure that the point estimates of risk being compared are of the same type (e.g., that a 95% upper confidence bound for one risk is not compared with a median value for some other risk) and that each assessment has an informative (although perhaps sometimes brief) analysis of the uncertainty. For more important regulatory decisions, EPA should estimate the uncertainty in the ratio of the two risks and explicitly consider the probabilities and consequences of setting incorrect priorities. For any decisions involving risk-trading or priority-setting (e.g., for resource allocation or "offsets"), EPA should take into account information on the uncertainty in the quantities being ranked so as to ensure that such trades do not increase expected risk and that such priorities are directed at minimizing expected risk. When one or both risks are highly uncertain, EPA should also consider the probability and consequences of greatly erring in trading one risk for another, because in such cases one can lower the risk on average and yet introduce a small chance of greatly increasing risk.
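The probability-of-error idea in risk trading can be illustrated with a hypothetical trade between a well-characterized risk and a much more uncertain one with the same median; both distributions below are assumptions chosen for illustration:

```python
import math, random

random.seed(3)

# Hypothetical trade: swap risk A (well characterized, GSD 2) for risk B
# (same median, far more uncertain, GSD 10).
N = 100_000
gt_1 = gt_10 = 0
for _ in range(N):
    a = random.lognormvariate(math.log(1e-5), math.log(2))
    b = random.lognormvariate(math.log(1e-5), math.log(10))
    gt_1 += b > a
    gt_10 += b > 10 * a

print(f"P(B exceeds A)         ~ {gt_1 / N:.2f}")
print(f"P(B exceeds A tenfold) ~ {gt_10 / N:.2f}")
# The trade is roughly even-odds on which risk is larger, yet carries a
# substantial chance that the swapped-in risk is 10 times worse.
```

This is the pattern the text warns about: a trade that looks acceptable "on average" can still embed a nontrivial probability of greatly increasing risk.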
Finally, EPA sometimes attempts to "harmonize" risk-assessment procedures between itself and other agencies, or among its own programs, by agreeing on a single common model assumption, even though the assumption chosen might have little more scientific plausibility than alternatives (e.g., replacing FDA's body-weight assumption and EPA's surface-area assumption with body weight to the 0.75 power). Such actions do not clarify or reduce the uncertainties in risk assessment. Rather than "harmonizing" risk assessments by picking one assumption over others when several assumptions are plausible and none of the assumptions is clearly preferable, EPA should use the preferred models for risk calculation and characterization, but present the results of the alternative models (with their associated parameter uncertainties) to further inform decision-makers and the public. However, "harmonization" does serve an important
purpose in the context of uncertainty analysis: it will help, rather than hinder, risk assessment if agencies cooperate to choose and validate a common set of uncertainty distributions (e.g., a standard PDF for the uncertain exponent in the "body weight to the X power" equation or a standard method for developing a PDF from a set of bioassay data).
Findings And Recommendations
The committee strongly supports the inclusion of uncertainty analysis in risk assessments despite the potential difficulties and costs involved. Even for lower-tier risk assessments, the inherent problems of uncertainty need to be made explicit through an analysis (although perhaps brief) of whatever data are available, perhaps with a statement about whether further uncertainty analysis is justified. The committee believes that a more explicit treatment of uncertainty is critical to the credibility of risk assessments and to their utility in risk management.
The committee's findings and recommendations are summarized briefly below.
Single Point Estimates and Uncertainty
EPA often reports only a single point estimate of risk as a final output. In the past, EPA has only qualitatively acknowledged the uncertainty in its estimates, generally by referring to its risk estimates as "plausible upper bounds" with a plausible lower bound implied by the boilerplate statement that "the number could be as low as zero." In light of the inability to discern how "conservative" an estimate might be unless one does an uncertainty analysis, both statements might be misleading or untrue in particular cases.
EPA committed itself in a 1992 internal memorandum (see Appendix B) to doing some kind of uncertainty analysis in the future, but the memorandum does not define when or how such analysis might be done. In addition, it does not distinguish between the different types of uncertainty or provide specific examples. Thus, it provides only the first, critical step toward uncertainty analysis.
Comparison of Risk Estimates
EPA makes no attempt to apply a consistent method to assess and compare default and alternative risk estimates with respect to parameter uncertainty. Presentations of numerical values in an incomplete form lead to inappropriate and possibly misleading comparisons among risk estimates.
Harmonization of Risk Assessment Methods
EPA sometimes attempts to "harmonize" risk-assessment procedures between itself and other agencies or among its own programs by agreeing on a single common model assumption, even though the assumption chosen might have little more scientific plausibility than alternatives (e.g., replacing FDA's body-weight assumption and EPA's surface-area assumption with body weight to the 0.75 power). Such actions do not clarify or reduce the uncertainties in risk assessment.
Ranking of Risk
When EPA ranks risks, it usually compares point estimates without considering the different uncertainties in each estimate.
1. Although variability in a risk-assessment parameter across different individuals is itself a type of uncertainty and is the subject of the following chapter, it is possible that new parameters might be incorporated into a risk assessment to model that variability (e.g., a parameter for the standard deviation of the amount of air that a random person breathes each day) and that those parameters themselves might be uncertain (see "uncertainty and variability" section in Chapter 11).
2. It is important to note that the distributions resulting from Bayesian models incorporate various subjective judgments about models, data sets, etc. These judgments are expressed as probability distributions, but the probabilities should not be interpreted as probabilities of adverse effect; rather, they express strengths of conviction as to which models, data sets, etc., might be relevant to assessing risks of adverse effect. This distinction should be kept in mind when such distributions are interpreted and used in risk management as a quantitative expression of uncertainty.
3. Assume that to convert from the risk to the test animals to the predicted number of deaths in the human population, one must multiply by 10,000. Perhaps the laboratory dose is 10,000 times larger than the dose to humans, but 100 million humans are exposed. Thus, for example, a risk of 10^-2 to the test animals would imply (under low-dose linearity) an individual human risk of 10^-6, and hence a predicted 10^-6 × 10^8 = 100 deaths in the exposed population.
4. Note that characterizing risks considering only the parameter uncertainty under the preferred set of models might not be as restrictive as it appears at first glance, in that some of the model choices can be safely recast as parameter uncertainties. For example, the choice of a scaling factor between rodents and humans need not be classified as a model choice between body weight and surface area that calls for two separate "conditional PDFs," but instead can be treated as an uncertain parameter in the equation R_human = R_rodent × BW^a, where a might plausibly vary between 0.5 and 1.0 (see our discussion in Chapter 11). The only constraint in this case is that the scaling model is some power function of BW, the ratio of body weights.
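The recasting in note 4 can be sketched as follows; the body-weight ratio and the uniform distribution on the exponent a are illustrative assumptions, not the committee's recommendation:

```python
import random

random.seed(5)

BW = 70.0 / 0.03   # illustrative human:rodent body-weight ratio (~2,333)

# Treat the interspecies scaling exponent a as an uncertain parameter,
# rather than a binary choice between two discrete models.
factors = sorted(BW ** random.uniform(0.5, 1.0) for _ in range(10_000))
lo, med, hi = factors[500], factors[5_000], factors[9_500]
print(f"scaling factor BW**a: 5th pct={lo:.0f}  median={med:.0f}  "
      f"95th pct={hi:.0f}")
print(f"model endpoints: BW**0.5={BW**0.5:.0f}  BW**1.0={BW:.0f}")
```

The two discrete model choices become the endpoints of a continuous distribution of scaling factors, which can then be propagated through the risk calculation like any other parameter uncertainty.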
5. It is not always clear what percent of the distribution someone is referring to by "correct to within a factor of X." If instead of assuming that the person means 100% confidence, we assumed that the person means 98% confidence, then the factor of X would cover approximately two standard deviations on either side of the median, so one geometric standard deviation would be approximately equal to X^1/2.
6. We arrive at this figure of 0.0015, or 1.5 × 10^-3, by noting that the "base case" for fenceline risk (Table 3-1 in Appendix G) is 5 × 10^-4 and that "worst case estimates were two to three times higher than base case estimates."