The Case for "Plausible Conservatism" in Choosing and Altering Defaults
This Appendix was written by one member of our committee, who was asked to represent the viewpoint of those members of the committee who believe that EPA should choose and refine its default assumptions by continually evaluating them against two equally important standards: whether the assumption is scientifically plausible, and whether it is "conservative" and thus tends to safeguard public health in the face of scientific uncertainty. Indeed, these three themes of plausibility, uncertainty, and conservatism form most of the framework for the last six chapters of the CAPRA report, as reflected in the "cross-cutting" chapters on model evaluation, on uncertainty and variability, and on implementing an iterative risk assessment/management strategy. The particular way these themes should come together in the selection and modification of default assumptions is controversial; hence, the remainder of this appendix is organized into five parts: (1) a general discussion of what "conservatism" does and does not entail; (2) an enumeration of reasons why conservatism is appropriately part of the rationale for choosing and departing from defaults; (3) the specific plan proposed for EPA's consideration;1 (4) a side-by-side analysis of this proposal against the competing principle of "maximum use of scientific information" (see Appendix N-2 following this paper); and (5) general conclusions.
1Although I will discuss and evaluate the general issue of conservatism in detail before I present our specific recommendations, I urge readers to consider whether the proposal detailed in this third section bears any resemblance to the kind of "conservatism for conservatism's sake" that critics decry.
What Is "Conservatism"?
The most controversial aspect of this proposal within the full committee was its emphasis on "conservatism" as one (not the only) organizing principle to judge (not to prejudge) the merits of defaults and their alternatives. Supporters of this proposal are well aware that there are strengths and weaknesses of the conservative orientation that make it one of the most hotly contested topics in all of environmental policy analysis, but also believe that few topics have been surrounded by as much confusion and misinformation. Some observers of risk assessment appear to be convinced that EPA and other agencies have so overemphasized the principle of conservatism as to make most risk estimates alarmingly false and meaningless; others, including at least one member of this committee, have instead suggested that if anything, the claims of these critics tend to be more reflexive, undocumented by evidence, and exaggerated than are EPA's risk estimates themselves (Finkel, 1989). It is clear that partisans cannot agree on either the descriptive matter of whether risk assessment is too conservative or on the normative matter of how much conservatism (perhaps any at all) would constitute an excess thereof. However, at least some of the intensity marking this debate is due to a variety of misimpressions about what conservatism is and what its ramifications are. Before laying out the proposal, therefore, some of these definitional matters will be discussed.
First, a useful definition of conservatism should help clarify it in the face of the disparate charges leveled against it. Conservatism is, foremost, one of several ways to generate risk estimates that allow risk management decisions to be made under conditions of uncertainty and variability. Simply put, a risk assessment policy that ignored or rejected conservatism would strive to always represent risks by their "true values" irrespective of uncertainty (or variability), whereas any attempt to consider adding (or removing) some measure of conservatism would lead the assessor to confront the uncertainty. Incorporating "conservatism" merely means that from out of the uncertainty and/or variability, the assessor would deliberately choose an estimate that he believes is more likely to overestimate than to underestimate the risk.
Rationality in managing risks (as in any endeavor of private or social decision making) involves the attempt to maximize the benefit derived from choice under specific conditions in the world. If we do not know those conditions (uncertainty) or do not know to whom these conditions apply (human interindividual variability), we have to make the choice that would be optimal for a particular set of conditions and essentially hope for the best. If the true risk we are trying to manage is larger or smaller than we think it is (or if there are individuals for whom this is so), then our choice may be flawed, but we still have to choose. Unlike the search for scientific truth, where the "correct" action in the face of uncertainty is to reserve one's judgment, in managing risks decisions are inevitable, since reserving judgment is exactly equivalent to making the judgment that the status quo represents a desirable balance of economic costs expended (if any) and health risks remaining (if any). It is therefore vital that the risk assessment process handle uncertainties in a predictable way that is scientifically defensible, consistent with the Agency's statutory and public missions, and responsive to the needs of decision makers. Conservatism is a specific response to uncertainty that favors one type of error (overestimation) over its converse, but (especially if EPA follows the detailed prescriptions here) the fact that it admits that either type of error is possible is more important than the precise calculus it may use to balance those errors.
It is also crucial to understand what this asymmetry in favor of overestimation does and does not mean. Conservatism is not about valuing human lives above the money spent to comply with risk management decisions. Instead, it acknowledges that if there were no uncertainty in risk, society could "optimally" decide to spend a dollar or a billion dollars to save each life involved; conservatism is silent about this judgment. Assuming that society decides how it wishes to balance lives and dollars, conservatism only affects the decision at the margin, by deliberately preferring, from among the inevitable errors that uncertainty creates, those errors which lead to relatively more dollars spent for the lives saved over those which lead to relatively fewer lives saved for the dollars spent.
Some would call this an orientation disposed to being "better safe than sorry" or a tendency towards "prudence," characterizations we do not dispute or shrink from. It is simply a matter of "good science" to admit that the true value of risk is surrounded by uncertainty, and that as a consequence, errors of overestimation or underestimation can still occur for whatever value of risk one chooses as the basis for risk management. Much detail about conservatism follows in this appendix of the report, but the essence of the disagreement between supporters of this proposal and supporters of the alternative position is simple: the former group believes that it is both prudent and scientifically justified to make reasonable attempts to favor errors of overestimation over those of underestimation. More importantly, it believes that not to do so would be both imprudent and scientifically questionable. This is no mere tautology, but encapsulates the disagreement with others who would argue that to eschew prudence is to advocate something "value-neutral" (and hence a morally superior position for scientists to espouse) and something more "scientific."
The controversies over conservatism are heightened by ambiguous definitions and uses of the term. The following section explains three dichotomies about the precise possible meanings of conservatism, in order to clarify some of the objections to it, and to foreshadow some of the features of this proposal for a principle of "plausible conservatism":
2For two reasons, we believe it is logically consistent to espouse a principle of "plausible conservatism" with regard to model uncertainty and not explicitly recommend the same response to variability: (1) as a pragmatic matter, we believe scientists have more that they alone can contribute to a discussion of how to choose among competing scientific theories than they have to contribute to a discussion of what kind of individuals EPA should try to protect; and (2) we believe the public has more clearly expressed a preference for "erring on the side of safety" when the truth is unknown than it has regarding how much protection to extend to the extremes of variability distributions.
Inherent Advantages of "Plausible Conservatism"
It is perplexing to some members of this committee (and to many in the general populace) that the presumption that society should approach uncertain risks with a desire to be "better safe than sorry" has engendered so much skepticism. After all, perhaps it should instead be incumbent upon opponents of conservative defaults to defend their position that EPA ought to ignore or dilute plausible scientific theories that, if true, would mean that risks need to be addressed concertedly. That view, whatever its intellectual merits, seems at the outset not to give the public what it has consistently called for (explicitly in legislation and implicitly in the general conduct of professions ranging from structural engineering to medicine to diplomacy): namely, the attempt to guard against major errors that threaten health and safety. But the proposal for risk assessment based on "plausible conservatism" came about largely because of the wide variety of other factors supporting it, whether viewed through the lenses of logic, mathematics, procedure, or political economy. The following brief accounting of some of the virtues of a conservative orientation may seem somewhat superfluous, especially given the statements of earlier NRC committees on the topic.3 However, this committee's decision not to endorse "plausible conservatism" by consensus has prompted this more thorough enumeration of some factors that some members had thought were uncontroversial:
A. "Plausible Conservatism" Reflects the Public's Preference Between Errors Resulting in Unnecessary Health Risks and those Resulting in Unnecessary Economic Expenditures.
An examination of the two kinds of errors that uncertainty in risk can cause supports the conclusion that society has not been indifferent between them. One type of error (caused by the overestimation of risk) leads to more resources invested than society would optimally invest if it knew the magnitude of the risk precisely. The other type (caused by underestimation of risk) leads to more lives lost (or more people subjected to unacceptably high individual risks) than society would tolerate if there were no uncertainty in risk. Whether the aversion to the latter type of error is due to the greater irreversibility of its consequences
3For example, consider this recent statement of the BEST Committee on Environmental Epidemiology (NRC, 1991): "public health policy requires that decisions be made despite incomplete evidence, with the aim of protecting public health in the future."
compared to the former,4 the importance of regret (Bell, 1982) in most individual and social decision-making,5 or other factors is beyond our capacity to answer. What matters is whether Congress and the public view risk management as a social endeavor that should strive both for scientific truth and for the prudent avoidance of unnecessary public health risks, and therefore do not view risk assessment purely as an exercise in coming as close to the "right answer" as possible. If this is so, then the competing proposal offered in Appendix N-2 espouses an unscientific value judgment, and one that also is unresponsive to social realities.
A counter-example may be illustrative here. In its recent indictment of conservatism in Superfund risk assessment, an industry coalition drew an extended analogy to link EPA's risk estimates with inflated predictions of the amount of time it would take someone to take a taxi ride to Dulles Airport (Hazardous Waste Cleanup Project, 1993). But this particular personal decision seems to be another prime example of where individuals and society would clearly prefer conservative estimates. As demonstrated below, any level of conservatism (positive, zero, or negative) corresponds to some underlying attitude towards errors of overestimation and underestimation. In this case, a conservative estimate of travel time simply means that the traveller regards each minute she arrives at the airport after the plane leaves as more costly to her than each minute of extra waiting time caused by arriving before the plane leaves. It is hardly surprising to conclude that a rational person would not be indifferent, but would rather be 10 minutes early than 10 minutes late to catch a plane. If, hypothetically, someone advising the traveller told her he wasn't sure whether the airline she chose would have a single ticket agent (and a 20-minute long line) or a dozen agents (and no line), it seems hard to believe that she would ask for a "best estimate" between zero and 20 minutes and allow only that much time (and even less likely she would assume that the long line simply couldn't happen). As long as the more "conservative" scenario was plausible, it would tend to dominate her thinking, simply because the decision problem is not about arriving at exactly the right moment, but about balancing the costs of a very early arrival against the qualitatively different costs of even a slightly late arrival. Again, reasonable people may differ widely about how large either asymmetry should be, but supporters of "plausible conservatism" are hard pressed to imagine not
4It is possible that profligacy in economic resources invested may also lead to adverse health consequences (MacRae, 1992). However, this "richer is safer" theory is based on controversial data (Graham et al., 1993), and at most offsets in an indirect way the more direct and irreversible consequences of underregulation in the eyes of the public.
5Anticipation of regret tends to make people choose courses of action that are less likely to leave them with the knowledge that they failed to take another available action that would have been much less damaging.
admitting that some adjustment to make catching the plane more likely, or the reduction of risk more probable, aligns with the expressed desires of the public.
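The airport example can be made concrete with a toy expected-cost calculation. This is an illustrative sketch, not part of the report: the 50:1 cost ratio, the two-scenario line-length uncertainty, and the function names are all assumptions chosen only to show how an asymmetric loss pushes the rational choice toward the conservative end.

```python
# Toy model of the airport example: the traveller does not know whether
# the ticket line will take 0 or 20 minutes (50/50 odds, an assumed
# dichotomy), and each minute of lateness is assumed to cost 50 times
# as much as each minute of early waiting.

def expected_cost(buffer_min, c_early=1.0, c_late=50.0,
                  p_line=0.5, line_min=20.0):
    """Expected cost of allowing `buffer_min` minutes for the line."""
    cost_no_line = c_early * buffer_min      # no line: whole buffer is waiting
    slack = buffer_min - line_min            # long line: early or late by |slack|
    cost_line = c_early * slack if slack >= 0 else c_late * (-slack)
    return (1 - p_line) * cost_no_line + p_line * cost_line

# Searching over candidate allowances, the optimum is the full
# 20-minute ("conservative") allowance, not the 10-minute midpoint.
best = min(range(31), key=expected_cost)
print(best)                # 20
print(expected_cost(10))   # 255.0 -- the "best estimate" is far costlier
```

The precise optimum depends on the assumed cost ratio, but so long as the long-line scenario is plausible and lateness is costlier than waiting, any such calculation lands at or near the conservative end, which is the point of the passage above.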
B. Conservative Defaults Help Increase the Chances that Risk Estimates Will Not be "Anti-Conservative."
There are two different mathematical aspects of risk assessment under uncertainty that militate in favor of a conservative approach to selection of default options. Both factors tend to make risk estimates generated from conservative models less conservative than they might appear at first glance, and thus tip the balance further in favor of such models as minimally necessary to support prudent decisions.
Let us assume at the outset that the assessor and decision-maker both desire that at the very least, risk estimates should not be "anti-conservative," that is, not underestimate the mean (arithmetic average) of the true but unknown risk. The mean, after all, is the minimum estimator that a so-called "risk-neutral" decision-maker (e.g., a person who is not actually trying to catch a plane, but who stands to win a wager if she arrives at the airport either just before or just after the plane leaves) would need in order to balance errors of overestimation and underestimation. In this regard, there exists a basic mathematical property of uncertain quantities that introduces an asymmetry. For non-negative quantities (such as exposures, potencies, or risks), the uncertainties are generally distributed in such a way that larger uncertainty increases the arithmetic mean, due to the disproportionate influence of the right-hand tail. For example, if the median (50th percentile) of such an uncertainty distribution were X, but the assessor believed that the standard error of that estimate was a factor of 10 in either direction, then the 90th percentile (19X) and the arithmetic mean (14X) would be nearly identical; if the uncertainty were a factor of 25 in either direction, the mean and the 95th percentile would be virtually identical (see Table 9-4). Some of the most familiar examples of the need to impose a moderate "level of conservatism" in order not to underestimate the mean come from empirical data that exhibit variability. For example, it is unlikely, even in a state that includes areas of high radon concentration, that a randomly selected home would have a radon concentration exceeding approximately 10 picocuries/liter. Yet the mean concentration for all homes in that state might equal or even exceed 10 because of the influence on the mean of the small number of homes with much higher levels.6
6This mathematical truism (the more the uncertainty, the greater the level of conservatism required not to underestimate the mean) seriously undermines one of the major claims made by those who accuse EPA of "cascading conservatism." If each of a series of uncertain quantities is distributed in such a way that a reasonably conservative estimator (say, the 95th percentile) approximates or even falls below the mean of that quantity, then the more steps in the cascade, the less conservative the output becomes with respect to the correct risk-neutral estimator.
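The factor-of-10 and factor-of-25 figures quoted in the text, and the footnote's "cascading conservatism" point, can be checked numerically if each uncertainty is modeled as lognormal. That modeling choice, the factor-of-30 spread in the cascade illustration, and the function names are assumptions of this sketch; Table 9-4 in the report remains the authoritative source for the quoted values.

```python
from math import exp, log
from statistics import NormalDist

def lognormal_mean(median, gsd):
    """Arithmetic mean of a lognormal with the given median and
    geometric standard deviation (the 'factor in either direction')."""
    s = log(gsd)
    return median * exp(s * s / 2)

def lognormal_pct(median, gsd, p):
    """p-th percentile of the same lognormal."""
    return median * exp(NormalDist().inv_cdf(p) * log(gsd))

# Median X = 1, uncertainty a factor of 10 in either direction:
print(lognormal_mean(1, 10))        # ~14.2  (the mean, "14X")
print(lognormal_pct(1, 10, 0.90))   # ~19.1  (the 90th percentile, "19X")

# A factor of 25: the mean (~178X) approaches the 95th percentile (~199X).
print(lognormal_mean(1, 25), lognormal_pct(1, 25, 0.95))

def cascade_ratio(gsd, n, p=0.95):
    """Ratio of the product of each factor's p-th percentile to the true
    mean of the product of n independent, identical lognormal factors."""
    s = log(gsd)
    z = NormalDist().inv_cdf(p)
    return exp(n * (z * s - s * s / 2))

# When each factor's 95th percentile falls below its own mean (here, an
# assumed factor-of-30 spread), chaining more such "conservative" steps
# makes the result LESS conservative relative to the risk-neutral mean:
print(cascade_ratio(30, 1))   # ~0.83 -- one step already below the mean
print(cascade_ratio(30, 4))   # ~0.47 -- four steps fall further below it
```

The cascade result reverses the usual intuition: multiplying "95th percentile" inputs can produce an output well below the mean of the product, exactly as the footnote argues.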
The other basic mathematical advantage of introducing some conservatism into the scientific inferences that are made is the expectation that there may be other factors unknown to the assessor which would tend to increase uncertainty. This becomes a stronger argument for conservatism if one believes that more of these unknown influences would tend to increase than to decrease the true risk. Although it seems logical that factors science has not yet accounted for (such as unsuspected exposure pathways, additional mechanisms of toxicity, synergies among exposures, or variations in human susceptibility to carcinogenesis) would tend to add to the number or severity of pathways leading to exposure and/or greater risk, it is possible that "surprises" could also reveal humans to be more resistant to pollutants or less exposed than traditional analyses predict.
C. "Plausible Conservatism" Fulfills the Statutory Mandate under which EPA Operates in the Air Toxics (and many other) Programs.
The policy of preventive action in the face of scientific uncertainty has long been part of the Clean Air Act, as well as most of the other enabling legislation of EPA. Two key directives run through many of the sections of the Clean Air Act in this regard. First, various sections of the Act direct EPA to consider not merely substances that have been shown to cause harm, but those that "may reasonably be anticipated" to cause harm. As the D.C. Circuit court stated in its 1976 decision in Ethyl Corp. v. EPA, "commonly, reasonable medical concerns and theory long precede certainty. Yet the statutes and common sense demand regulatory action to prevent harm, even if the regulator is less than certain that harm is otherwise inevitable." Similarly, the Act has long required standards for air pollutants to provide "an ample margin of safety to protect public health." The leading case on the interpretation of Section 112, the 1987 case of Natural Resources Defense Council v. EPA, declared that
In determining what is an "ample margin" the Administrator may, and perhaps must, take into account the inherent limitations of risk assessment and the limited scientific knowledge of the effects of exposure to carcinogens at various levels, and may therefore decide to set the level below that previously determined to be "safe."…[B]y its nature the finding of risk is uncertain and the Administrator must use his discretion to meet the statutory mandate.
Again, support for the idea that "plausible conservatism" is the most rational approach for EPA to take is not necessarily based on a reading of the various statutes. After all, it is possible that the statutes may be changed in the near or far future. However, it seems central to EPA's mission that the Agency consider whether it is necessary to prevent or minimize adverse events, even events of low probability. Therefore, the Agency inevitably will find it necessary to use risk assessment techniques that are sensitive enough to reflect the risks of those events. At a minimum, its techniques must explore the nature of possible extreme outcomes, as a prelude to science-policy choices as to whether to factor those extremes into its risk characterizations. In essence, conservatism in the
choice of default options is a way of making risk assessment a sensitive enough device to allow risk managers to decide to what extent they can fulfill the intent of the enabling legislation. For this reason, members of the committee advanced the proposition, which proved eventually to be controversial within the committee, that "plausible conservatism" gives decisionmakers some of the information they need to make precautionary risk management decisions.
D. It Respects the Voice of Science, Not Only the Rights of Individual Scientists.
By declaring that defaults would be chosen to be both scientifically supportable and health-protective, and that scientists would have to examine alternative models by these two criteria, EPA could help ensure that science will assume the leading role in defining evolving risk assessment methodology. Some have asserted that it shows disrespect for science to posit any standard for departure from defaults other than one that simply requires EPA to adopt "new and better science at the earliest possible time." But surely there is a generally inverse relationship between the amount of knowledgeable controversy over a new theory and the likely "staying power" and reliability of such "new science." At the extremes, EPA could either change its defaults over and over again with each new individual voice it hears complaining that a default is passé, or never change a default until absolute scientific unanimity had congealed and remained unshakable for some number of years. The "persuasive evidence" standard proposed here (see below) clearly falls between these two extremes. It reflects our belief that standards which rely more on scientific consensus than on the rights of individual scientists dissatisfied with the current situation are in fact more respectful of science as an institution.
The only cost to a standard that values scientific consensus over "heed the loudest voice you hear" is that advocates of "new science" need to persuade the mainstream of their colleagues that new is indeed better. This standard is in fact a bargain for scientists, because it buys credibility in the public arena and some degree of immunity against being undercut by the next new theory that comes along. And, in addition to this give-and-take principle that elevates respect for scientific decisions by valuing the concord of scientists, advocates of "new science" must appreciate that the twin standards of plausibility and conservatism in fact remove a major source of arbitrariness in EPA's science-policy apparatus. If the Agency merely held up its defaults as unconnected "rules we live by" and required scientists to prove them "wrong," then the charge of bureaucracy-over-science would have merit. But this recommendation for EPA to reaffirm or rethink the set of defaults as "the most conservative of the plausible spectrum" sends a clear signal to the scientific community that each default only has merit insofar as it embodies those twin concepts, and gives scientists two clear bases for challenging and improving the set of inference assumptions.
E. It Generates Routinely those Risk Estimates Essential to Various EPA Functions.
The committee was also unable to reach agreement on the details of what roles "nonconservative" estimates should play in standard setting, priority setting, and risk communication, although the committee's recommendations in Chapter 9 reflect its belief that such estimates have utility in all of these arenas. However, no one has suggested that "nonconservative" estimates should drive out estimates produced via "plausible conservatism," but rather that they should supplement them. Indeed, the committee agrees that conservative estimates must be calculated for at least two important risk assessment purposes: (1) the foundation of the iterative system of risk assessment the committee has proposed is the screening-level analysis. Such analyses are solely intended to obviate the need for detailed assessment of risks that can to a high degree of confidence be deemed acceptable or de minimis. By definition, therefore, screening analyses must be conservative enough to eliminate the possibility that an exposure that indeed might pose some danger to health or welfare will fail to receive full scrutiny; and (2) even if EPA decided to use central-tendency risk estimates for standard-setting or other purposes, it would first have to explore the conservative end of the spectrum in order to have any clear idea where the expected value of the uncertain risk (as discussed above, the correct central-tendency estimate for a risk-neutral decision) actually falls. Because of the sensitivity of the expected value of a distribution to its right-hand tail, one cannot simply arrive at this midpoint in one step.7
For both reasons, risk assessment cannot proceed without the attempt to generate a conservative estimate, even if that estimate is only an input to a subsequent process. Therefore, the only argument among us is whether to modify or discard such estimates for some purposes other than screening or calculation of central tendencies, not whether they should be generated at all. Either way, a set of default assumptions embodying "plausible conservatism" must play some role.
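The claim that one "cannot simply arrive at this midpoint in one step" (see footnote 7) can be quantified. If the uncertainty is modeled as lognormal (an illustrative assumption, as are the function names below), a standard partial-expectation identity gives the fraction of the arithmetic mean contributed by values above any chosen percentile:

```python
from math import log
from statistics import NormalDist

def tail_share(gsd, p=0.95):
    """Fraction of a lognormal's arithmetic mean contributed by values
    above its p-th percentile.  By the partial-expectation identity for
    lognormals, this share equals Phi(sigma - z_p), with sigma = ln(gsd)."""
    nd = NormalDist()
    return nd.cdf(log(gsd) - nd.inv_cdf(p))

# With a factor-of-10 uncertainty, the top 5% of the distribution
# carries roughly three-quarters of the mean -- so the expected value
# cannot be located without first characterizing the conservative tail.
print(tail_share(10, 0.95))   # ~0.74

# The dichotomous case is even simpler: if the risk is Y with
# subjective probability q and zero otherwise, the expected value is
# just q * Y, computable only once Y (the upper bound) is estimated.
def dichotomous_ev(upper_bound_y, prob_q):
    return prob_q * upper_bound_y
```

In both cases the conservative end of the distribution must be estimated before the central tendency can be, which is the procedural point made above.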
F. It Promotes an Orderly, Timely Process that Realistically Structures the Correct Incentives for Research.
Many observers of risk assessment have pointed out that the scientific goal of "getting the right answer" for each risk assessment question conflicts directly with the regulatory and public policy goals of timeliness and striking a balance
7See Table 9-4 for various calculations showing how, if the uncertainty is distributed continuously, the arithmetic mean can be very sensitive to the conservative percentiles. If instead the uncertainty is dichotomous (say, the risk were either Y or zero depending on which of two models was correct), the expected value would depend completely on the value of Y and the subjective probability assigned to it. In either case, the upper bound must be estimated before the mean can be.
between limited resources available for research and those available for environmental protection itself. The committee agreed that too much emphasis on fine-tuning the science can lead to untoward delay; our real disagreement again comes down to the question of how to initiate and structure the process of modifying science-based inferences. As discussed in the preceding paragraph, one advantage of starting from a conservative stance and declaring the true central tendency as the ultimate goal is that it arguably is easier to move towards this desired midpoint (given the influence of the conservative possibility on it) than to start by trying to guess where that midpoint might be. There is also a procedural advantage to a conservative starting point, however, which stems from a frank assessment of the resources and natural motivations available to different scientific institutions. Some of us believe that an evaluation of the relative effort over the last decade or so devoted to positing and studying less conservative risk models (e.g., threshold and sublinear extrapolation models, cases where humans are less sensitive than test animals) versus the converse (e.g., synergies among exposures, cases where negative rodent tests might not spell safety for humans) reveals an asymmetry in research orientation, with the former type of research garnering much more resources and attention than the latter. This orientation is not necessarily either pernicious or unscientific, but EPA should make use of it rather than pretend it does not exist. The best way for the Agency to do so, we believe, is to begin with a stance of "plausible conservatism" and establish explicit procedures, based on peer review and full participation, that demonstrate convincingly that the Agency understands it must be receptive to new scientific information. This takes advantage of the tendency to preferentially test less conservative theories. 
Moreover, EPA must communicate to the public that a general tendency for risk estimates to become less conservative (in absolute terms) over time is not evidence of EPA bias, but of an open and mutual covenant between the Agency and the scientific community searching for better models.
G. It Reflects EPA's Fundamental Public Mission as a Scientific/Regulatory Agency.
As discussed below, advocates of "best estimates" frequently fail to consider how difficult, error-prone, and value-laden the search for such desirable end points can be. Since CAPRA has been asked to suggest improvements in the methodology by which EPA assesses risks from exposures to hazardous air pollutants, it is also incumbent upon us at least to remark on the purpose of such risk estimates. Part of our disagreement on the entire set of defaults issues arises because there are two purposes for risk estimates: to accurately describe the true risks, if possible, and to identify situations where risks might be worth reducing. Other government agencies also have to serve the two masters of truth and decision, yet their use of analysis does not seem to arouse so much controversy. Military intelligence is an empirical craft that resembles risk assessment in its reliance on data and judgment, but there have been few exhortations that the
Department of Defense (DOD) should develop and rely on "best estimates" of the probability of aggression, rather than on accepted estimates of how high those probabilities might reasonably be. There is room for vigorous descriptive disagreement about the extent of conservatism in DOD predictions, and for normative argument about the propriety thereof, but these are questions of degree that do not imply DOD should abandon or downplay its public mission in favor of its "scientific" mission.8
Specific Recommendations To Implement This Principle
Members of the committee who advocate that EPA should choose and modify its defaults with reference to the principle of "plausible conservatism" have in mind a very specific process to implement this principle, in order to accentuate its usefulness along the criteria discussed in the introduction to Part II of the report, and to minimize its potential drawbacks. In light of the controversy these four recommended procedures engendered within the committee, this section will emphasize what our vision of "plausible conservatism" does not involve or sanction, even though these features apparently were not sufficient to stanch the opposition to the proposal.
Step 1 In each instance within the emissions and exposure assessment or the toxicity assessment phase of risk assessment where two or more fundamentally different scientific (i.e., biological, physical, statistical, or mathematical) assumptions or models have been advanced to bridge a basic gap in our knowledge, EPA should first determine which of these models are deemed "plausible" by knowledgeable scientists. As an example, let us assume that scientists who believe benign rodent tumors can be surrogates for malignant tumors would admit that the opposite conclusion is also plausible, and vice versa. Then, from this "plausible set," EPA should adopt (or should reaffirm) as a generic default that model or assumption which tends to yield risk estimates more conservative than the other plausible choices. For example, EPA's existing statement (III.A.2 from the 1986 cancer guidelines) that chemicals may be radiomimetic at low doses, and thus that the linearized multistage model (LMS) is the appropriate default for exposure-response extrapolation, is not a statement of scientific fact, but is the preferred science-policy choice, for three reasons: (1) the scientific conclusion that the LMS model has substantial support in biologic theory and observational data (so it cannot be rejected as "absolutely implausible"); (2) the scientific conclusion that no other extant model has so much more grounding in theory and observation so as to make the LMS fail a test of "relative plausibility"; and (3) the empirical observation that the LMS model gives more conservative results than other plausible models.9
8Note that these 7 advantages of conservatism are not an exhaustive list. Others that could have been discussed include: this proposal is close to what EPA already does; it jibes with the rest of the CAPRA report; it is also motivated by some pure management issues, notably the potential problem of a bias towards exaggeration in the cost figures that risk estimates are compared to.
Step 2 Armed with this set of scientifically supportable and health-protective models, EPA should then strive to amass and communicate information about the uncertainty and variability in the parameters that drive these models.10 The uncertainty distributions that result from such analyses will permit the risk manager to openly choose a level of conservatism concordant with the particular statutory, regulatory, and economic framework, confident that regardless of the level of conservatism chosen, the risk estimate will reflect an underlying scientific structure that is both plausible and designed to avoid the gross underestimation of risk. In Chapters 9 and 11, the committee supports this notion that the level of conservatism should be chosen quantitatively with reference to parameter uncertainty and variability, but qualitatively with reference to model uncertainty (i.e., under this proposal, models would be chosen to represent the "conservative end of the spectrum of plausible models"). Although the "plausible conservatism" proposal per se was not unanimously agreed to, the entire committee does share the concern that attempts to precisely fine-tune the level of conservatism implicit in the model structure may lead to implausible or illogical compromises that advance neither the values of prudence nor of scientific integrity.
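The division of labor envisioned in Step 2, in which the assessor characterizes parameter uncertainty and the manager then openly selects a percentile of the resulting risk distribution, can be sketched with a toy Monte Carlo analysis. All distributions and numerical values below are illustrative assumptions for exposition, not EPA defaults.

```python
import random

random.seed(0)

# Hypothetical lognormal uncertainty in two model parameters:
# carcinogenic potency (risk per mg/kg-day) and lifetime average
# dose (mg/kg-day). Both distributions are invented for illustration.
def sample_risk():
    potency = random.lognormvariate(-7.0, 1.5)  # median ~ 9e-4
    dose = random.lognormvariate(-2.0, 0.8)     # median ~ 0.14
    return potency * dose

risks = sorted(sample_risk() for _ in range(100_000))

def percentile(xs, p):
    """p-th percentile of a sorted sample."""
    return xs[int(p / 100 * (len(xs) - 1))]

# Rather than receiving one point estimate of unknown conservatism,
# the risk manager can choose an explicit level of conservatism
# (a percentile of the uncertainty distribution).
for p in (50, 90, 95, 99):
    print(f"{p}th percentile risk: {percentile(risks, p):.2e}")
```

The model structure (potency times dose) stays fixed at the conservative end of the plausible set, per Step 1; only the parameter percentile is tuned to the decision context.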
Step 3 EPA should then undertake two related activities to ensure that its resulting risk estimates are not needlessly conservative, or misunderstood by some or all of its audience. These steps are important even though by definition, risk estimates emerging from a framework of "plausible conservatism" cannot be ruled out as flatly impossible without some empirical basis (since they are based on a series of assumptions, each of which has some scientific support, the chain of assumptions must also be logically plausible, if perhaps unlikely). As some observers have pointed out, however, such estimates may be higher than some judge as necessary to support precautionary decisions (Nichols and Zeckhauser, 1988; OMB, 1990). A quantitative treatment of uncertainty and an explicit choice of the level of conservatism with respect to parameter uncertainty, as recommended here and in Chapter 9, will help minimize this potential problem. EPA can mitigate these concerns still further by: (1) calibrating its risk estimates against available "reality checks," such as the upper confidence limit on human carcinogenic potency one can sometimes derive in the absence of positive epidemiologic data (Tollefson et al., 1990; Goodman and Wilson, 1991) or physical or observational constraints on the emissions estimates used or the ambient concentration estimates generated by the exposure models used; and (2) clearly communicating that its risk estimates are intended to be conservative (and are based on plausible but precautionary assumptions).
9EPA should be mindful of the distinction between "plausible as a general rule" and "plausible as an occasional exception" in choosing its generic defaults, and only consider the former at this stage (i.e., if a particular model is not plausible as a means of explaining the general case, it should be reserved for consideration in specific situations where a departure may be appropriate). For example, a more conservative model than the LMS model, a "superlinear" polynomial allowing for fractional powers of exposure (Bailar et al., 1988), may be plausible for certain individual chemicals but appears at present not to pass a consensus threshold of scientific plausibility as a generic rule to explain all exposure-response relationships. On the other hand, less conservative models such as the M-V-K model do cross this threshold as plausible-in-general but would not yet qualify as appropriate generic defaults under the "plausible conservatism" principle.
10As the committee discusses in its recommendations regarding "iteration," the level of effort devoted to supplanting point estimates of parameters with their corresponding uncertainty or variability distributions should be a function of the "tier" dictated by the type and importance of the risk management decision. For screening analyses, conservative point estimates within the rubric of the prevailing models will serve the needs of the decision, whereas for higher-tier analyses uncertainty distributions will be needed.
In improving its risk communication, EPA should try to avoid either underestimating the level of conservatism (e.g., EPA's current tendency to imply that its estimates are "95th percentile upper bounds" when they really comprise several such inputs that, in combination with other nonconservative inputs, might still yield an output more conservative than the 95th percentile) or overstating the amount of conservatism (e.g., EPA's tendency to state that all its potency estimates "could be as low as zero" even in cases when there is little or no support for a threshold model or when the estimates are based on human data). In essence, the thrust of this step of our proposal is to further distinguish between the concepts of prudence and misestimation discussed above, and to discourage the latter practice so that critics of conservatism will have to come to grips with (or abandon) their opposition to the former.
Step 4 Finally, (a point to which the entire committee agreed) EPA should clarify the standard by which it decides to replace an existing default assumption with an alternative (either as a general rule or for a specific substance or class of substances). Currently, EPA only uses language implying that each default shall remain in force "in the absence of evidence to the contrary," without any guidance as to what quality or quantity of evidence is sufficient to spur a departure or how to gauge these attributes (or, of course, any guidance as to whether any principle other than one of evidentiary quality should govern the choice among alternatives). Here, a specific test for structuring departures from defaults is proposed. Specifically, EPA should go on record as supporting departures from defaults whenever "there exists persuasive evidence, as reflected in a general consensus of knowledgeable scientists, that the alternative assumption (model) represents the conservative end of the spectrum of plausible assumptions (models)." This language was carefully chosen, based on substantial debate within the committee, to achieve several objectives:
11In special circumstances, a new scientific consensus may emerge that a model or assumption that is more conservative than the default is clearly plausible, either as a general rule or for specific chemicals or exposure scenarios. In such cases, the absolute amount of conservatism will increase. Although this asymmetry results in a de facto lower procedural threshold for adopting more conservative models than less conservative ones, the requirement implicit in the standard for a consensus about plausibility should limit the frequency with which the former type of departures will occur.
Pitfalls Of Our Proposal; Comparison With Alternatives
Some of the criticisms raised against conservatism in risk assessment have substantial merit, and are applicable to this proposal to include conservatism in the choice of default options. EPA can minimize some of these pitfalls by following other recommendations made in this appendix and elsewhere in the report. For example, the problem that conservatism can lead to incorrect risk comparisons and priority-setting decisions can be remedied in part by striving to make the "level of conservatism" explicit and roughly constant across assessments, and by generating additional estimates of central tendency (perhaps even derived via subjective weights applied to different basic biological theories) for use in ranking exercises only.13 Similarly, there is a legitimate concern that the policy of conservatism can stifle research if EPA is perceived as uninterested in any new information that might show the risk has been overstated; the emphasis here on scientific consensus does tend to slow the adoption of less conservative models at their early stages of development, but this should neither discourage thorough research nor discourage researchers from submitting quality data which EPA could readily incorporate into its existing model structure regardless of what effect it would have on the risk estimate.
12The only important caveat to this principle, which would apply to the transport model example as well as the PBPK example, is that with the addition of new model parameters (e.g., partition coefficients and rate constants in the PBPK case), the uncertainty and interindividual variability in those parameters must be estimated and incorporated into an explicit choice of a level of conservatism (see recommendation in Chapter 9).
13We note that risk ranking under uncertainty is a complicated and error-prone process, regardless of whether conservative, average, or other point estimates are used to summarize each risk. The medians or means of two risk distributions can be in one rank order while the upper bounds could well be in the opposite order; no single ranking alone is correct.
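The rank-reversal point made in footnote 13 can be shown with a small numerical sketch. The two risk distributions below are hypothetical lognormals, invented for illustration and not drawn from any EPA assessment; for a lognormal, any percentile equals the median multiplied by exp(z·sigma), where z is the corresponding standard normal quantile.

```python
import math

Z95 = 1.6449  # standard normal 95th-percentile quantile

def lognormal_percentile(median, sigma, z):
    """Percentile of a lognormal given its median and log-space sigma."""
    return median * math.exp(z * sigma)

# Risk A is well characterized (small sigma); risk B is smaller at the
# median but far more uncertain. Both are hypothetical.
median_a, sigma_a = 1e-5, 0.5
median_b, sigma_b = 3e-6, 2.0

ub_a = lognormal_percentile(median_a, sigma_a, Z95)
ub_b = lognormal_percentile(median_b, sigma_b, Z95)

# The medians rank A above B, while the 95th percentiles rank B above A,
# so no single point estimate yields "the" correct ranking.
print(f"medians:    A = {median_a:.1e}, B = {median_b:.1e}")
print(f"95th %iles: A = {ub_a:.1e}, B = {ub_b:.1e}")
```

Because the orderings disagree, a ranking exercise based on conservative upper bounds and one based on central tendencies can legitimately reach opposite conclusions about which risk deserves priority.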
The fundamental concern about conservatism is that it has led to systematic exaggeration of all environmental health problems and has encouraged wasting of scarce resources on trivial risks. The latter part of this charge is a subjective matter of economic and social policy that falls outside this committee's purview. And while the former concern is an empirical one, it has sparked a vigorous debate that is far from resolved. On one side, those convinced that EPA's procedures yield estimates far above the true values of risk can cite numerous examples where individual assumptions seem to each contribute more and more conservatism (Nichols and Zeckhauser, 1988; OMB, 1990; Hazardous Waste Cleanup Project, 1993). Others believe the evidence shows that current procedures embody a mix of conservative, neutral, and anti-conservative assumptions, and that the limited observational "reality checks" available suggest that existing exposure, potency, and risk estimates are in fact not markedly conservative (Allen et al., 1988; Bailar et al., 1988; Goodman and Wilson, 1991; Finley and Paustenbach, in press; Cullen, in press).
The practical and constructive question EPA must grapple with, however, is not whether "plausible conservatism" is ideal, but whether it is preferable to the alternative(s). The primary alternative to this proposal (Appendix N-2) directs EPA risk assessors to use defaults on the basis of the "best available scientific information," with the apparent goal of generating central-tendency estimates (CTEs) of risk. According to proponents of this approach, there is a clear boundary line between the "objective" activity of risk assessment and the value-laden activity of risk management, and the imposition of conservatism (if any) should occur in the latter phase, with managers adding "margins of safety" to the CTEs in order to make precautionary decisions. In comparing this proposal with the alternative, it is important to consider the two foundations of the latter approach, the CTE (or "most scientific estimate") and the margin of safety, and ask whether either concept is really as appealing as it may sound.
The margin of safety idea is problematic, for one obvious reason: it is only through exploring the conservative models and parameter values that analysts or managers can have any idea what they are trying to be "safe" from. Perhaps it would be ideal for the manager rather than the assessor always to tailor the level of conservatism, but in reality, only the assessor can initially determine for the manager what a "conservative decision" would entail, because the assessor has access to information on the spectrum of plausible values of risk. Applying any kind of generic safety factor to CTEs of risk would certainly result in a haphazard series of decisions, some (much) more conservative than a reasonable
degree of prudence would call for, others (much) less so. Besides, taken as a whole the committee's report returns some discretion and responsibility to the risk manager that assessors have admittedly usurped in the past by presenting point estimates alone. The committee's emphasis on quantitative uncertainty and variability analysis gives risk managers the ability to tailor decisions so that the degree of protection (and the confidence with which it can be ensured) are only as stringent as they desire. But in the narrow area of model uncertainty, this proposal deems it unwise to force risk managers to guess at what a protective decision would be by censoring information about models that, although conservative, are still deemed by experts to be plausibly true.
The CTE also has potentially fatal problems associated with it. Even if the models used to construct CTEs are based on "good science," we have argued (above) that these estimates are not designed to predict the expected value of potency, exposure, or risk (for which the conservative end of the spectrum must be explored and folded in), but instead are surrogates for other central-tendency estimators such as the median or mode (maximum likelihood). These latter estimators generally do not even give neutral weight to errors of underestimation and overestimation, and hence must be regarded as "anti-conservative." But advocates of CTEs have also failed to consider the problems of the models from which they come. The following are four examples, illustrating four archetypes of central-tendency estimation, which suggest that on a case-by-case basis, "good science" may not be all its proponents advertise it to be:
Case 1: "More science" merely means more data. Some of the alternative CTE estimates advocated by critics of conservatism are alleged to be more scientific because they make use of "all the data at hand." This distinction is hardly a cut-and-dried one, however. For example, consider the current EPA default of using the bioassay result from the most sensitive of the (usually no more than four) sex-species combinations of rodents tested. Call this potency estimate "A," and the alternative that could be derived by pooling all (four) data sets as "ABCD." Assuming that we know very little about the relative susceptibilities of different varieties of rodents versus the average human (in general or for the particular substance at issue), we must logically admit that it is possible the true risk to the average human may be greater than that implied by A, less than that implied by ABCD, or somewhere in between. One could prefer ABCD to A on the basis of a different value judgment about the costs of overestimation and underestimation, but the only "scientific" difference is that ABCD makes use of more data. But "purchasing" an array of data is akin to buying cards in a blackjack game: "more is better" only holds true as long as all the individual elements are valuable rather than otherwise. Assuming rodent varieties A through D differ significantly (or we wouldn't be quarreling over the two estimators), then humans must either be most like variety A or most like one of the other three. If the former, then data points B, C, and D dilute and ruin what is already in fact
the "best estimate"; if the latter, then more is indeed better (in the sense of moving us closer to the truth). Therefore, EPA's true dilemma is whether the additional data are more likely to hurt or to help, and this too is a policy judgment about balancing estimation errors, not a simple matter of "good science." However, as a matter of process and of implementation, there is a clear difference between a policy of choosing A and a policy of choosing ABCD. The former policy sets up incentives to actually advance the scientific foundation and get to the truth of which sex/species is the best predictor in specific cases or in general; when such information becomes available, "good science" will justifiably carry the day. On the other hand, the latter policy only encourages additional rote application of current bioassay designs to generate more data that assessors can pool.
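The A-versus-ABCD dilemma can be made concrete with a toy calculation (the potency figures below are invented for illustration): whether pooling "helps" depends entirely on which rodent variety humans happen to resemble.

```python
# Hypothetical potencies (risk per unit dose) from four rodent data sets.
potency = {"A": 8.0, "B": 2.0, "C": 1.0, "D": 0.5}

estimate_A = max(potency.values())                    # most sensitive species
estimate_ABCD = sum(potency.values()) / len(potency)  # pooled average

def error(estimate, truth):
    """Absolute estimation error against a hypothetical true human potency."""
    return abs(estimate - truth)

# If humans happen to resemble species A, pooling dilutes the best estimate;
# if they resemble species C, pooling moves us closer to the truth.
for label, truth in (("humans like A", potency["A"]),
                     ("humans like C", potency["C"])):
    print(label,
          "error(A) =", error(estimate_A, truth),
          "error(ABCD) =", error(estimate_ABCD, truth))
```

Neither estimator dominates the other across the two scenarios, which is the sense in which the choice between them is a value judgment about balancing estimation errors rather than a simple matter of "good science."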
A related example in the exposure assessment arena is discussed in Chapter 10. A CTE of approximately 7 years of exposure to a typical stationary source of toxic air pollutants is indeed based on much more data (in this case, data on the variation in the number of years a person stays at one residence before moving) than is the standard 70-year assumption EPA has used. But as noted in Chapter 10, those data, although valid at face value, may speak to a different question than the one EPA must address. To ensure that individual lifetime risk is correctly calculated in a nation containing thousands of such sources, EPA would need to consider data not only on years at one residence, but also on the likelihood (which we consider substantial) that when someone moved away from proximity to a source, he or she would move to an area where there is still exposure to the same or similar carcinogens. In both examples, "a great deal more data" (on interspecies susceptibility or on autocorrelation of exposure rates as people move, respectively) would certainly be preferable to EPA's status quo assumption, but questions arise as to whether "a little more data" help or hurt the realism of the calculations.
Case 2: "More science" means constructing chimeras out of incompatible theories. One brand of CTE that has gained some currency in recent years allegedly provides a means of incorporating all of the plausible scientific models, much as meta-analysis incorporates all of the available epidemiologic studies or bioassays on a particular compound. Unfortunately, there may be a world of difference between pooling related data sets and averaging incompatible theories. In Chapter 9, we discuss the obvious pitfalls of such hybrid CTEs, which arguably confuse rather than enrich the information base from which the risk manager can choose a course of action. For example, when faced with two conflicting theories about the potency of TCDD, EPA arguably should not have tried to change its potency estimate to "split the difference" between the two theories and make it appear that new science had motivated this change (Finkel, 1988). Rather, EPA could have achieved the same risk management objective by loosening regulatory standards on TCDD if it felt it could justify this on the grounds that there was a significant probability that the existing risk estimate
was excessively conservative. The committee could not agree on what sort of advice to give decisionmakers when some risk is either zero (or nearly zero) or is at some unacceptably high level X, depending on which of two fundamentally incompatible biologic theories is in fact the correct one. The committee did agree, however, that analysis should certainly not report only a point estimate of risk equal to (1-p)X, where p is the subjective probability assigned to the chance that the risk is (near) zero. In the specific context of default options, the recommendation of this proposal remains that EPA should retain its "plausibly conservative" default until scientific consensus emerges that the alternative model supplants the default at the conservative end of the plausible set of model choices.
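The committee's objection to reporting only (1-p)X can be seen in a two-line sketch (p and X below are arbitrary illustrative values): the probability-weighted point estimate is a risk level that is true under neither theory.

```python
# A risk that is (near) zero under theory 1 and X under theory 2.
p_zero = 0.7  # subjective probability that theory 1 (risk ~ 0) is correct
X = 1e-3      # risk level if theory 2 is correct

expected = (1 - p_zero) * X  # the probability-weighted point estimate

# Reporting `expected` alone hides the bimodal structure of the
# uncertainty: the risk is either ~0 or X, never 3e-4.
print(f"possible risks under the two theories: 0 or {X:.0e}")
print(f"point estimate (1-p)X = {expected:.0e}, a value assigned by neither theory")
```

Reporting the two candidate values with their subjective probabilities preserves the information a risk manager needs; collapsing them into one "split the difference" number does not.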
Case 3: "More science" means introducing more data-intensive models without considering uncertainty or variability in the parameters that drive them. This particular problem should be easy to rectify by incorporating the committee's recommendations in Chapters 9 and 10, but it is mentioned here because to date EPA has considered several departures from defaults (e.g., the case of methylene chloride, at least as interpreted by Portier and Kaplan, 1989) in which the level of conservatism may have changed abruptly because the parameters of the default model were assessed conservatively, but the parameters in the new model were either CTEs or point estimates of unknown conservatism. All of the burden should not fall upon purveyors of new models, however; EPA needs to level the playing field itself by systematically exploring the conservatism inherent in the parameters of its default models (for example, as we discuss in Chapter 11, is the surface area or 3/4 power correction a conservative estimate of interspecies scaling, or something else?).
Case 4: "More science" is clearly an improvement but not airtight. It is noteworthy that the most detailed case-specific reassessment of a default assumption, the CIGA case discussed in Chapter 6, has recently been called into question on the grounds that the new science casts serious doubt upon EPA's default as applied to existing animal data, but does not itself provide unimpeachable support for an alternative risk estimate (Melnick, 1992). We do not presume to reach any conclusion about this dispute, or about its implications for the general process of departing from defaults. As a matter of process, the CIGA case would probably meet the "persuasive evidence" test recommended here, and therefore one should not necessarily characterize EPA's acceptance of this new science as a mistake in policy. However, for purposes of risk communication, EPA should understand and emphasize that scientific consensus in issues such as these does not necessarily imply scientific truth.
In summary, EPA's choice between competing principles for choosing and departing from defaults has important and provocative implications for four areas of environmental science and EPA programs.
The alternative view which follows (Appendix N-2) was written after this Appendix was completed. Together, these two statements reflect reasoned disagreement which I hope will provide EPA with "grist for the mill" to help it resolve important questions about risk assessment principles and model uncertainty. However, there are a number of inconsistencies and misinterpretations in Appendix N-2 that I believe cloud this debate. Some of the ambiguity stems from the lack of responsiveness to important issues raised in this Appendix. For example, Appendix N-2 asserts that "risk managers should not be restricted by value judgments made during risk assessment," but nowhere does it explain how this vision could be realized, in light of the assertions herein that a vague call for "full use of scientific information" must either impose a set of value judgments of its own or else restrict risk assessors to presenting every conceivable interpretation of every model, data set, and observation.14 Similarly, the statement that "risk characterizations must be as accurate as possible," and the implicit equating of accuracy with the amount of data amassed, responds neither to the assertion that accuracy may not be the most appropriate response to uncertainty nor to the four examples in Appendix N-1 showing that "more science" may lead to less accuracy as well as substitute risk-neutral or risk-prone value judgments for risk-averse ones.
There are legitimate reasons for concern about a principle of "plausible conservatism," concerns that, if anything, might have been strengthened by more specificity in Appendix N-2 about the putative merits of an alternative. But in at least three respects, the material in N-2 misinterprets the stated intent of the "plausible conservatism" proposal, thus making a fair comparison impossible.
14In fact, the Appendix contradicts itself a few pages later when it states that "weighing the plausibility of alternatives is a highly judgmental evaluation that must be carried out by scientists." This is a clear call for scientists to play a role in science policy, which Appendix N-1 clearly endorses, but then the authors of N-2 return to the "hands off" view and re-contradict themselves with the admonition that "scientists should not attempt to resolve risk management disputes by influencing the choice of default options."
15 A substantial amount of uncertainty may be contributed by the parameters that drive risk models, even before interindividual variability is taken into account. For example, even if one specifies that the linearized multistage model must be used, the uncertainty in cancer potency due only to random sampling error in the typical bioassay can span five orders of magnitude at a 90 percent confidence level (Guess et al., 1977).
In contrast to some of the issues raised above, where there really is less disagreement than Appendix N-2 indicates, here there is more controversy than Appendix N-2 admits to. Our lack of consensus on this most fundamental issue (how to choose and how to modify default options) is what caused the committee to decide not to recommend any principles for meeting these challenges.
Allen, B.C., K.S. Crump, and A.M. Shipp. 1988. Correlation between carcinogenic potency of chemicals in animals and humans. Risk Anal. 8:531-544.
Bailar, J.C., III, E.A. Crouch, R. Shaikh, and D. Spiegelman. 1988. One-hit models of carcinogenesis: Conservative or not? Risk Anal. 8:485-497.
Bell, D. 1982. Regret in decision-making under uncertainty. Operations Res. 30:961-981.
Cullen, A. In press. Measures of compounding conservatism in probabilistic risk assessment. Risk Anal.
EPA (U.S. Environmental Protection Agency). 1986. Guidelines for carcinogen risk assessment. Fed. Regist. 51:33992-34003.
Finkel, A. 1988. Dioxin: Are we safer now than before? Risk Anal. 8:161-165.
Finkel, A. 1989. Is risk assessment really too "conservative?": Revising the revisionists. Columbia J. Environ. Law 14:427-467.
Finley, B., and D. Paustenbach. In press. The benefits of probabilistic techniques in health risk assessment: Three case studies involving contaminated air, water, and soil. Paper presented at the National Academy of Sciences Symposium on Improving Exposure Assessment, Feb. 14-16, 1992, Washington, D.C. Risk Anal.
Goodman, G., and R. Wilson. 1991. Quantitative prediction of human cancer risk from rodent carcinogenic potencies: A closer look at the epidemiological evidence for some chemicals not definitively carcinogenic in humans. Regul. Toxicol. Pharmacol. 14:118-146.
Graham, J.D., B.-H. Chang, and J.S. Evans. 1992. Poorer is riskier. Risk Anal. 12:333-337.
Hazardous Waste Cleanup Project. 1993. Exaggerating Risk: How EPA's Risk Assessments Distort the Facts at Superfund Sites Throughout the United States. Hazardous Waste Cleanup Project, Washington, D.C.
MacRae, J.B., Jr. 1992. Statement of James B. MacRae, Jr., Acting Administrator, Office of Information and Regulatory Affairs, U.S. Office of Management and Budget. Hearing before the Committee on Government Affairs, U.S. Senate, March 19, Washington, D.C.
Melnick, R.L. 1992. An alternative hypothesis on the role of chemically induced protein droplet (α2u-globulin) nephropathy in renal carcinogenesis. Regul. Toxicol. Pharmacol. 16:111-125.
Nichols, A., and R. Zeckhauser. 1988. The perils of prudence: How conservative risk assessments distort regulation. Regul. Toxicol. Pharmacol. 8:61-75.
NRC (National Research Council). 1991. Environmental Epidemiology. Public Health and Hazardous Wastes, Vol. 1. Washington, D.C.: National Academy Press.
OMB (U.S. Office of Management and Budget). 1990. Regulatory Program of the U.S. Government, April 1, 1990-March 31, 1991. Washington, D.C.: U.S. Government Printing Office.
Portier, C.J., and N.L. Kaplan. 1989. The variability of safe dose estimates when using complicated models of the carcinogenic process. A case study: Methylene chloride. Fundam. Appl. Toxicol. 13:533-544.
Tollefson, L., R.J. Lorentzen, R.N. Brown, and J.A. Springer. 1990. Comparison of the cancer risk of methylene chloride predicted from animal bioassay data with the epidemiologic evidence. Risk Anal. 10:429-435.