Appendix
I
This appendix is split into three parts. The first discusses aggregate risk of occurrence of one or more nonthreshold, quantal, toxic end points caused by exposure to multiple agents (assuming independent actions). The second is a summary assessment of independence in inter-animal tumor-type occurrence in the NTP rodent-bioassay database. The third discusses methods for aggregating uncertainty and interindividual variability in predicted risk.
Appendix
I-1
Aggregate Risk of Nonthreshold, Quantal, Toxic End Points Caused by
Exposure to Multiple Agents (Assuming Independent Actions)
The aggregate increased probability P of occurrence of any of n (presumed) nonthreshold end points caused by exposure to an environmental mixture of m toxic agents may be conveniently expressed under a few general assumptions. First, assume that the m agents are present in an environmental mixture at corresponding concentrations Ci, where i = 1,2,…,m, each of which produces, in exposed people, corresponding lifetime, time-weighted average biologically effective dose rates Dij, each causing one or more of n quantal (all-or-none) toxic end points Tj, where j = 1,2,…,n (see Figure I-1). Let Oij denote the occurrence of a particular jth end point Tj induced by effective dose rate Dij, and assume that Tj has a background occurrence probability of pj = Prob(Oij | D=0) for total effective dose D due to all relevant agents and that Oij may arise only by events independent of those giving rise to either the background incidence rate of Tj or to events Ogh for any g and h such that g≠i, 1≤g≤m, h≠j, and 1≤h≤n. Finally, for very small values of Dij, assume that the corresponding increased probability of occurrence of Tj is defined by an independent "one-hit" (nonthreshold, low-dose linear) function of Dij. In the following, ∪, ∩, and the overbar denote the logical union, intersection, and negation operations, respectively.
It follows from the stated assumptions and definitions that a Dij-induced increased probability Pij of Tj occurrence, conditional on its independent background rate pj, is:
$$P_{ij} = 1 - \exp(-q_{ij} D_{ij}) \approx q_{ij} D_{ij} \qquad (1)$$
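in which qij (the linear coefficient in dose) is the parameter characterizing the "potency" (or low-dose increased occurrence probability per unit dose) of compound i for inducing end point Tj. Under the stated assumptions, Pij = qij = 0 for any jth end point Tj that is unaffected by Dij alone, regardless of concurrent doses from any other agents. The quantity of interest (the aggregate increased probability P of occurrence of any of the n end points caused by any of the m toxic agents) may therefore be expressed as

$$P = \operatorname{Prob}\Big(\bigcup_{i=1}^{m} \bigcup_{j=1}^{n} O_{ij}\Big) \qquad (2)$$

which, by de Morgan's rule, may be rewritten as

$$P = 1 - \operatorname{Prob}\Big(\bigcap_{i=1}^{m} \bigcap_{j=1}^{n} \overline{O_{ij}}\Big) \qquad (3)$$

from which, by Equation 1 and the independence assumption, it follows that

$$P = 1 - \prod_{i=1}^{m} \prod_{j=1}^{n} (1 - P_{ij}) \qquad (4)$$

$$\phantom{P} = 1 - \exp\Big(-\sum_{i=1}^{m} \sum_{j=1}^{n} q_{ij} D_{ij}\Big) \qquad (5)$$

For very small values of P (≪ 1) relevant to environmental regulatory concern, P is well approximated by

$$P \approx \sum_{i=1}^{m} \sum_{j=1}^{n} q_{ij} D_{ij} \qquad (6)$$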
If no information is available concerning target-tissue-specific pharmacokinetics, Dij is sometimes taken to be either the absorbed dose rate (e.g., milligrams of agent i absorbed per kilogram of body weight per day) or a whole-body surrogate for effective dose (e.g., estimated milligrams of agent i metabolized per kilogram of body weight per day), that is, a measure of dose identical for all of the particular toxic end point(s) considered. In this case, Dij = Di is independent of j for any given ith agent, such that Equation 6 may be rewritten as

$$P \approx \sum_{i=1}^{m} D_i Q_i \qquad (7)$$

in which Qi is the sum of qij for j ranging from 1 to n and represents the aggregate potency of agent i for inducing at least one of the n end points considered.
When applying relations like those represented by Equations 5-7, q, Q, D, and hence P may represent quantities subject to uncertainty or interindividual variability characterized by different probability distributions. If distributed variates are involved, a meaningful confidence bound on P cannot generally be obtained by performing the indicated summations with the same bound on all values of q, Q, and D. In the special case that, say, Qi and Di in Equation 7 are all independent and m is sufficiently large, the estimate of P will tend to be normally distributed; however, asymptotic normality is not likely to be useful in situations involving relatively small m and n. If a statistical upper confidence bound is desired for P, Monte Carlo procedures will therefore generally be needed.
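The point about confidence bounds can be illustrated with a minimal Monte Carlo sketch. Everything numerical here is an assumption made for illustration only: three hypothetical agents, lognormal distributions for each Qi and Di with the medians shown, and a common geometric standard deviation of 3. The sketch compares the simulated 95th percentile of P = Σ Di Qi with the "naive" bound obtained by inserting the 95th percentile of every input into the sum.

```python
import math
import random

def simulate_aggregate_risk(median_q, median_d, gsd, trials=20000, seed=1):
    """Return sorted Monte Carlo samples of P = sum_i Q_i * D_i.

    Q_i and D_i are drawn independently from lognormal distributions
    (an illustrative assumption, not a distribution given in the text).
    """
    rng = random.Random(seed)
    sigma = math.log(gsd)
    samples = []
    for _ in range(trials):
        total = 0.0
        for q_med, d_med in zip(median_q, median_d):
            q = rng.lognormvariate(math.log(q_med), sigma)
            d = rng.lognormvariate(math.log(d_med), sigma)
            total += q * d
        samples.append(total)
    samples.sort()
    return samples

def quantile(sorted_samples, prob):
    """Empirical quantile of an ascending list (simple rank rule)."""
    index = min(len(sorted_samples) - 1, int(prob * len(sorted_samples)))
    return sorted_samples[index]

# Hypothetical medians: potency per unit dose and dose rate for three agents.
MEDIAN_Q = [2e-3, 5e-4, 1e-3]
MEDIAN_D = [1e-3, 4e-3, 5e-4]
GSD = 3.0

samples = simulate_aggregate_risk(MEDIAN_Q, MEDIAN_D, GSD)
mc_upper_95 = quantile(samples, 0.95)

# Naive bound: put the 95th percentile of every input into the sum.
per_input_95 = GSD ** 1.645
naive_upper = sum(q * per_input_95 * d * per_input_95
                  for q, d in zip(MEDIAN_Q, MEDIAN_D))
median_sum = sum(q * d for q, d in zip(MEDIAN_Q, MEDIAN_D))

print("sum of medians      :", median_sum)
print("Monte Carlo 95th pct:", mc_upper_95)
print("naive 'all-95th' sum:", naive_upper)
```

Under these assumptions the naive sum lies well above the simulated 95th percentile, which is the sense in which applying the same bound to every input does not yield a meaningful confidence bound on P.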
Appendix
I-2
Independence in Inter-Animal Tumor-Type Occurrence in the NTP
Rodent-Bioassay Database
Animal cancer bioassay data have been used as the basis for estimating carcinogenic potency (i.e., increased risk per unit dose at very low doses) of a chemical to which a human of average cancer susceptibility might be exposed over a lifetime (Anderson et al., 1983; EPA, 1986, 1992). The bioassay data available may indicate that multiple tumor types are induced in exposed bioassay animals. In this case, it is generally desired to estimate the aggregate cancer potency exhibited by the compound in the bioassay animals, that is, the effectiveness in the experimental animals of the compound in eliciting any one or more of the elevated tumor types. The estimated aggregate cancer potency in bioassay animals may then be used to extrapolate a corresponding potency of that compound in a human of average susceptibility (EPA, 1986, 1992). Neither this interspecies extrapolation nor the issue of human interindividual variability in cancer susceptibility (discussed in Chapter 10) is the subject of this appendix (I-2). Rather, this appendix focuses on the extent of tumor-type correlations in bioassay animals, which in turn bears on the question of how properly to estimate the aggregate cancer potency exhibited in bioassay animals by a compound that induces multiple tumor types.
One approach to estimating aggregate cancer potency in bioassay animals has been to apply a dose-response model to tumor-incidence rates with the numerators defined as the number of animals with one or more of the histologically distinct and significantly elevated tumor types (EPA, 1986). By this procedure, either a control or a dosed animal with multiple tumor types counts the same as an animal with only a single tumor type. If the tumor types occur in a statistically independent fashion among the bioassay animals tested, it follows that this procedure may under- or overestimate true aggregate potency because it has the effect of randomly excluding tumor-response information concerning both control and dosed animals (Bogen, 1990).
If potency is estimated using a multistage model (which is in effect a one-hit model at very low doses), and if tumor types assort independently among the animals tested, the statistical problem raised by EPA's tumor-pooling approach is avoided completely if aggregate potency is instead estimated as the sum of tumor-type-specific (that is, independent-end-point-specific) potencies (see Appendix I-1). This alternative to EPA's procedure, however, depends on the validity of the independence assumption regarding tumor-type occurrence within bioassay animals, which is the subject of this appendix (I-2).
In some of the few studies that have focused on tumor-type associations within individual animals, a few significant associations have been noted, mostly negative associations involving one or two specific tumor types among associated pairs. Significant (p < 0.05) age- and treatment-adjusted associations for five of 21 sex-specific pairs from six tumor types investigated were reported by Breslow et al. (1974) for experiments involving over 4000 CF-1 mice exposed to DDT, urethane, or nothing: negative associations between lymphomas and each of hepatomas (males), lung adenomas (males and females), and mammary and ovarian tumors (females), and a positive association between lymphomas and bone tumors (males). (Upon adjustment for multiple significance tests (Wright, 1992), the association between lymphomas and mammary tumors observed in that study may not be significant at the 0.05 level.) Breslow et al. (1974) suggested that the negative lymphoma-related associations, except perhaps those involving liver tumors, were all likely to be spurious, "due to the relative rapidity with which lymphomas tend to kill their bearers." A significant negative association between lymphomas and liver tumors (but not lung tumors) in 1478 similarly exposed CF-1 mice was later confirmed, even after accounting for the relatively rapid lymphoma lethality by use of serial-sacrifice information (Wahrendorf, 1983). A significant negative correlation between malignant lymphoma and proliferative hepatocellular lesions at death/sacrifice was also found among 1858 male ICI mice (Young and Gries, 1984). Haseman (1983) also noted this significant negative correlation in raw tumor-incidence data for F344 rats from 25 National Toxicology Program (NTP) bioassays (not analyzed at the level of individual animals).
The most comprehensive study of this type, involving an examination of age- and treatment-adjusted associations between (66 possible) pairs of 12 tumor types at death/sacrifice in 3813 gamma-irradiated female BALB/c mice, reported 21 significant (p < 0.05) positive or negative associations, 10 of which were negative and involved reticular tumors considered to be rapidly lethal and generally also involved other tumors considered to be lethal in the animals studied; most of these 10 associations were considered to be spurious due to the effect of lethality (Storer, 1982). The remaining associations considered significant generally were positive and involved endocrine-related tumors (Harderian, mammary, adrenal, and pituitary tumors), and none of these involved liver tumors. Aside from associations involving reticular sarcomas, and after appropriate statistical adjustment for multiple tests of significance (Wright, 1992), only three of 55 remaining possible associations reported by Storer (1982) appear to be significant at a 0.05 level, all involving Harderian-gland tumors, which, along with ovary, adrenal, and pituitary tumors, were all considered to be nonlethal in the animals studied. A recent study of liver-tumor and reticulum-cell-sarcoma incidence in 1004 gamma-irradiated female C3H mice supported a significant negative correlation of these tumor types, even after adjustment for the relative lethality of the reticular tumors using cause-of-death information available in that study (Mitchell and Turnbull, 1990).
In other smaller studies, an assumption of independence in tumor types at death/sacrifice was shown to be consistent with ED01 data on four different tumor types in 366 control female BALB/c mice and six tumor types elevated in 193 such mice exposed to 2-acetylaminofluorene (Finkelstein and Schoenfeld, 1989), as well as with Hazelton Laboratory data on three different tumor types elevated in a total of 142 male albino rats exposed to dibromochloropropane (Bogen, 1990).
No comprehensive study of animal-specific tumor-type occurrences at death/sacrifice has been conducted using the extensive set of available NTP rodent-bioassay data, on which most cancer-potency assessment for environmental chemicals is currently based. This report presents the results of such an analysis (Bogen and Seilkop, 1993) conducted on behalf of the National Research Council's Committee on Risk Assessment for Hazardous Air Pollutants.
Data Description
Tumor-type associations among individual animals were examined for both control and treated animals using pathology data from 62 B6C3F1 mouse studies and 61 F344/N rat studies obtained from a readily available subset of the NTP carcinogenesis bioassay database. Most studies were 2-year studies, although a few were shorter (e.g., 15 months). Separate analyses were conducted for the four sex/species combinations (male and female mice, male and female rats) corresponding to the compounds and species indicated in Table I-1. Analysis was confined to the following common tumor types (occurring at a rate >5%):

Analyses of correlations between tumor occurrence in treated animals were based on subsets of the control-animal data, comprising studies for which the NTP declared "clear evidence" of an effect at multiple sites and for which pairs of such effects were exhibited in more than one study (resulting in the use of five rat studies and four mouse studies). The treated-animal studies involved tumor types that differed from the control-animal studies, namely, adenomas or carcinomas of: liver, Zymbal's gland, clitoral/preputial gland, and skin in rats; and liver, adrenal, and Harderian gland in mice. In both control- and treated-animal analyses, evidence of associations from individual studies was pooled as described below.
Statistical Methods
Associations among statistically significantly elevated tumor types within individual animals may pertain either to tumor onset probabilities or to prevalence at death/sacrifice or to both. It is well known that associations present at death/sacrifice may differ, sometimes substantially, from those relating to tumor onset, and that the former may be heavily influenced by the latter as a result of the time-dependent action of competing risks (Hoel and Walburg, 1972; Breslow et al., 1974; Wahrendorf, 1983; Lagakos and Ryan, 1985). For example, if the onset probabilities of two different tumor types are statistically independent, but in addition both are rapidly lethal, then there is little probability of their joint occurrence within an individual animal and thus their prevalence at death/sacrifice will be negatively correlated. This fact was the basis for concluding probable "spurious" negative correlations involving rapidly lethal tumor types in previous assessments of tumor-type associations in rodents (Breslow et al., 1974; Storer, 1982).
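The competing-risks effect described above can be reproduced in a small simulation. This is a sketch under stated assumptions, none of which come from the NTP data: two tumor types with independent exponential onset times (mean 1000 days), a 730-day terminal sacrifice, and a fixed latency between onset and death that is either short (rapidly lethal) or infinite (purely incidental tumors).

```python
import random

def phi_coefficient(pairs):
    """Pearson correlation between two 0/1 indicators given as (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    return cov / (mean_x * (1 - mean_x) * mean_y * (1 - mean_y)) ** 0.5

def prevalence_correlation(latency_days, animals=20000, seed=7):
    """Correlation of tumor presence at death/sacrifice for two tumor types
    whose onset times are statistically independent (hypothetical parameters)."""
    rng = random.Random(seed)
    sacrifice_day = 730.0
    mean_onset_day = 1000.0
    pairs = []
    for _ in range(animals):
        onset_a = rng.expovariate(1.0 / mean_onset_day)
        onset_b = rng.expovariate(1.0 / mean_onset_day)
        # The animal dies a fixed latency after its first tumor, or at sacrifice.
        death_day = min(onset_a + latency_days,
                        onset_b + latency_days,
                        sacrifice_day)
        pairs.append((int(onset_a <= death_day), int(onset_b <= death_day)))
    return phi_coefficient(pairs)

lethal_corr = prevalence_correlation(latency_days=30.0)
incidental_corr = prevalence_correlation(latency_days=float("inf"))

print("rapidly lethal tumors :", round(lethal_corr, 3))
print("incidental tumors only:", round(incidental_corr, 3))
```

With rapidly lethal tumors the prevalence correlation at death is strongly negative even though onsets are independent; with incidental tumors it stays near zero, which is why prevalence correlations are informative about onset only in the incidental case.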
Unambiguous detection of associations in onsets of different tumor types requires either serial-sacrifice information or animal- and tumor-specific lethality information (Hoel and Walburg, 1972; Wahrendorf, 1983; Lagakos and Ryan, 1985; Mitchell and Turnbull, 1990), neither of which is available for the NTP data analyzed here. Thus, the present analysis was primarily restricted to an assessment of age-adjusted correlations in tumor types present at death/sacrifice. This approach provides definitive information on onset (as well as terminal prevalence) correlations only if all tumor types are incidental to fatality. However, as described below, a crude assessment of onset-probability correlations was also conducted using information on tumor lethality obtained from the data studied.
Evaluation of the correlations between occurrences of pairs of tumor types in individual animals observed at death/sacrifice was based on age-adjustment of information from 24 previous similar studies (Breslow et al., 1974; Storer, 1982; Young and Gries, 1984; Finkelstein and Schoenfeld, 1989). Five survival-age strata within each study were used: (1) first 365 days, (2) 366-546 days (1.5 years), (3) 547-644 days (~1.75 years), (4) 644 days to terminal sacrifice (~2 years), (5) terminal sacrifice. Further stratification addressed the inclusion of the highest two dose groups. Thus, the potential number of analytical strata (i.e., 24 times 2 (the number of dose levels). The method of Mantel and Haenszel (1959) was used to combine results from stratum-specific contingency tables and to assess two-tailed significance of overall associations between tumor occurrences. Overall correlations are represented as the weighted averages of corresponding stratum-specific measures, using the numbers of animals in the strata as weights. Adjusted p-values accounting for multiple tests of a zero-correlation null hypothesis were obtained for all control and all treated rats and mice using Hommel's modified Bonferroni procedure (Wright, 1992).
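The stratified combination step can be sketched as follows. This is an illustrative implementation, not the code used for the analysis: it applies the Mantel-Haenszel chi-square statistic without a continuity correction (an assumption), uses the phi coefficient as the stratum-specific correlation measure (also an assumption), and runs on hypothetical counts. The multiple-testing adjustment is not shown.

```python
import math

def mantel_haenszel(tables):
    """Combine stratum-specific 2x2 tables.

    Each table is (a, b, c, d): a = animals with both tumor types,
    b = first type only, c = second type only, d = neither.
    Returns (chi-square with 1 df, two-tailed p-value,
             animal-weighted average phi correlation).
    """
    observed = expected = variance = 0.0
    weighted_phi = total_animals = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        row1, row2, col1, col2 = a + b, c + d, a + c, b + d
        if n < 2 or 0 in (row1, row2, col1, col2):
            continue  # stratum carries no information about association
        observed += a
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
        phi = (a * d - b * c) / math.sqrt(row1 * row2 * col1 * col2)
        weighted_phi += n * phi
        total_animals += n
    chi_square = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square variate with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value, weighted_phi / total_animals

# Hypothetical counts for three survival-age strata of one tumor-type pair.
strata = [(2, 18, 20, 160), (5, 25, 30, 140), (1, 9, 15, 75)]
chi_square, p_value, overall_phi = mantel_haenszel(strata)
print(chi_square, p_value, overall_phi)
```

For these made-up counts the pooled association is slightly negative and far from significant; a single table with a strong association, such as (30, 10, 10, 50), produces a very small p-value.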
In the absence of serial-sacrifice or lethality information, associations between onsets of pairs of tumor types in individual NTP-bioassay animals were evaluated using two crude techniques. First, a separate correlation analysis was undertaken as above, but using only terminal-sacrifice data. This approach provides definitive information on onset (as well as terminal prevalence) correlations only if no animals die prior to terminal sacrifice, but may nevertheless provide meaningful information if a sufficiently large fraction of animals survive until sacrifice. The second approach used was the three-by-three contingency-table method for detection of disease-onset associations devised by Mitchell and Turnbull (1990), which requires lethality determinations for each tumor occurrence in each animal. When in doubt regarding such lethality, Mitchell and Turnbull (1990) recommend that it would be prudent to classify a particular occurrence as lethal, because while doing so falsely may reduce the power of the test, the null distribution will not be affected. Thus, the Mitchell-Turnbull test was applied under the assumption that all occurrences of a given tumor type were lethal for all plausibly lethal tumor types. Tumor-type lethality was investigated using Mann-Whitney U statistics comparing survival times of tumor-bearing and tumor-free animals, where all study-specific results for a given control or treated species and sex were combined to form an overall test by summing these U statistics and dividing this sum by the square root of the sum of the corresponding variances.
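The pooling of study-specific Mann-Whitney results can be sketched as below. Two details are assumptions on my part rather than statements from the text: each U statistic is centered on its null expectation before summing, and the variance formula ignores ties. The survival times are hypothetical.

```python
import math

def mann_whitney_u(bearing, free):
    """Count (tumor-bearing, tumor-free) pairs in which the tumor-bearing
    animal died earlier; tied survival times count one half."""
    u = 0.0
    for x in bearing:
        for y in free:
            if x < y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

def combined_z(studies):
    """Sum centered U statistics over studies and divide by the square
    root of the summed null variances (no tie correction)."""
    total_excess = 0.0
    total_variance = 0.0
    for bearing, free in studies:
        n1, n2 = len(bearing), len(free)
        total_excess += mann_whitney_u(bearing, free) - n1 * n2 / 2.0
        total_variance += n1 * n2 * (n1 + n2 + 1) / 12.0
    return total_excess / math.sqrt(total_variance)

# Hypothetical survival times (days) for two small studies:
# (tumor-bearing animals, tumor-free animals).
studies = [
    ([610, 655, 700, 730], [690, 720, 730, 730, 730, 730]),
    ([580, 640, 730], [700, 730, 730, 730, 730]),
]
z = combined_z(studies)
two_tailed_p = math.erfc(abs(z) / math.sqrt(2.0))
print(z, two_tailed_p)
```

A positive z here indicates that tumor-bearing animals tended to die earlier than tumor-free animals across the pooled studies.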
Results And Discussion
The results of our analysis of correlations in incidence at death/sacrifice of tumor types in control rats and mice are summarized in Table I-2. These results indicate four significant (p* < 0.05) but small correlations among 20 sex/tumor-type pairs investigated in rats (pituitary vs. leukemia in both sexes, and mammary vs. leukemia or pituitary in females, where all those involving leukemia were negative), and no similarly significant correlations among 12 sex/tumor-type pairs investigated in mice. Corresponding results for treated rats and mice are summarized in Table I-3. Significant (p* < 0.05) but again generally quite small correlations appear present for two of 12 sex/tumor-type pairs investigated in treated rats (Zymbal's vs. preputial gland and liver vs. skin tumors in males) and for one of four sex/tumor-type pairs investigated in treated mice (liver vs. Harderian gland in females), where the liver-related correlations were both positive.
Terminal-sacrifice animals represented 66 to 68% of all the control mice and 53 to 63% of all control rats referred to in Table I-2. Analysis of tumor-type-prevalence correlations in only these animals revealed only a single significant (p* < 0.05) correlation, that between mammary and pituitary tumors in female rats (r = 0.080, p* = 0.013). Thus, the latter positive (albeit quite small) correlation may pertain to onset as well as prevalence-at-death/sacrifice correlations, whereas the negative leukemia-related correlations noted above for all control rats did not persist in terminal-sacrifice animals. This finding could be explained by relative lethality associated with rodent leukemia/lymphoma, which has been noted in previous studies (Breslow et al., 1974; Wahrendorf, 1983; Young and Gries, 1984; Portier et al., 1986). Terminal-sacrifice animals represented only 14 to 16% of all the treated rats and 20 to 55% of all treated mice referred to in Table I-3. Correlation analyses for these treated animals yielded no significant (p* < 0.05) correlations, which sheds less light on tumor-onset associations given the greater nonrepresentativeness of these animals.
Our examination of differences in survival time in animals with particular tumors vs. tumor-free animals revealed a few significant differences in control and treated rats. Leukemia in both sexes of control F344 rats studied was associated with a significant reduction in mean survival time (p < 0.001). However, this reduction was rather modest: 75% of leukemia-bearing animals lived until the 23rd month of the studies and 50% lived until terminal sacrifice. In contrast, 75% of the leukemia-free animals survived until terminal sacrifice. Thus, any effect of leukemia lethality in inducing negative correlations with other cancers is likely to be small.
There was also evidence that Zymbal's gland tumors in treated rats resulted in reduced survival times (males, p < 0.001; females, p = 0.003), where the median survival times were reduced by about four months in males (546 vs. 427 days, a reduction far more striking than that for leukemia in control males) and by about one month in females. When leukemia and Zymbal's gland tumors in animals dying before terminal sacrifice were assumed to be lethal and all other tumor types incidental, the Mitchell-Turnbull test yielded similar results to those obtained using the unmodified age-stratified analysis. In particular, it provided strong evidence that the small, negative associations between leukemia and pituitary-gland tumors in control rats were not due to chance or to differential lethality (males, p < 10⁻⁹; females, p = 0.000057), and it indicated the same regarding the small, negative associations between Zymbal's-gland tumors and preputial/clitoral-gland tumors in treated rats (males, p = 0.009; females, p = 0.002).
In summary, no evidence was found for any large correlation in either the onset probability or the prevalence-at-death/sacrifice of any tumor-type pair investigated in control and treated rats and mice, although a few of the small correlations present were statistically significant. This finding must be qualified to the extent that tumor-type onset correlations were measured indirectly given the limited nature of the data analyzed. Taken together, these findings indicate that tumor-type occurrences in B6C3F1 mice and F344 rats used in the NTP bioassays analyzed were in most cases nearly independent, and that departures from independence, where they did occur, were small.
References
Anderson, E.L., and the Carcinogen Assessment Group of the U.S. Environmental Protection Agency. 1983. Quantitative approaches in use to assess cancer risk. Risk Anal. 3:277-295.
Bogen, K.T. 1990. Uncertainty in Environmental Health Risk Assessment. New York: Garland Publishing.
Bogen, K.T., and S. Seilkop. 1993. Investigation of Independence in Inter-animal Tumor-type Occurrence within the NTP Rodent-Bioassay Database. Report prepared for the National Research Council Committee on Risk Assessment for Hazardous Air Pollutants, Washington, D.C.
Breslow, N.E., N.E. Day, L. Tomatis, and V.S. Turusov. 1974. Associations between tumor types in a large-scale carcinogenesis study. J. Natl. Cancer Inst. 52:233-239.
EPA (U.S. Environmental Protection Agency). 1986. Guidelines for carcinogen risk assessment. Fed. Regist. 51(Sept. 24):33992-34003.
EPA (U.S. Environmental Protection Agency). 1987. Risk Assessment Guidelines of 1986. EPA/600/887045. U.S. Environmental Protection Agency, Washington, D.C.
EPA (U.S. Environmental Protection Agency). 1992. Guidelines for exposure assessment. Fed. Regist. 57(May 29):22888-22938.
Finkelstein, D.M., and D.A. Schoenfeld. 1989. Analysis of multiple tumor data for a rodent experiment. Biometrics 45:219-230.
Haseman, J.K. 1983. Patterns of tumor incidence in two-year cancer bioassay feeding studies in Fischer rats. Fundam. Appl. Toxicol. 3:1-9.
Hoel, D.G., and H.E. Walburg, Jr. 1972. Statistical analysis of survival experiments. J. Natl. Cancer Inst. 49:361-372.
Lagakos, S.W., and L.M. Ryan. 1985. Statistical analysis of disease onset and lifetime data from tumorigenicity experiments. Environ. Health Perspect. 63:211-216.
Mantel, N., and W. Haenszel. 1959. Statistical aspects of the analysis of data from retrospective studies of disease. J. Natl. Cancer Inst. 22:719-748.
Mitchell, T.J., and B.W. Turnbull. 1990. Detection of associations between diseases in animal carcinogenicity experiments. Biometrics 46:359-374.
Portier, C.J., J.C. Hedges, and D.G. Hoel. 1986. Age-specific models of mortality and tumor onset for historical control animals in the National Toxicology Program's carcinogenicity experiments. Cancer Res. 46:4372-4378.
Storer, J.B. 1982. Associations between tumor types in irradiated BALB/c female mice. Radiation Res. 92:396-404.
Wahrendorf, J. 1983. Simultaneous analysis of different tumor types in a long-term carcinogenicity study with scheduled sacrifices. J. Natl. Cancer Inst. 70:915-921.
Wright, S.P. 1992. Adjusted p-values for simultaneous inference. Biometrics 48:1005-1013.
Young, S.S., and C.L. Gries. 1984. Exploration of the negative correlation between proliferative hepatocellular lesions and lymphoma in rats and mice - establishment and implications. Fundam. Appl. Toxicol. 4:632-640.
Appendix
I-3
Aggregation of Uncertainty and Variability
This appendix illustrates why a distinction between uncertainty and interindividual variability within input variates must be maintained if a quantitative characterization of uncertainty in population risk or in individual risk is sought. Two types of mathematical model used to predict risk are considered here for an exposed population of size n. The first model is a simple one in which a predicted low level of exposure-related increased risk R is well approximated by the product of U (a purely uncertain variate) and V (a purely heterogeneous variate that models interindividual variability):

$$R = UV = \prod_{i=1}^{3} U_i V_i$$

where Ui and Vi represent uncertain and heterogeneous variates, respectively, for i = 1,2,3. That is, for a given value of i, Vi models the set of n particular (known or assumed) quantities pertaining to n individuals in the population at risk, whereas Ui models (in this case, using a single, uncertain multiplicative factor) the uncertainty associated with each one of those n quantities; this type of distinction is explained further by Bogen and Spear (1987) and Bogen (1990). In the present simple model, for example, U1 and V1 might refer to lifetime time-weighted average exposure, U2 and V2 to biologically effective dose per unit exposure, and U3 and V3 to cancer "potency" (increased cancer risk per unit biologically effective dose as dose approaches zero). In this case, V3 would model interindividual variability in susceptibility to dose-induced cancer.
A more complicated risk model assumes that risk R equals some more general function H(U,V) of the vectors U and V of purely uncertain and purely heterogeneous variates, respectively. In the following discussion, an overbar denotes the expectation operation with respect to all heterogeneous variates (V) associated with the overbarred quantity, and angle brackets, ⟨ ⟩, denote the expectation operation with respect to all uncertain variates (U) associated with the bracketed quantity (that is, $\bar{R} = E_V(R)$ and $\langle R \rangle = E_U(R)$, where E is the expectation operator). Also, $F_X(x)$ denotes the cumulative probability that X ≤ x, for some particular value x of any given variate X.
Population Risk
Population risk, N, is the number of additional cases associated with predicted risk R. By definition, N is an uncertain variate, not a heterogeneous one. Uncertainty in N, however, is often ignored under the assumption that it is necessarily small in relation to the expected value of N for large n. For example, in its recent radionuclide-NESHAPS uncertainty analysis, EPA (1989, p. 76) stated that
Because population risks represent the sum of individual risks, uncertainties in the individual risks tend to cancel each other out during the summing process. As a result, the uncertainty in estimates of population risk is smaller than the uncertainty in the estimates of the risks associated with the individual members of the population. Because of this, [our] uncertainty analysis is limited to the uncertainty in risks to an individual.
This assumption is clearly false, as is demonstrated by a comparison of the case (a) of n identical but extremely uncertain individual risks with the case (b) of n identical individual risks all equal to the known constant (i.e., completely certain value) r, for large n. Uncertainty in population risk in case (a) must remain extremely large independent of n, whereas in case (b) the cumulative probability distribution function (cdf) for the ratio N/n is simply a normalized binomial distribution that has smaller and smaller variances around the true value r as n → ∞. The key point is that in the relationship between n uncertain individual risks and the corresponding uncertain population risk, many of the uncertain characteristics of each of the individual risks are not independent, but rather reflect quantities such as potency-parameter estimation error or model-specification error that pertain identically or in much the same way to all individuals at risk, and thus do not in any sense "cancel out" upon summation.
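The contrast between cases (a) and (b) can be made concrete with the law of total variance, Var(N/n) = E[r(1 − r)]/n + Var(r), where r is the common individual risk. The following sketch uses illustrative assumptions only: in case (a) the shared risk is lognormal with median 10⁻⁴ and geometric standard deviation 5, and in case (b) it is known to equal 10⁻⁴.

```python
import math
import random

def sd_of_case_fraction(risk_samples, n):
    """Standard deviation of N/n when N given r is Binomial(n, r) and the
    shared individual risk r follows the empirical distribution of
    risk_samples (law of total variance)."""
    k = len(risk_samples)
    mean_r = sum(risk_samples) / k
    var_r = sum((r - mean_r) ** 2 for r in risk_samples) / k
    mean_binomial_var = sum(r * (1.0 - r) for r in risk_samples) / (k * n)
    return math.sqrt(var_r + mean_binomial_var)

rng = random.Random(3)
SIGMA = math.log(5.0)  # hypothetical geometric standard deviation of 5

# Case (a): one uncertain risk shared by all n individuals.
uncertain_risk = [min(1.0, rng.lognormvariate(math.log(1e-4), SIGMA))
                  for _ in range(20000)]
# Case (b): the same risk known exactly.
known_risk = [1e-4]

sd_shared = {}
sd_known = {}
for n in (10**2, 10**4, 10**6):
    sd_shared[n] = sd_of_case_fraction(uncertain_risk, n)
    sd_known[n] = sd_of_case_fraction(known_risk, n)
    print(n, sd_shared[n], sd_known[n])
```

In case (b) the spread of N/n shrinks in proportion to 1/√n, while in case (a) it levels off at the spread of the shared uncertain risk, however large n becomes.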
The uncertain magnitude of population risk N (i.e., the predicted number of cases) is well approximated for large n by the uncertain quantity $n\bar{R}$, where for the simple risk model $\bar{R} = U\bar{V}$ and for the more complicated risk model $\bar{R} \approx H(U, \bar{V})$ as a first-order approximation (Bogen and Spear, 1987). For large n and 0 ≤ j ≤ n, $F_N(j)$ is generally well approximated by the expected Poisson probability for the compound-Poisson variate with uncertain parameter $n\bar{R}$; for example, $F_N(0) = \int_0^1 e^{-nr}\, dF_{\bar{R}}(r)$ (Bogen and Spear, 1987). The expected value, $\langle N \rangle = n\langle \bar{R} \rangle$, of population risk has traditionally been used in defining risk-acceptability criteria addressing N; however, criteria intended to be conservative with respect to uncertainty associated with N ought logically to refer to some upper confidence bound on N, rather than to its expected value.
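A brief numerical sketch of the compound-Poisson approximation follows. The distribution of the population-average risk is an assumption chosen for illustration (lognormal, median 10⁻⁶, geometric standard deviation 5), as is the population size of one million; the probability of zero cases is estimated as the sample average of exp(−n r).

```python
import math
import random

def population_risk_summary(mean_risk_samples, n):
    """Summaries of uncertain population risk N under a compound-Poisson
    approximation with uncertain parameter n * (population-average risk).

    Returns (probability of zero cases, expected number of cases,
             95th-percentile bound on the Poisson parameter).
    """
    k = len(mean_risk_samples)
    prob_zero = sum(math.exp(-n * r) for r in mean_risk_samples) / k
    expected_cases = n * sum(mean_risk_samples) / k
    upper_95 = n * sorted(mean_risk_samples)[int(0.95 * k)]
    return prob_zero, expected_cases, upper_95

rng = random.Random(11)
N_PEOPLE = 10**6
# Hypothetical uncertainty distribution for the population-average risk.
avg_risk = [rng.lognormvariate(math.log(1e-6), math.log(5.0))
            for _ in range(20000)]

prob_zero, expected_cases, upper_95 = population_risk_summary(avg_risk, N_PEOPLE)
print("P(N = 0), compound Poisson  :", prob_zero)
print("P(N = 0), Poisson at <N>    :", math.exp(-expected_cases))
print("expected cases <N>          :", expected_cases)
print("95% upper bound on n * Rbar :", upper_95)
```

Because exp(−x) is convex, the compound-Poisson probability of zero cases is never smaller than the Poisson probability evaluated at the expected value, and the upper bound on the parameter can sit far above ⟨N⟩ when uncertainty is large.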
Individual Risk
Predicted risk R, as defined above, is a variate that clearly may reflect both uncertainty and interindividual variability. It is tempting to assume that predicted risk to a given individual (say, the person with the jth highest risk among n at risk, for some j with 1 ≤ j ≤ n, at some specified level of confidence with respect to uncertainty) might be calculated directly from predicted risk R without distinguishing between uncertain and heterogeneous variates. Indeed, "uncertainty analyses" are often conducted (e.g., see Appendices F and G) in which Monte Carlo techniques are used to approximate $F_R$ in a way that treats all variates in the same manner, without distinguishing those that are uncertain from those that represent interindividual variability. Except for the trivial case in which n = 1, $F_R(r)$ calculated in this manner can only be interpreted as the cdf pertaining to risk to an individual sampled at random from the entire population for which $F_R(r)$ was developed.
More typically, regulators might be interested in the (uncertain) risk $\bar{R}$ to an individual who is at average risk relative to others (which is directly related to population risk as described above); more conservatively, interest might lie in the cdf, $F_{R_{(j)}}(r)$, pertaining to (uncertain) risk $R_{(j)} = R_{(qn)}$ to a jth highest or qth quantile (i.e., 100qth percentile with respect to variability, not uncertainty) person at risk, where q = j/n and q might, for example, be some upper-bound value such as 0.99. In the most conservative risk assessment, interest is focused on uncertain risk $R_{(n)}$ to the person at greatest risk (q = 1). Clearly, $R_{(j)} = R$ only if all people incur identical (although perhaps uncertain) risks.
When both heterogeneous and uncertain variates are involved in the model used to predict R, the cdf for $R_{(j)}$ might be difficult to calculate. Some possible approaches are discussed below. If all heterogeneous variates are modeled with distributions truncated at the right-hand tail, $R_{(n)}$ may be approximated simply by using the maximal values of those variates. Thus, in the simple case, $R_{(n)} = U\,\mathrm{Max}(V)$, and in the complicated case, $R_{(n)} \approx H(U, \mathrm{Max}(V))$ as a first-order approximation. If truncated distributions are not used for all heterogeneous variates, careful and detailed analysis will be needed. Whether or not truncated distributions are used for all input variables, the approximations will be overconservative, perhaps highly so.
$R_{(j)}$ may be described as a compound order statistic, in the sense that the cdf for $R_{(j)}$ has two sources of uncertainty: uncertainty associated with the combined impact of all the uncertain variates used to model R, and the more conventional order-statistic uncertainty associated with sampling the jth highest individual value of R from among a total of n different (but also uncertain) values, where these differences arise from all the heterogeneous (as opposed to the uncertain) variates used to model R. For the simple risk model, assuming V and U are statistically independent, it follows that $R_{(j)} = U V_{(j)}$, where $V_{(j)}$ is itself an order statistic and hence is an uncertain quantity that has the following cdf (Kendall and Stuart, 1977):
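$$F_{V_{(j)}}(v) = \sum_{k=j}^{n} \binom{n}{k} [F_V(v)]^{k} [1 - F_V(v)]^{n-k} = I_{F_V(v)}(j,\, n - j + 1),$$
where $F_V(v)$ is the cdf modeling heterogeneity in V and where I is the incomplete Beta function. In the case of j = n, $F_{V_{(n)}}(v) = [F_V(v)]^{n}$.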
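The order-statistic cdf referred to above can be evaluated numerically without a special-function library, because the incomplete Beta function with integer arguments equals a binomial tail sum. The following is a minimal sketch under that standard identity; the function name `order_stat_cdf` and its argument `p` (standing for the value of the heterogeneity cdf of V at the point of interest) are illustrative choices, not notation from the text. It assumes the n values of V are independent draws from the same distribution and that j counts ranks in ascending order, so that j = n is the largest value.

```python
from math import comb


def order_stat_cdf(p, j, n):
    """CDF of the j-th smallest of n i.i.d. values, evaluated where F_V(v) = p.

    This is the probability that at least j of the n values fall at or
    below v, i.e. a binomial tail sum, which equals the regularized
    incomplete Beta function I_p(j, n - j + 1).
    """
    if not 1 <= j <= n:
        raise ValueError("need 1 <= j <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(j, n + 1))
```

For j = n the sum has a single term and reduces to p raised to the nth power, which is the special case stated in the text.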
The median value of $V_{(n)}$ is thus the $2^{-1/n}$th quantile (i.e., the $100(2^{-1/n})$th percentile) of V, which is approximately the $\{1 - [\ln(2)/n]\}$th quantile of V for n > 9. The "characteristic" value of $V_{(n)}$ is defined as the $[1 - (1/n)]$th quantile of V, which is: the value of V with an "exceedance probability" of 1/n, the value of V expected to be less than or equal to $V_{(n)}$, the 0.368th (i.e., the $e^{-1}$th) quantile of $V_{(n)}$, and (generally) also the modal or most likely value of $V_{(n)}$ (Ang and Tang, 1984).
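The quantile statements above follow from the fact that the cdf of the largest of n independent values is the nth power of the cdf of a single value. The short sketch below checks them numerically; the function names are hypothetical, and the only assumption is independence of the n values of V.

```python
from math import exp, log


def median_quantile_of_max(n):
    """Quantile q of V at which the maximum of n values has cdf 0.5 (q**n = 0.5)."""
    return 2.0 ** (-1.0 / n)


def approx_median_quantile_of_max(n):
    """First-order approximation 1 - ln(2)/n to the quantile above."""
    return 1.0 - log(2.0) / n


def cdf_of_max_at_characteristic_value(n):
    """Cdf of the maximum at the (1 - 1/n)th quantile of V: (1 - 1/n)**n."""
    return (1.0 - 1.0 / n) ** n


for n in (10, 100, 1000):
    # Columns: n, exact quantile, approximate quantile, cdf of max, e**-1
    print(
        n,
        median_quantile_of_max(n),
        approx_median_quantile_of_max(n),
        cdf_of_max_at_characteristic_value(n),
        exp(-1.0),
    )
```

The last two columns approach each other as n grows, which is why the characteristic value sits near the 0.368th quantile of the largest value.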
For the more complicated risk model, the ordered risk $R_{(j)}$ for some or all j may not exist in an unambiguous sense because the cdfs characterizing uncertainty in, e.g., risks $R_h$ and $R_k$ for some particular individuals h and k may intersect one another at one or more probability levels other than 0 and 1 (Bogen, 1990). Although it is always possible to estimate the jth highest "upper-bound" risk among n such R values (corresponding to n samples from V), all evaluated at some prespecified uncertainty quantile (Bogen, 1990), this approach is generally difficult or impractical to implement by Monte Carlo methods for complicated risk models involving both uncertainty and variability. In contrast, it is relatively simple to estimate the jth highest value of expected risk, $\langle R \rangle_{(j)}$; for example, for j = n this value is, as noted above, generally most likely to be $F^{-1}_{\langle R \rangle}(1 - n^{-1})$, where $\langle R \rangle \approx H(\langle U \rangle, V)$ may be used as a first-order approximation (see Bogen and Spear, 1987). The ratio $p_n = F^{-1}_{\langle R \rangle}(1 - n^{-1}) / \langle \bar{R} \rangle$ may thus serve to characterize the magnitude of interindividual variability (or "inequity") in expected individual risks for a population of size n.
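As an illustration of the inequity ratio just described, the sketch below evaluates it for one hypothetical case: the simple model in which expected risk is the expected value of U times V, with V assumed lognormal with log-scale standard deviation `sigma`. Neither the lognormal choice nor the function name comes from the text. Under these assumptions the expected value of U and the log-scale location of V cancel from the ratio, and the average expected risk is approximated by the mean of the lognormal distribution, which is reasonable only for large n.

```python
from math import exp
from statistics import NormalDist


def inequity_ratio_lognormal(n, sigma):
    """Ratio of the (1 - 1/n)th quantile of expected risk to its mean.

    Assumes expected risk is lognormally distributed across people with
    log-scale standard deviation sigma (an illustrative assumption).
    Quantile = exp(mu + sigma*z) and mean = exp(mu + sigma**2 / 2),
    so mu cancels and the ratio is exp(sigma*z - sigma**2 / 2).
    """
    if n < 2:
        raise ValueError("need n >= 2 so that 0 < 1 - 1/n < 1")
    if sigma < 0.0:
        raise ValueError("sigma must be non-negative")
    z = NormalDist().inv_cdf(1.0 - 1.0 / n)  # standard normal quantile
    return exp(sigma * z - 0.5 * sigma**2)
```

With `sigma = 0` everyone faces the same expected risk and the ratio is 1; for fixed positive `sigma` the ratio grows with population size, because a larger population is more likely to contain an extreme individual.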
Note that unidentifiable person-to-person variability (that is, variability in a quantity whose values are known and known to differ among individuals, but cannot each be assigned to specific individuals) is, for practical purposes, equivalent to pure uncertainty pertaining to those values insofar as the characterization of individual risk is concerned. However, the real distinction between unidentifiable person-to-person variability and true uncertainty is revealed by their different impacts on estimated population risk. In particular, if all other contributions to risk are equal, any positive amount of person-to-person variability in some determinant of risk such as susceptibility (regardless of its identifiability) will always result in a smaller variance (and thus greater certainty) in corresponding estimated population risk than that resulting from an identically distributed risk determinant whose distribution instead reflects pure uncertainty. For example, if two persons face certain but different risks equal to 0 and 1, respectively (regardless of whether it is known who faces which risk), then the expectation and variance of predicted cases are 1 and 0, respectively; here one case will arise with absolute certainty. However, if both persons face a single uncertain risk equal to 0 or 1 with probability 0.5 and 0.5, respectively, then the expected value of predicted cases is again 1, but its variance is in this case 1; here 0 or 2 cases will arise, each with probability 0.5.
In general, if n persons face n known risks $p_j$, j = 1, 2, …, n, having mean E(p) > 0 and variance Var(p) > 0, then it is well known that, regardless of who faces which particular risk, the expectation and variance of the number N of anticipated cases are $E(N) = nE(p)$ and $\mathrm{Var}(N) = n\{E(p)[1 - E(p)] - \mathrm{Var}(p)\}$, respectively. Now consider the analogous case in which interindividual variability is replaced by pure uncertainty. In this case, all persons face a common but uncertain risk p that is distributed identically to $p_j$ (i.e., $\mathrm{Prob}(p = p_j) = 1/n$, j = 1, 2, …, n) and hence has the same mean E(p) and variance Var(p) (where in this case these moments are with respect to uncertainty, not interindividual variability). For this case, it is straightforward to show that again $E(N) = nE(p)$, but that here $\mathrm{Var}(N) = n\{E(p)[1 - E(p)] + (n - 1)\mathrm{Var}(p)\}$, which exceeds the previous expression for the variance of N by the quantity $n^2 \mathrm{Var}(p)$.
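The two variance expressions above can be checked from first principles. The sketch below is one way to do so, assuming (as the formulas implicitly do) that cases arise independently across persons once each person's risk is fixed; the function names are illustrative. The variability case sums independent Bernoulli variances, and the uncertainty case applies the law of total variance to a binomial count with a shared, uncertain risk.

```python
def moments(risks):
    """Mean and (population) variance of a list of individual risks."""
    n = len(risks)
    e_p = sum(risks) / n
    var_p = sum((p - e_p) ** 2 for p in risks) / n
    return e_p, var_p


def cases_under_variability(risks):
    """E(N) and Var(N) when person j faces the known risk risks[j]."""
    mean_n = sum(risks)
    var_n = sum(p * (1.0 - p) for p in risks)  # independent Bernoulli trials
    return mean_n, var_n


def cases_under_uncertainty(risks):
    """E(N) and Var(N) when all persons share one uncertain risk p,
    equally likely to be any value in risks."""
    n = len(risks)
    e_p, var_p = moments(risks)
    within = sum(n * p * (1.0 - p) for p in risks) / n  # E[Var(N | p)]
    between = n**2 * var_p                              # Var(E[N | p])
    return n * e_p, within + between
```

Calling both functions with the risks `[0.0, 1.0]` reproduces the two-person example: the expected number of cases is 1 in both settings, while the variance is 0 under variability and 1 under pure uncertainty.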
Summary
In summary, $F_{\bar{R}}(r)$ (characterizing uncertainty in risk to the average person and, approximately, in population risk) and $F_{\langle R \rangle}(r)$ (characterizing interindividual variability in expected risk) are both easily estimated, even in cases involving complex risk models with uncertain and interindividually variable parameters. These estimates may generally be sufficient for regulatory decision-making purposes seeking to address both uncertainty in population risk and differences in individual risk. For example, suppose risk-acceptability criteria were desired to ensure that imposed individual lifetime risks are both de minimis and not grossly inequitable and that 70-year population risk is most likely zero cases. An example of corresponding quantitative criteria might be that the relations $F_{\bar{R}}(10^{-6}) > 0.99$, $p_n < 10^{3}$, and $F_N(0) > 0.50$ should all apply.
References
Ang, A. H-S., and W.H. Tang. 1984. Probability Concepts in Engineering Planning and Design, Vol. II: Decision, Risk, and Reliability. New York: John Wiley & Sons.
Bogen, K.T. 1990. Uncertainty in Environmental Health Risk Assessment. New York: Garland Publishing.
Bogen, K.T., and R.C. Spear. 1987. Integrating uncertainty and interindividual variability in environmental risk assessment. Risk Anal. 7:427-436.
EPA (U.S. Environmental Protection Agency). 1989. Risk Assessments Methodology. Environmental Impact Statements: NESHAPS for Radionuclides. Background Information Document, Volumes 1 and 2. EPA/520/1-89-005 and EPA/520/1-89-006-1. Office of Radiation Programs, U.S. Environmental Protection Agency, Washington, D.C.
Kendall, M., and A. Stuart. 1977. The Advanced Theory of Statistics, 4th ed., Vol. 1. New York: Macmillan.