3 QUANTITATIVE CORRELATION BETWEEN MUTAGENICITY AND CARCINOGENICITY

The previous chapter reported experimental studies and data reviews on the qualitative relationship between mutagenicity and carcinogenicity. Most of the studies failed to address whether a quantitative correlation existed, but those which did found no significant correlation. This chapter deals with attempts to discover quantitative relationships, with particular attention to the quality and precision of the quantitative data and to the statistical relationships that have been deduced.

QUANTITATIVE POTENTIAL OF MUTAGENICITY DATA

The ability to extrapolate from chemical mutagens ranked according to the strength of their mutagenicity in simple test systems to effects in systems more relevant to man, such as mammals, would have great advantages in risk estimation. The development of useful algorithms will depend, in part, on the use of relevant measures of mutagenic potency and on the reliability (i.e., reproducibility) of those measures.

McCann and colleagues (24, 56, 74, 75) have begun a thorough statistical examination of estimates of mutagenic potency, using the Salmonella/microsome test as the initial source of data. Others (68, 113) have tried to estimate mutagenic potency from assay data by using the entire dose-response curve. However, because chemical toxicity at high doses is not well understood and mutagenic or toxic mechanisms may differ among chemicals, Bernstein et al. (24) used an alternative approach in which the initial portion of the dose-response curve was estimated with the assumption that it is linear. This assumption is valid if mutagenesis follows single-hit kinetics and low-dose toxicity is negligible. Two measures of potency, calculated from their estimates of the initial slope and intercept of the dose-response curve, have been compared (56): the slope of the dose-response curve and the ratio of slope to intercept (the intercept being an estimate of the spontaneous-revertant background). Salmonella data from the National Cancer Institute/National Toxicology Program (NCI/NTP) were used to show that the initial slope was an appropriate potency measure for one strain (TA100), whereas for three strains (TA98, TA1537, TA1535) the measure incorporating the spontaneous background described the data.

In addition to the NCI/NTP data, Salmonella data from the published literature, representing about 2,500 experiments (75), were compiled. The D100, the dose required to increase the number of revertants by 100 over the spontaneous background (24), was chosen as a measure thought to have a more straightforward biologic interpretation of potency than the slope of induced-mutation frequency. Although there is substantial between-experiment variation in potencies obtained for chemicals under the same conditions, the data still tend to cluster. In view of the millionfold range in potency among chemicals in the test, the clustering phenomenon was seen as a hopeful sign in establishing a definitive potency scheme with Salmonella.

McCann and her colleagues (75) have suggested that a single short-term test is unlikely to be sufficient to predict carcinogenic potency. Even for the Salmonella data alone, it will be important to explore the relevance to carcinogenic potency of such factors as chemical class specificity, species-specific activation, and the effect of induction on S-9 homogenates.

The McCann group has also begun a comparison of the activation potential of rat, mouse, and hamster S-9 liver preparations. Initial results suggest that, because the variance of potencies estimated from replicate experiments under the same conditions is high, only relatively large differences (up to a factor of 30 or 50) can be detected as significant.

In its original report (84), the Committee discussed the desirability of performing such quantitative exercises with mutagenicity data from a variety of test systems.
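As a rough illustration of the potency measures discussed above, the sketch below fits a straight line to the initial portion of a dose-response curve and derives the initial slope, the slope-to-intercept ratio, and D100. It uses ordinary least squares as a stand-in and is not the statistical procedure of Bernstein et al.; the function names and the example revertant counts are hypothetical.

```python
def fit_initial_slope(doses, revertants):
    """Least-squares line through the (assumed linear) initial dose-response points.

    Returns (slope, intercept); the intercept estimates the
    spontaneous-revertant background.
    """
    n = len(doses)
    mean_d = sum(doses) / n
    mean_r = sum(revertants) / n
    sxx = sum((d - mean_d) ** 2 for d in doses)
    sxy = sum((d - mean_d) * (r - mean_r) for d, r in zip(doses, revertants))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_d
    return slope, intercept


def potency_measures(doses, revertants):
    """Compute the three potency measures named in the text."""
    slope, intercept = fit_initial_slope(doses, revertants)
    return {
        "slope": slope,                             # revertants per unit dose
        "slope_over_background": slope / intercept,  # slope relative to background
        "D100": 100.0 / slope,                      # dose adding 100 revertants
    }


# Hypothetical plate counts (revertants per plate) at four low doses.
example_doses = [0.0, 1.0, 2.0, 3.0]
example_counts = [120.0, 170.0, 220.0, 270.0]
print(potency_measures(example_doses, example_counts))
```

Under the linearity assumption, D100 is simply 100 divided by the initial slope, so the two measures rank chemicals identically within a strain; the slope-to-intercept ratio differs because it also depends on the background.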
The Committee stated: "Extrapolation from experimental organisms to man has considerable justification as a qualitative measure, but quantitative extrapolation is uncertain. There is neither sufficient uniformity among systems nor sufficient basic knowledge for quantitative extrapolation to human mutation. Even closely related species differ substantially in metabolism of mutagens and promutagens and in repair capacity."

The Committee's lack of confidence in quantitative mutagenicity data was not absolute, however. If strict adherence to protocols were maintained, it would be possible to rank mutagenicity data within any single test system. Because the range of activity for most tests encompasses several log units, the Committee proposed to classify chemicals within a test according to a logarithmic scale; each log unit would constitute a category. A mutagen could then be described in a semiquantitative, relative way by listing the categories among different test systems. It is important to reiterate that mutagenic
potency was calculated for individual tests and that calculations for different tests should not be numerically combined. Because any chemical may have different results in different tests, this scheme does not permit anything more than a qualitative comparison of groupings among tests.

A formal description of the Committee's mutagenicity potency scheme follows: Suppose that the least potent chemical in a given test has a value of X, which is in the range $10^{k}$ to $10^{k+1}$ units. Each chemical examined in this test is assigned to one of the following potency groups:

Group 1: potency values between $10^{k}$ and $10^{k+1}$
Group 2: potency values between $10^{k+1}$ and $10^{k+2}$
Group 3: potency values between $10^{k+2}$ and $10^{k+3}$,

and so forth until the highest observed value is reached.

The Committee applied this scheme to data from two large sources: a study of 300 chemicals with the Salmonella/microsome test (73) and the GENE-TOX literature review on mutagenesis in V79 Chinese hamster cells (27). Both sets of data spanned eight log units. Most mutagens were moderately active in both tests. For the 23 chemicals that were tested in both systems, there was a rough linear correlation. This is an encouraging result, but it must be further developed.

As part of a program on comparative chemical mutagenicity, Clive (38) performed an exhaustive analysis of mutagenic potency. Clive used data that had been gathered in the comparative-mutagenicity program and arranged them logarithmically within one test. His rationale for the arrangement was similar to that of the Committee: the large variation, among test systems, in the lowest effective concentration at which a chemical is mutagenic. He constructed a mutagenic scale that was normalized to this concentration for each system.
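The Committee's logarithmic grouping can be sketched in a few lines. This is a minimal illustration, assuming potency values are positive numbers in the units of a single test; the function name is hypothetical, and the treatment of a value lying exactly on a power-of-ten boundary (assigned here to the higher group) is an assumption, since the scheme as described does not say.

```python
import math


def potency_group(value, least_potent):
    """Assign a potency value to a log-unit group within one test.

    Group 1 covers 10**k up to 10**(k+1), where k is the integer such
    that the least potent chemical falls in that range; each further
    log unit is the next group.
    """
    if value <= 0 or least_potent <= 0:
        raise ValueError("potency values must be positive")
    if value < least_potent:
        raise ValueError("value is below the least potent chemical")
    k = math.floor(math.log10(least_potent))
    return math.floor(math.log10(value)) - k + 1


# Hypothetical potencies within a single test (arbitrary units).
for v in (0.5, 3.0, 4.2e6):
    print(v, potency_group(v, least_potent=0.5))
```

Because the groups are defined relative to the least potent chemical in each test, group numbers from different tests can be listed side by side but, as the Committee stressed, should not be numerically combined.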
Clive's geometric categorization was within five levels for each test on the scale of the particular test; the width of each category was $P = (\mathrm{HMD}/\mathrm{LMD})^{1/5}$, where HMD and LMD are the highest and lowest mutagenic doses, respectively. Each of the five risk segments is bounded by the expressions LMD, $P(\mathrm{LMD})$, $P^{2}(\mathrm{LMD})$, $P^{3}(\mathrm{LMD})$, $P^{4}(\mathrm{LMD})$, and $P^{5}(\mathrm{LMD}) = \mathrm{HMD}$. For each chemical, a mean risk factor was calculated over all mutagenicity test systems to represent the overall activity of a given chemical. The Committee rejected this extension of relative categorization on the grounds that the use of an average value obscures the great variation in data among tests
and removes the ability to analyze data separately. Also, because the Committee considered strictly mutagenic end points, it did not include Clive's emphasis on in vivo mammalian data on chromosomal aberration in its scheme. Another factor in the Clive evaluation was the consideration of carcinogenicity data, most of which were obtained from IARC (58). Clive included oncogenicity data in the average risk correlations.

In conclusion, both this Committee and Clive constructed systems determined by a semiquantitative ranking of chemicals. Both tested their systems with experimental data. It is difficult to know whether these rank correlations can be extended to in vivo systems.

Zimmermann (120) summarized the general problem of extrapolation among systems as one of metabolism: the metabolism of xenobiotic chemicals and the metabolism of DNA. The metabolism of compounds involves the competing processes of activation and detoxification. Physiologic complexities in an organism mask or create mutagenic activities of chemicals in ways that may be peculiar to the organism. The primary target of a mutagen may not always be DNA, but may be another molecular target, such as an enzyme of DNA metabolism, the spindle-fiber apparatus, or even a membrane. Because of the influence of these complications in calculating genetic risk, Zimmermann concluded that the mutagenicity of a given chemical is not an intrinsic property of the material, but only a potential that is expressed in combination with biologic characteristics.

QUANTITATIVE POTENTIAL OF CARCINOGENICITY DATA

In animal cancer tests, whole mammals, usually rodents, are used as surrogates for man. Because the biologic end point, cancer, is the same, extrapolation to human risk can be used and adjusted for differences in life span, physiology, susceptibility, and other factors. It has become a regulatory principle that, without human data, only data from animal bioassays are acceptable as definitive evidence of carcinogenicity.
However, cancer bioassays are not simple feeding experiments, and problems of interpretation can arise. For example, because the number of animals in an experiment is limited, the highest tolerated doses are often chosen so that the probability of inducing tumors will be as high as possible; but this increases the difficulty of extrapolating to the risk associated with low doses and of interpreting the influence of high doses on metabolism of the test chemical, premature mortality, and fitness of the animals. To ensure that experiments follow proper protocols, the National Cancer Institute and the International Agency for Research on Cancer have prepared guidelines for conducting and interpreting carcinogenicity bioassays.
There are some indications that, even under the high standards set by the two agencies cited above, data are not necessarily accurate. For example, Science has reported flaws in experimental design and laboratory performance in the NCI bioassay program, and the results of animal experiments are often controversial (109).

Qualitatively, the correlation of bioassay data between rodent species appears strong. Purchase (98) analyzed rat and mouse data for 250 chemicals from experiments conducted before 1979. Of these compounds, 44% were carcinogenic and 38% were noncarcinogenic in both rats and mice. Some 64% of the chemicals produced cancer at the same site in both species. Purchase suggested that a chemical positive in one species has about an 85% chance of being positive in a second species.

Several evaluations of cancer data for quantitative potency are underway. Perhaps the most ambitious effort is that of Ames and colleagues (2, 5, 50, 55). The purposes of their program are to search the literature for acceptable data for making potency estimates and developing a proper data base, to calculate a measure of potency, and to analyze sources of variation.

The Ames group defined as a potency index the tumorigenic dose (TD50): the daily dose rate required to decrease by half the probability of an animal's remaining tumor-free at the end of a standard lifetime test (104 weeks for rats and mice). This index is computed so as to account for spontaneous tumor incidence in control animals and for intercurrent mortality. The data for the index must meet the following criteria: (1) exposure must occur chronically over at least half the animals' normal life span, (2) the route of exposure must be analogous to that of an important human exposure (i.e., diet, gavage, drinking water, or inhalation), (3) exposure must be to the whole body rather than specific sites, and (4) appropriate controls must be concurrently included in the experiment.
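To make the TD50 definition concrete, the following sketch estimates it from a single dose group. It assumes a one-hit model in which the probability of remaining tumor-free falls exponentially with dose rate, and it ignores intercurrent mortality; both are simplifying assumptions made here for illustration and are not the Ames group's published computation. The function name and the example numbers are hypothetical.

```python
import math


def td50_one_hit(dose_rate, tumor_free_control, tumor_free_treated):
    """Estimate TD50 under an assumed one-hit model.

    Assumed model: P(tumor-free at dose d) = P(tumor-free at 0) * exp(-b * d).
    TD50 is the dose rate at which the tumor-free probability is half
    that of the controls, i.e. ln(2) / b.
    """
    if dose_rate <= 0:
        raise ValueError("dose_rate must be positive")
    if not (0 < tumor_free_treated < tumor_free_control <= 1):
        raise ValueError("treated group must show fewer tumor-free animals")
    # Dividing by the control fraction corrects for spontaneous tumors.
    b = math.log(tumor_free_control / tumor_free_treated) / dose_rate
    return math.log(2.0) / b


# Hypothetical lifetime study: 90% of controls and 45% of treated animals
# remain tumor-free at 10 mg/kg/day. The treated fraction is exactly half
# the control fraction, so the estimate equals the tested dose rate.
print(round(td50_one_hit(10.0, 0.90, 0.45), 3))
```

A smaller TD50 corresponds to a more potent carcinogen, which is why potency comparisons are often made on the reciprocal or on a logarithmic scale.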
Although Ames reported that over 1,200 experiments with 250 chemicals met these criteria and their results were entered into his data base, only a brief preliminary analysis has been published (2). In this analysis, a potency range of a factor of $10^{7}$ was observed. TD50 values for 18 chemicals that defined this range were depicted. The range in carcinogenic potency that Ames described is very similar to that noted by Fishbein (47) a few years ago. A more formal statistical evaluation of TD50 and a detailed numerical defense will soon be published (50, 107). The proposed TD50 index of carcinogenic potency constitutes a refinement of previous attempts, in that the incidences of spontaneous tumors and intercurrent mortality were accounted for.

Crouch and Wilson (42) have formulated an estimate of carcinogenic potency. At low doses, the dose-response curve is linear and defined by
$$P = a + \beta d \quad \text{for } a > 0,\ \beta > 0,$$

where P is the lifetime probability of an animal's getting cancer, a is the spontaneous cancer incidence, d is the dose of carcinogen, and $\beta$ is the potency of the carcinogen [kg·d/mg]. At high doses, the dose-response curve is

$$P = 1 - (1 - a)\exp\left[-\beta d/(1 - a)\right],$$

which reduces to the linear form for $d \ll (1 - a)/\beta$.

Approximately 90 studies from the NCI Carcinogenesis Bioassay Reports were analyzed, and the carcinogenic potencies of chemicals were compared among both sexes of two rodent species. For many NCI data, a linear correlation deviating by about a factor of 10 was found between the sexes of the two species. The interspecies correlation of potencies was also good (Osborne Mendel rat versus B6C3F1 mouse; Fischer 344 rat versus B6C3F1 mouse). The authors argued that experimental carcinogenesis in one species can be used to estimate potency in another species by determining an "interspecies relative sensitivity factor."

Crouch and Wilson also attempted to correlate the potencies of chemicals between human and experimental cancers. Cancer incidence in man was given by $dN/dt \propto t^{K}$, where dN/dt is the cancer incidence, t is the elapsed time, and K is a function of the cancer site. Data from several sources were used, including a National Research Council report (85) and Meselson and Russell (77). Although the data did not correlate well, an interspecies sensitivity ratio (man:mouse and man:rat) was calculated to be about 5:1.

The model of Crouch and Wilson must be tested against more data. The results of many more carcinogenicity bioassays in which interspecies comparisons are possible should be examined, because the results of interspecies carcinogenicity testing sometimes fail to agree, let alone to correlate numerically (e.g., Ames and McCann (6) and Squire (111)). Also, human carcinogenicity data are probably insufficient for accurate calculations of any human sensitivity factors.
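The Crouch and Wilson dose-response model described above, P = 1 - (1 - a) exp[-βd/(1 - a)] with low-dose limit P = a + βd, can be checked numerically. The sketch below is only an illustration of the two expressions; the function names and parameter values are hypothetical and are not taken from the NCI data.

```python
import math


def lifetime_cancer_probability(dose, a, beta):
    """Full dose-response form: P = 1 - (1 - a) * exp(-beta * d / (1 - a))."""
    return 1.0 - (1.0 - a) * math.exp(-beta * dose / (1.0 - a))


def linear_approximation(dose, a, beta):
    """Low-dose form: P = a + beta * d, valid for d << (1 - a) / beta."""
    return a + beta * dose


# Hypothetical values: 10% spontaneous incidence, potency 0.01 per mg/kg/day.
a, beta = 0.10, 0.01
for dose in (0.0, 1.0, 10.0, 1000.0):
    full = lifetime_cancer_probability(dose, a, beta)
    approx = linear_approximation(dose, a, beta)
    print(f"d={dose:8.1f}  full={full:.5f}  linear={approx:.5f}")
```

At zero dose both forms return the spontaneous incidence a. As the dose approaches (1 - a)/β the linear form overshoots, and at very high doses it exceeds 1, whereas the exponential form saturates below 1 as a probability must.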
Other biologic criteria for modifying carcinogenic potency estimates have been suggested. Squire (111) considered the number of species and number of sites affected, latency periods, dose-response relationship, and severity of the induced lesions. Thus, according to Squire, the most potent carcinogens induce primarily malignant tumors, at multiple sites, in a short period, at low doses, and in both sexes of several species. Although not explicitly designed to assess carcinogenic potency independently, the IARC criteria for evaluating
carcinogenicity in experimental animals (58) have been used in part by Squire and others to formulate carcinogenic-potency schemes. IARC classifies animal carcinogenicity data as "sufficient, limited, or inadequate." Sufficient evidence indicates an increased incidence of malignant tumors in multiple species, in multiple experiments (preferably with different routes of administration or doses), or to an unusual degree with regard to incidence, site, or type of tumor or age of onset. Limited evidence, which suggests a carcinogenic effect, is limited in that the studies involve a single species, strain, or experiment; inadequate dosage, exposure, duration, or followup period was used, there was poor survival, or too few animals were used; or the neoplasms often occur spontaneously or had been difficult to classify as malignant by histologic criteria. Inadequate evidence was defined as having major qualitative or quantitative limitations that prohibit finding of a carcinogenic effect or as yielding, within limits of the test, a conclusion that the chemical is not carcinogenic.

The value of carcinogenic potency relative to other factors used to assess the carcinogenicity of a chemical may be exemplified by the carcinogenicity ranking scheme of Squire (112). Six factors are used to rank animal carcinogens, and five are given equal importance. The five categories are potency (according to dose-response data), the number of species affected, types of neoplasms, spontaneous tumor incidence in the control group, and malignancy of the neoplasm. Given slightly more weight (a maximum of 25% of the highest possible total score) are positive results in a battery of genotoxicity tests.

Squire was cautious on the exclusive use of mathematical models to describe biologic events, particularly carcinogenesis.
He stated that, "for the same animal data, different models may predict levels of risk that vary widely, indicating the potential error involved in estimating carcinogenic potency or human cancer risks by such methods." To illustrate this point, Squire referred to the controversy surrounding the testing of saccharin (83).

This warning against treating too mathematically biologic phenomena whose mechanisms are incompletely understood underscores an uncertainty about calculations of carcinogenic potency. As a concept, potency may be the key component in assessing risk; in practice, uncertainties about experiments and their general applicability to biologic facets of carcinogenesis suggest that potency calculations should be only a part of an overall cancer risk estimation.
DIRECT QUANTITATIVE CORRELATION OF MUTAGENIC AND CARCINOGENIC POTENCIES

Short-term tests for mutagenicity were developed to produce an experimental surrogate for animal cancer bioassays. An early study by Meselson and Russell (77) reported a high positive correlation between mutagenic and carcinogenic potencies for a limited number of chemicals, but these preliminary findings have not been sufficiently supported by later data.

The results of Meselson and Russell were based on 14 chemicals that had been tested by McCann et al. (73) for mutagenicity and examined by IARC for carcinogenicity in 1972-1974. Carcinogenic potency was defined as a TD50 in rodent bioassays, and mutagenic potency as the reciprocal of the number of micrograms of a chemical that produced 100 revertants in the Salmonella/microsome test. The range of mutagenic and carcinogenic activities for the 14 chemicals was a factor of $10^{5}$. Meselson and Russell found a high correlation, except for several nitroso compounds.

Ashby and Styles (13, 14) described the potential implications of the above findings as momentous, and then discussed the difficulties of obtaining a definitive result. Like Squire in his carcinogen ranking scheme (112), they pointed to other biologic factors, such as metabolic activation, absorption, and chemical half-life, as equal in importance to potency (expressed as a dose-response relationship) in assessing carcinogenic risk. Although Ashby and Styles supported the qualitative value of in vitro short-term testing, they cautioned against making quantitative correlations. They discussed the enzymatic variability of S9 microsomal preparations, which may be the major source of variability among laboratories in results of the Salmonella/microsome test. Differences in carcinogenic potency between animal species were also discussed, especially in regard to the general lack of concordance in S9 enzymatic activity and whole-animal tumor susceptibility.
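A potency correlation of the kind described above is usually examined on logarithmic scales, because the potencies span several orders of magnitude. The sketch below computes a Pearson correlation between the logarithms of two potency lists. Treating carcinogenic potency as the reciprocal of TD50 is an assumption made here for symmetry with the mutagenic measure; the function names and the sample values are hypothetical and are not data from Meselson and Russell.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def log_potency_correlation(micrograms_per_100_revertants, td50_values):
    """Correlate log10 mutagenic potency with log10 carcinogenic potency.

    Mutagenic potency is 1 / (micrograms producing 100 revertants);
    carcinogenic potency is taken here as 1 / TD50 (an assumption).
    """
    log_mut = [math.log10(1.0 / m) for m in micrograms_per_100_revertants]
    log_carc = [math.log10(1.0 / t) for t in td50_values]
    return pearson_r(log_mut, log_carc)


# Hypothetical chemicals spanning five orders of magnitude in both measures.
doses_100_rev = [0.01, 0.5, 3.0, 40.0, 900.0]
td50s = [0.02, 2.0, 5.0, 150.0, 1200.0]
print(round(log_potency_correlation(doses_100_rev, td50s), 3))
```

A high coefficient over a wide range says little about how well one potency predicts the other for an individual chemical, which is the caution raised by Ashby and Styles.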
Ames and Hooper (4) responded that linear correlation between mutagenic and carcinogenic potencies may depend on the particular chemical and its interaction with a particular organism. They acknowledged the exploratory nature of Meselson and Russell's findings. Most important, they agreed that, because the Salmonella/microsome test was designed for maximal sensitivity, varying the test parameters will change the apparent potency; this was suggested as one reason not to expect the correlation to be precise or to hold for every chemical class.

The work of Meselson and Russell is not the only one to assert a quantitative relationship between mutagenicity and carcinogenicity. Jones et al. (60) studied 26 nitrosamines in a
V79 Chinese hamster cell system cocultivated with primary rat hepatocytes. For the index used and the materials tested, a linear relationship between mutagenicity and carcinogenicity was established (p < 0.0001). The authors suggested that the V79 cell system could complement the Salmonella system.

Reporting on another rodent cell culture system, the L5178Y/TK+/- mouse lymphoma assay, Clive et al. (39) found an approximately linear relationship between oncogenic and mutagenic potencies for 25 chemicals over a potency range of a factor of $10^{5}$. They found that the 100-fold variation in activity for both end points was a consequence more of dosage differences among chemicals than of biologic effects. However, these findings were tempered with the caution that the perceived correlations were between biologically active doses and that the true correlation may be related to cytotoxicity, rather than genotoxicity. Yet, the authors concluded that these correlations serve as a rough estimate of carcinogenic potency.

A parenthetical observation may help to explain some of the differences seen when Salmonella is used rather than mammalian cells in culture. Scribner et al. (110) have shown that, if bacterial data are adjusted for a molecular size or partition factor for a given chemical with carcinogenic activity, the correlation approaches that found with the V79 Chinese hamster system. This, of course, leads to the speculation that differences in membrane permeability or target access may explain the poor correlation between bacterial and mammalian cells when a broad range of compounds is considered.

Recent potency studies of other classes of compounds have been conducted with the Salmonella/microsome test. Studying 4-dimethylaminobenzene derivatives, Ashby et al. (12) found no quantitative correlation. Parodi et al. compared the correlation of DNA binding and DNA fragmentation with carcinogenicity and mutagenicity for 21 compounds.
From the limited data, a statistical analysis indicated a correlation between DNA binding and carcinogenicity and a weaker correlation between bacterial mutagenicity and carcinogenicity. Hoel et al. (5) have analyzed mathematically the relationship between DNA adduct formation and carcinogenesis. Tumor response could be linearly correlated with the concentration of DNA adducts in the target organs, and the kinetic processes in DNA adduct formation by carcinogens are implicated in the nonlinearities in the dose-response curve for tumor frequency when they occur.

Perhaps the most definitive statement on the quantitative relationship between mutagenicity and carcinogenicity is the recent review of Bartsch, Tomatis, and Malaveille (21) on the basis of IARC carcinogenicity reports. Their first conclusion was a confirmation of the qualitative association between the two phenomena. Their examination concentrated on the 30 chemicals on which human carcinogenicity data are most compelling.
For 130 chemicals on which evidence of experimental carcinogenicity is "sufficient," the qualitative association is equally strong. Four reasons were given for concluding that data are insufficient to establish whether there is a quantitative correlation. First, a universally accepted index of carcinogenic potency has not yet been defined. TD50 has been proposed by Ames and colleagues as a standard index and was used in the Meselson and Russell (77) correlative study, but the scientific community has not yet accepted this definition. Second, few correlation studies have been made, and major debate and contradiction remain unresolved. Third, carcinogenicity indexes for experimental animals are rare and do not cover a representative number of chemical classes. Fourth, in regard to man, epidemiologic studies do not include precise dose-response data.