suggested that the correlations result from the selection of data and the methods of analysis themselves and that they do not have a biologic basis. The main intent of this workshop was to investigate the nature of the correlations and what could be concluded from them. The workshop also addressed related correlations between the LD50 (the dose estimated to kill 50% of animals) and measures of carcinogenic potency and between measures of carcinogenic potency in different species. To have a firm basis for its deliberations, the committee solicited information on the definitions and methods of determining the various measures of toxicity and carcinogenic potency involved in the correlations.
In the last few years, testing at the MTD has been criticized as providing too sensitive a screen for carcinogenicity. Approximately half the materials tested to date in studies using the MTD as one of the doses tested have shown statistically significant increases in cancer incidence in one or more of the four sex-species groups usually tested. It has been suggested that part of the putative oversensitivity occurs because testing at the MTD induces carcinogenesis by mechanisms that are unlikely to operate at lower doses. One suggested mechanism is systemic toxicity, which leads to excess cellular proliferation and ultimately to the development of cancer.
The workshop addressed those criticisms of testing at the MTD. Evidence of various mechanisms of cancer induction at the MTD and their importance at lower doses was presented. The rationale for testing at the MTD and methods for estimating the MTD were discussed. The proper way to interpret results obtained at the MTD was also discussed, as well as some alternative methods for selecting the highest dose for a cancer bioassay.
DEFINING AND DETERMINING THE MTD
Eugene McConnell, the introductory speaker, described how the MTD as currently used is determined (McConnell, 1989):
The NCI publication by Sontag et al. in 1976, entitled Guidelines for Carcinogen Bioassay in Small Rodents, became a standard reference. This publication is known particularly for its definition of an MTD: "The MTD is defined as the highest dose of the test agent during the chronic study that can be predicted not to alter the animals' longevity from effects other than carcinogenicity." The authors further stated that the dose should be one that "causes no more than a 10% weight decrement, as compared to the appropriate control groups, and does not produce mortality, clinical signs of toxicity, or pathologic lesions (other than those that may be related to a neoplastic response) that would be predicted [in the chronic study] to shorten an animal's natural lifespan." Since that time, Sontag et al.'s definition has been restated and redefined by several groups and authors but remains essentially the same. However, in practice there is a different emphasis. In using these guidelines the primary parameter currently for selecting the MTD is the histopathological appearance, with effects on weight gain being a secondary consideration.
Dr. McConnell emphasized that the estimated MTD (EMTD) is selected on the basis of a 90-day or other prechronic test and involves scientific judgment applied to the information available at the end of the test period. Whether the true MTD was administered can be evaluated only after the bioassay has been conducted.
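The quantitative parts of the Sontag et al. criteria quoted above can be illustrated with a minimal sketch. It assumes a prechronic study summarized by one record per dose group. The record fields, the function names, and the example values are all hypothetical. As Dr. McConnell stressed, actual EMTD selection rests on scientific judgment (with histopathology as the primary parameter), so a mechanical screen of this kind could at most be a first pass.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PrechronicGroup:
    """Hypothetical summary of one dose group in a prechronic (e.g., 90-day) study."""
    dose: float                    # administered dose, arbitrary units
    mean_body_weight: float        # mean terminal body weight of the group
    deaths: int                    # treatment-related deaths
    clinical_signs: bool           # clinical signs of toxicity observed
    life_shortening_lesions: bool  # non-neoplastic lesions predicted to shorten lifespan


def weight_decrement(control_weight: float, group_weight: float) -> float:
    """Fractional weight decrement relative to the control group."""
    return (control_weight - group_weight) / control_weight


def estimate_emtd(control_weight: float,
                  groups: Sequence[PrechronicGroup],
                  max_decrement: float = 0.10) -> Optional[float]:
    """Return the highest tested dose that passes every screening criterion.

    The 10% default follows the weight-decrement limit in the quoted definition;
    the remaining checks are a simplified reading of the same passage.
    """
    acceptable = [
        g.dose for g in groups
        if weight_decrement(control_weight, g.mean_body_weight) <= max_decrement
        and g.deaths == 0
        and not g.clinical_signs
        and not g.life_shortening_lesions
    ]
    return max(acceptable) if acceptable else None


# Hypothetical example: the top dose fails on weight decrement (15%).
example_groups = [
    PrechronicGroup(100.0, 390.0, 0, False, False),
    PrechronicGroup(300.0, 370.0, 0, False, False),
    PrechronicGroup(1000.0, 340.0, 0, False, False),
]
example_emtd = estimate_emtd(400.0, example_groups)
```

The function returns `None` when no tested dose qualifies, which mirrors the point that the true MTD can be evaluated only after the bioassay has been conducted.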
John Emmerson, speaking from the perspective of pharmaceutical research, stated that the current long-term rodent carcinogenicity studies lack five of the properties that contribute to precision and reliability of the classical new-drug bioassay:
A specific biologic end point, whose attributes have been determined experimentally.
In the carcinogenicity bioassay, one can quantify a response, but the type of tissue affected, the type of tumor seen, the induction time, etc., are not known at the beginning of the study.
The bioassay is basically a discovery process, which in any other experimental procedure would provide the first data that would permit an investigator to ask good questions and formulate hypotheses for testing in followup studies.
A test system in which the investigator can measure a graded response to increasing doses.
The potential for one to obtain good dose-response data is present; without foreknowledge of the potency of the test substance, the potential is not often realized. Most frequently, a response is observed only at the high dose, the MTD.
A reference standard that permits comparison to the test substance and quantification of potency.
This is nonexistent for virtually all new substances that are assayed.
Attempts to define potency are usually feeble and wholly unsatisfactory, because of the tenuous bridges between the effects observed with chemically unrelated substances tested at different times in assays that fail to provide adequate dose-response data.
An experimental design that can be readily tested to confirm the sensitivity of the assay and rule out the presence of unexpected or unknown variables.
The length of the bioassay precludes ready affirmation of the sensitivity of the animals to the end point.
Whether there are unexpected or unknown variables that affect the outcome is recognized only in retrospect.
The ability to define potency in units that have reliable application to humans.
There is no opportunity for planned, experimental confirmation in humans.
Dr. Emmerson proposed that, in addition to the current definition of MTD, there be added the proviso that the dose be selected with a reasonable assurance that the kinetics of systemic exposure and the physiologic response to treatment are quantitatively proportionate and qualitatively similar to those observed in animals to be given lower doses.
Ian Munro suggested that the MTD should be a dose that:
Is adequate to characterize the chronic toxicity of the chemical without inducing overt toxicity that leads to untimely death from effects that would preclude tumor development.
Does not induce gross disturbances in organ function (as determined by clinical and biochemical methods) that would produce a physiologic state incompatible with normal clinical function.
Is chosen with a full understanding of the pharmacokinetics and metabolic profile in relation to dose, so that one knows in advance that rate-limiting mechanisms or qualitative changes in metabolism can be at play in development (this is important in deciphering whether tumorigenic effects are due to secondary mechanisms).
Does not exceed a dose that produces alterations in nutrient intake or use, lest it be of limited relevance to humans.
The first and third points would not lead to estimation of an EMTD different from that obtained with the current approach. The stipulations in the third point advise that the bioassay be initiated with more information gathered in advance.
In reply to a specific question, Dr. Munro said, "I don't think any of us are saying we should not use high dose in testing."
Daniel Krewski presented the major evidence concerning the correlation between EMTD (or, more accurately, highest dose tested, HDT) and an inverse measure of carcinogenic potency, TD50 (the 50% excess tumor-response dose). The more potent the carcinogen, the lower the expected TD50—i.e., more potent carcinogens produce tumors at lower doses than less potent carcinogens. Dr. Krewski's results are presented in Appendix F. Using data derived from the Carcinogenic Potency Data Bank (CPDB) developed by L. Gold and associates, Krewski et al. related the TD50s and the HDT for 191 compounds. The 191 compounds "were selected to satisfy a number of criteria, including the requirements that the experiment have at least two doses in addition to the controls and demonstrate clear evidence of carcinogenicity." Dr. Krewski pointed out that, in some sense, whatever correlations he found would likely be understatements, inasmuch as both the TD50 and the HDT (presumed MTD) would be subject to experimental error. He described a "shrinkage" technique to reduce the overdispersion. However, this technique had only a small effect.
Several dose-response models were used to estimate the TD50, and the Pearson coefficients of correlation between log TD50 and log HDT were computed. The coefficients are given in Table 1 of Appendix F.
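To make the TD50 concrete, the following sketch works it out for a one-hit dose-response model, P(d) = 1 - exp(-(a + b d)). The one-hit form is an assumption chosen for illustration; the text does not say which models Krewski et al. used, and the function names and parameter values are hypothetical. Under this model the excess (extra) risk over background is 1 - exp(-b d), so the TD50 is ln 2 / b and does not depend on the background term a.

```python
import math


def tumor_probability(dose: float, a: float, b: float) -> float:
    """One-hit model (assumed for illustration): P(d) = 1 - exp(-(a + b*d))."""
    return 1.0 - math.exp(-(a + b * dose))


def extra_risk(dose: float, a: float, b: float) -> float:
    """Excess tumor response over background: (P(d) - P(0)) / (1 - P(0))."""
    p0 = tumor_probability(0.0, a, b)
    return (tumor_probability(dose, a, b) - p0) / (1.0 - p0)


def td50_one_hit(b: float) -> float:
    """Dose at which the extra risk equals 50% under the one-hit model."""
    return math.log(2.0) / b


# Hypothetical slopes: the more potent agent (larger b) has the lower TD50.
td50_weak = td50_one_hit(0.02)
td50_potent = td50_one_hit(2.0)
```

Because the TD50 is inversely proportional to the slope b, it behaves as the inverse measure of potency described above.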
Because of the nature of the limitations on the data, if one makes some reasonable assumptions (e.g., the HDT for different carcinogens follows a log-normal distribution, and the TD50 is uniformly distributed about the HDT "within the limits calculated by Bernstein"), the correlation between log TD50 and log MTD is 0.924. That high (theoretical) correlation suggests that the TD50 (hence, the carcinogenic potency for a material established as a carcinogen) could be predicted by the MTD. Krewski et al. cited further investigation on whether the salmonella-microsome assay might be used to predict carcinogenic potency. A statistically significant positive correlation of r = 0.48 was reported, but the overall scatter was so great as "to preclude the use of the Ames [salmonella-microsome] assay as a quantitative predictor of carcinogenic potency."
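The way such a high theoretical correlation can arise from the design alone can be sketched with a small simulation. This is an illustration under stated assumptions, not a reconstruction of the Krewski et al. calculation: log10 HDT is drawn from a normal distribution, and log10 TD50 is drawn uniformly within a fixed fold range of the HDT. The uniform-on-the-log-scale reading, the spread `sd_log10_hdt`, and the bound `max_fold` are hypothetical stand-ins for the "limits calculated by Bernstein," which are not given here.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_log_correlation(n_chemicals=191, sd_log10_hdt=1.5,
                             max_fold=10.0, seed=0):
    """Correlation of log TD50 with log HDT when TD50 is confined near the HDT."""
    rng = random.Random(seed)
    half_width = math.log10(max_fold)  # allowed offset on the log10 scale
    log_hdt = [rng.gauss(0.0, sd_log10_hdt) for _ in range(n_chemicals)]
    log_td50 = [h + rng.uniform(-half_width, half_width) for h in log_hdt]
    return pearson(log_hdt, log_td50)


def theoretical_log_correlation(sd_log10_hdt, max_fold):
    """Closed form for the same setup: sd / sqrt(sd^2 + w^2 / 3), w = log10(max_fold)."""
    half_width = math.log10(max_fold)
    return sd_log10_hdt / math.sqrt(sd_log10_hdt ** 2 + half_width ** 2 / 3.0)
```

In this setup the correlation rises as the spread of HDTs across chemicals grows relative to the range allowed within one experiment, which is the mechanism the quoted explanation of the TD50-MTD correlation points to.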
Krewski's conclusion was that preliminary estimates of the low dose cancer risk can be based on an estimate of the MTD. Citing Gaylor (1989), he reported that dividing the MTD by a factor of 380,000 will approximate the 10⁻⁶-risk dose if the linearized multistage model is used, without determining whether the material is a carcinogen. On the general issue of TD50-MTD correlations, he said that "correlations between MTD and the TD50 occur as a result of the narrow range of possible potency values within a single experiment in relation to the wide variation observed in the potency of chemical carcinogens." He noted that fact in relation to "suggestions that the observed correlation between the MTD and the TD50 may simply be an artifact of the experiment designs currently used in carcinogen bioassay." He further stated that "this does not imply that estimates of carcinogenic potency based on bioassay data are not meaningful, but does demonstrate that both the TD50 and q1* [the upper 95% confidence limit linear term in the linearized multistage dose-response curve] represent relatively crude indicators of risk."
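The arithmetic behind the rule attributed to Gaylor (1989) can be sketched as follows. The divisor of 380,000 and the target risk of one in a million come from the text; the function names and the example MTD are hypothetical. The second function assumes that risk is proportional to dose at low doses, which is how the linearized multistage model behaves in that range.

```python
def risk_specific_dose_from_mtd(mtd: float, divisor: float = 380_000.0) -> float:
    """Approximate dose associated with a one-in-a-million risk, as MTD / divisor."""
    return mtd / divisor


def implied_linear_slope(mtd: float, target_risk: float = 1e-6,
                         divisor: float = 380_000.0) -> float:
    """Risk per unit dose implied by the rule, assuming low-dose linearity.

    With the defaults this equals 0.38 / mtd.
    """
    return target_risk / risk_specific_dose_from_mtd(mtd, divisor)


# Hypothetical example: an MTD of 380 dose units.
example_dose = risk_specific_dose_from_mtd(380.0)   # about 0.001 dose units
example_slope = implied_linear_slope(380.0)         # about 0.001 per dose unit
```

The implied slope depends only on the MTD, which illustrates why the rule can be applied without determining whether the material is a carcinogen.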
Krewski noted that measures other than the EMTD could also be used as predictors of carcinogenic potency, for example, acute toxicity as measured by the LD50. He also reviewed the data on the correlations of carcinogenic potency among different species and remarked that, in view of the high correlations between potency and the HDT and the wide range of carcinogenic potency, one should expect to find high correlations between separate species. The predictability from one species to another is "not within a factor of one and a half or two, but may be ten-fold in either direction," which Krewski considers rather good.
Krewski et al. did not reach any firm conclusions. The two ends of the spectrum that they saw were: (1) "Because of all these correlations, some of which may be artifactual, we don't have a good instrument … to assess the human cancer risk." (2) "The limits we are currently using are … statistically as consistent as you can actually get with the experimental data. … From that point of view they are reasonable." (He added that "it seems that additional information beyond that contained in traditional experiments will be required. … More sensitive indicators of effects at very low doses … may also serve to provide improved estimates of risk in the future.")
The discussants who followed Dr. Krewski were Kenny Crump, Lauren Zeise, Thomas Starr, and Edmund Crouch.
Dr. Crump reported on the correlations between the laboratory animal data and the potency (as measured by a TD25 dose in humans for 23 confirmed carcinogens) (Allen et al., 1988). He and his colleagues found a high correlation of about 0.8. (Different methods of analysis involving different assumptions led to slightly different correlations.) His data showed, qualitatively, "that chemicals that are more highly carcinogenic in animals tend to be also more highly carcinogenic in humans," despite the fact that humans are rarely exposed to a human equivalent of a laboratory MTD, or for a lifetime of exposure. He added that "we did not find … that the estimates currently being made tended to grossly overestimate or underestimate the risks actually estimated directly from the … data."
Dr. Zeise reported on her work with Crouch and Wilson related to the correlations, asking whether the correlations were "real" or artifactual and whether they could be used to predict carcinogenic potency (Zeise et al., 1984; 1985; 1986). She reported that materials tested at doses that caused weight depression early were more likely to be reported as carcinogens than materials that did not. In that regard, she referred to the work of Haseman (1985), who found that about 85% of the carcinogens examined yielded no evidence of cell proliferation in at least one of the tissues where cancers were found.
With respect to the departures from the relationship between HDT and potency, she found few materials that clearly resulted in low toxicity and high cancer potency. (Such materials would be the so-called supercarcinogens.) Dr. Zeise did note that several materials had low toxicity and produced cancers after a very short exposure; they might therefore be considered supercarcinogens of another kind and had been excluded from her study. Among materials of that type were three benzidine dyes that produced cancers in 13 weeks, after which the experiments were terminated.
Dr. Zeise concluded that the correlations were not completely artifactual and that some analyses on individual animals, considering time to tumor, might improve the estimates. She noted, however, that the data did not answer the question of whether "toxicity is in fact causing cancer."
Dr. Starr recalled an earlier paper of his with Rieth (Rieth and Starr, 1989a) and, by way of summary, reiterated the conclusion of that paper: "We hold the opinion that the chronic rodent bioassay in and of itself is altogether inadequate as the data source for estimating the risk to humans from exposure to carcinogenic chemicals." He also quoted Ames and Gold (1990): "Thus, without studies of the mechanism of carcinogenesis, the fact that a chemical is a carcinogen at the MTD provides no information about low dose risk to humans." He also reported on his more recent work (Rieth and Starr, 1989b) looking at the upper bounds of estimates of carcinogenic potency (by looking at lower bounds of TD50s) and comparing materials that were carcinogenic in both rats and mice, in one species only, or in neither. Dr. Starr found that studies that yielded no evidence of carcinogenicity still yielded evidence of correlation ("nearly as good") between upper-bound estimates of potency and the HDT. The correlation he reported for the so-called negative-negative materials was 0.88, the highest of the four correlations he computed. That and related computations led him to conclude that "it's my opinion, at least, that this business is not giving us much information that is useful at all in quantifying low dose risk." Finally, he objected to Krewski's word "measure" and stated that he preferred to use the phrase "crude estimate," noting, among other things, that all estimates are model-dependent.
Dr. Starr's proposals to escape the problems he discussed involve going from an administered dose to a target-tissue dose and moving to a model more like the "two-stage growth-death model of Moolgavkar" (Moolgavkar and Venzon, 1979; Moolgavkar and Knudson, 1981). They should also involve looking at cell turnover rates in the normal cell compartment and in the initiated-cell compartment.
Dr. Crouch expressed concern about imposing constraints on the data because of mathematical needs or imposing limits in estimation that are not imposed by nature ("nobody imposed that constraint on the animals"). He expressed his belief that the correlations were real and important: "The reason that you're getting the correlations is that nature is telling you something." He also proposed several alternatives, including that the test material (at the doses tested) directly caused DNA damage in cells and that such damage might lead to both acute toxic and carcinogenic effects. However, risk assessment does not require knowledge of causality, but only knowledge of correlations.
The afternoon session was entitled "What are Bioassays Conducted at the MTD Telling Us?" The presenter was Bruce Ames, substituting for his colleague Lois Gold. Dr. Ames's discussion touched on evolutionary issues, cancer as a disease of old age, the mutagenic activity of oxygen radicals, and responses to infection. That led into a discussion of DNA damage and (somatic) mutation, and he noted that "it's hard to mutate a cell unless it's dividing." From that and related arguments, Dr. Ames developed the idea that promotion is essentially related to cell division and that what stimulates cell division increases the likelihood of cancer development. He cited the work by Henderson and Preston-Martin (1990), who associated several human cancers with agents "causing a lot of cell proliferation."
Dr. Ames described how the CPDB was developed, and from there he moved into an argument about the importance of so-called natural carcinogens, of which slightly fewer than half are positive (i.e., are carcinogenic) in one or more species-sex groups. He argued that that is an extremely important finding, because "almost all the chemicals in the world are natural." He pointed out that plant breeders are breeding plants to be insect-resistant, and some of the natural insecticides developed (or increased) in breeding programs might act as carcinogens for humans, although the toxic chemicals in plants tend to be species-specific. He remarked that "we are living in a world of toxic chemicals which come from these plants." In addition, he noted the likely presence of many plant anticarcinogens.
Eugene McConnell, commenting on Dr. Ames's discussion of the place of cell proliferation in carcinogenesis, said that "having looked at between 100 and 200 chemicals in animal bioassays [I find that] … many of these [carcinogenic] chemicals show cell proliferation in organs where the tumors are seen. … I also note that … many chemicals that also cause cell proliferation … are not carcinogens." In reply to a comment by Richard Reitz on the effects of applying risk assessment methods to materials in a common diet, Dr. Ames suggested testing a "random group of nature's pesticides." Dr. Ames (replying to a question by Jill Snowden) again raised the issue of the presence of anticarcinogens in fruits and vegetables.
Returning to the MTD issue, Miriam Davis noted that "90% of chemicals that were carcinogenic at high doses were also producing … tumors at lower doses." That was followed by a discussion of the results of testing at doses lower than the MTD—with some suggestions for testing more food chemicals, but with no recommendations about the number of animals needed to retain tests of satisfactory power. Dr. Ames's major comment was that "when you have a high dose, it's hard to go to a low dose … We have to learn more about mechanisms to predict … [whether something is] a carcinogen."
Michael Gallo gave a history of carcinogenicity testing, including the considerations that enter into the selection by the National Toxicology Program of materials to test. He characterized the current studies as "excellent toxicology" but noted that they were not designed for risk analysis. Risk analysis requires more information than the current bioassays present. Dr. Gallo recommended short-term testing at toxic doses and "then … back[ing] down the dose-response curve and defin[ing] the shape of that … response curve."
David Gaylor discussed the statistical issues and the data that could and could not be added by modifications in the design of the current bioassays (e.g., by adding more low doses). He noted that "persons who do risk estimation [do not] believe … that the number we come up with is in any sense a precise number. … But, apart from the low dose extrapolation, the uncertainty in the data seems to be on the order of a maximum of about 100." That being the case, "bioassays are perhaps getting us in the right ballpark and certainly can be used to rank carcinogens." He noted that low dose extrapolation is more affected by the results at the low doses than at the high dose. Furthermore, the higher the background rate of cancer, the more likely that there will be a linear term in the fitted dose-response model. Dr. Gaylor invoked the idea of endogenous factors that produce tumors without the addition of chemicals as an argument to demonstrate that low dose effects added to background should increase tumor rates.
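One common way to formalize the additive-background argument is sketched below. It is offered as an illustration rather than as Dr. Gaylor's own calculation. It assumes that an added dose acts by the same mechanism as an endogenous "background dose," so that the response is evaluated at their sum. The purely quadratic response curve, its coefficient, and the dose values are hypothetical.

```python
def quadratic_response(total_dose: float, k: float = 1e-4) -> float:
    """Hypothetical dose-response curve with no linear term: k * d**2."""
    return k * total_dose ** 2


def excess_risk(added_dose: float, background_dose: float,
                response=quadratic_response) -> float:
    """Risk added by a chemical dose on top of an endogenous background dose."""
    return response(background_dose + added_dose) - response(background_dose)


# Doubling a small added dose roughly doubles the excess risk when there is a
# background (locally linear), but quadruples it when there is none.
small_dose = 0.01
ratio_with_background = (excess_risk(2 * small_dose, 10.0)
                         / excess_risk(small_dose, 10.0))
ratio_without_background = (excess_risk(2 * small_dose, 0.0)
                            / excess_risk(small_dose, 0.0))
```

Even though the assumed curve has no linear term of its own, the excess risk near a nonzero background is approximately proportional to the added dose, which is the sense in which a background rate favors a linear term in the fitted model.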
In the ensuing discussion stimulated by prepared remarks of Drs. Munro, Wilson, and Engler, issues were raised about the place (and development) of more appropriate biologic models, the relative potency of mutagenic versus nonmutagenic carcinogens, the classification and ranking of carcinogens (based on weight of evidence, rather than potency), and the application of bioassay results to prevent (putative) risks lower than can be measured epidemiologically or with bioassays. The point was made that regulators need be concerned about the expected human exposure (dose) as much as or more than about an absolute measure of potency. A highly potent material to which no one is, or can be, exposed poses no risk. Finally, data were introduced to show the potential for predicting carcinogenicity (for chemicals in well-defined specific classes) by using data from, for example, the salmonella short-term assays. It was noted that, operationally, "60% of the substances that have come to NTP [for testing have come] because of the suspicion of carcinogenicity."
Options for performing bioassays and for using bioassay data in the identification of human carcinogens were extensively discussed. The options are incorporated in a modified form in the main body of this report.