OCR for page 206
sometimes accompany changes in crown-rump length or skeletal ossification. Altered growth can occur at any stage of development, and it can be reversible in some cases or permanent in others. Most current study designs do not allow differentiation between reversible and permanent changes.
Functional developmental toxicity is the study of alterations or delays in the physiological or biochemical competence of an organism or organ system after exposure to an agent during pre- or postnatal development. In any given test animal, delayed development can be assessed in relation to established landmarks for physical, behavioral, and sexual maturation.
Types of Studies
Two types of studies specifically designed to assess developmental toxicity are discussed in this section: the prenatal developmental toxicity study and the developmental neurotoxicity study. Several other types of studies, although not solely designed to assess developmental toxicity, can be used for that purpose. They include single- and multigeneration reproduction studies, reproductive assessment by continuous-breeding studies, and serial mating (dominant lethal) studies discussed in later sections.
Prenatal Developmental Toxicity
The prenatal developmental toxicity study provides information on the effects of repeated exposure to an agent during pregnancy (OECD 2000a; EPA 1998a; FDA 2000). It is normally conducted in two species, a rodent (usually rat) and a nonrodent (usually rabbit), although not all guidelines specify nonrodents. Animals are exposed to an agent, usually via ingestion or inhalation, during the period of major organogenesis. The protocols include exposure to the end of gestation in order to cover developmental events that occur later in gestation (e.g., central nervous system, skeletal growth, sexual differentiation). Offspring are delivered by cesarean section on the day before the expected day of parturition, and a maternal necropsy is conducted, including examination of the uterus for number of implantations, resorptions,
fetal deaths, and live fetuses. Corpora lutea in the ovaries are also counted. Live fetuses are weighed and examined carefully for external, visceral, and skeletal malformations and variations. Although the terminology used for malformations and variations has been variable from laboratory to laboratory, attempts have been made at standardization (Wise et al. 1997).
Developmental Neurotoxicity
The objective of developmental neurotoxicity studies is to assess the potential of an agent to affect neurodevelopment (EPA 1998c). The protocol is designed to be used either as a separate study, usually as a follow-up to other studies, or as part of a multigeneration reproduction study. A test agent is administered at a minimum of three dose levels to pregnant animals from day 6 of gestation through postnatal day 10 (the first half of lactation), in groups large enough to produce 20 litters per dose group. (This is the minimum exposure period. Dosing can be continued throughout lactation or, in the context of a multigeneration study, given daily over two generations.) Pregnant and lactating dams are assessed for clinical signs of neurodevelopmental effects and for their performance in a functional observation battery. Litter sizes can be adjusted by random selection to provide equal numbers of male and female offspring (usually four of each). Offspring are randomly selected from litters for neurotoxicity evaluation, including gross neurological and behavioral disorders, motor activity, response to auditory startle, learning and memory, brain weight, and neuropathological examination. Motor activity is studied on postnatal days 13, 17, 21, and 60 ± 2. Auditory startle tests are conducted on postnatal days 22 and 60 ± 2. Learning and memory are evaluated in the offspring around the time of weaning (postnatal day 21) and again in adulthood (postnatal day 60 ± 2). Neuropathology is examined in the offspring on postnatal day 11 and at the termination of the study. The neuropathology analysis includes simple morphometric measurements of brain areas.
Although these studies are designed to specifically assess the effects of developmental exposures on nervous system structure and function, they are limited in the extent to which this complex system can be evaluated as part of routine testing. For example, assessment of
social and reproductive behavior and of conditions such as anxiety is not included; different types of learning and memory (such as spatial and sequential learning, reference and working memory, or the effects of recall delay) are not assessed; and long-term effects of developmental exposures (beyond 60 days) are not evaluated. Several efforts are under way to evaluate the utility of such protocols and to improve the methods used in rodent studies so that they are more comparable to those used in humans.
In Vitro Assays
Any developmental toxicity assay that uses a test subject other than a pregnant mammal falls under the general heading of an “in vitro assay.” Examples include isolated whole mammalian embryos in culture, nonmammalian embryo culture, and tissue, organ, and cell culture. Several manipulations are possible using in vitro assay systems that are not possible using pregnant mammals, such as the removal of the maternal environment, the removal or transplantation of specific tissues and cells, and the ability to track specific cells and molecules, to genetically alter cells, or to monitor embryo physiology.
There are two potential applications for in vitro assays: screening for developmental toxicity and analyzing mechanisms of normal and abnormal development. In vitro assays to screen chemicals for potential developmental toxicity have been under development for approximately 15 years with the idea that they could be used to assess larger numbers of chemicals than can be evaluated with in vivo developmental toxicity tests in mammals, could reduce the number of experimental animals used in those tests, and could be used to reduce the costs of testing large numbers of chemicals. A number of attempts have been made to validate in vitro assays for screening chemicals, and efforts are under way to validate the rodent embryo culture, micromass, and stem cell assays in a European-sponsored trial (Spielmann et al. 1998), and the frog embryo teratogenicity assay (Xenopus) in an interlaboratory comparison (Fort et al. 1998). Validation requires certain considerations in study design, including defined endpoints for toxicity, an understanding of the procedure's ability to respond to chemicals that require metabolic activation, and the accuracy of the test's response to chemicals that cause developmental toxicity or no effect in whole
animal studies (Kimmel et al. 1982; Kimmel and Kochhar 1990; Schwetz et al. 1991). Since most in vitro systems involve an interruption in normal metabolism and the biological interrelationships found in the intact system, the range of developmental effects that can be produced and the power of the study to detect an effect are compromised compared with those obtained using standard study designs in whole animal systems (Kimmel 1990). For these and other reasons, in vitro developmental toxicity assays are unlikely to be used alone to screen chemicals for risk assessment purposes when there is no prior knowledge about the potential for developmental toxicity. In the case of priority-setting in early drug or chemical development, such assays may be useful for eliminating candidates with toxicity that can be detected in these systems, leading to further development of those with little or no toxicity, with the expectation that standard in vivo assays would be conducted before actual marketing. In vitro screens also may be useful for assessing the developmental toxicity of chemicals or chemical classes for which there is already some information about toxicity from in vivo studies, for the purpose of describing the relative toxicity (potency) of members of chemical families. If chemicals are likely to act through a common mechanism, a single in vitro screen that is sensitive to a particular mechanism may predict the relative potencies of a class of chemicals. For example, an in vitro mouse limb bud cell screen has been used successfully to rank the relative teratogenic potential of a large series of synthetic retinoids (Kistler 1987). In addition, in vitro assays may be useful for studying complex mixtures for synergism or antagonism, and for evaluating the cumulative risk of two or more chemicals that have similar mechanisms or effects.
In vitro assays have become widely used for mechanistic studies in developmental toxicology (Harris 1997). An advantage to using in vitro assays for such studies is that they utilize decreasing levels of biological complexity to isolate specific developmental processes. In vitro assays are useful for identification of tissue sites of accumulation, initial biochemical insults, gene expression changes, structure-activity relationships, and disrupted developmental pathways. It is important to link the information developed in these assays to the whole tissue and organism events that are seen as a result of developmental toxicity in order to be most useful for risk assessment purposes. Such information can be employed in developing biologically based dose-response models for developmental toxicity (e.g., Shuey et al. 1994).
Box D-1 lists endpoints that can be used to assess developmental toxicity from standard testing studies. EPA (1991) published guidelines for developmental toxicity risk assessment that provide more detailed discussion of study result interpretation.
Observations on dams during the course of a study include regular examination for signs of toxicity and measurements of body weight. Assessment of food and water intake also can indicate toxicity and is essential to calculate actual test substance intake when the substance is administered in the diet or in drinking-water. When an agent is known to produce pharmacological or toxic effects, including sedation, respiratory depression, or hemolysis, such endpoints also are monitored. Maternal observations assess the relative contribution of maternal toxicity to any embryo-fetal toxicity observed. Maternal body weight before and after removal of the gravid uterus allows the determination of toxicity to the mother exclusive of effects on uterine content.
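The two maternal calculations mentioned above can be sketched as follows. This is a minimal illustration rather than a prescribed method: the function names and units are assumptions, corrected maternal weight is taken as terminal body weight minus gravid uterine weight, and dietary intake is taken as food consumed times dietary concentration divided by body weight.

```python
def corrected_maternal_weight(terminal_body_weight_g: float,
                              gravid_uterus_weight_g: float) -> float:
    """Maternal weight exclusive of the uterus and its contents (grams)."""
    if gravid_uterus_weight_g > terminal_body_weight_g:
        raise ValueError("uterine weight cannot exceed body weight")
    return terminal_body_weight_g - gravid_uterus_weight_g


def dietary_intake_mg_per_kg_day(food_g_per_day: float,
                                 conc_mg_per_g_diet: float,
                                 body_weight_g: float) -> float:
    """Achieved dose (mg/kg body weight/day) for a substance mixed in the diet."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    # Convert body weight from grams to kilograms before dividing.
    return food_g_per_day * conc_mg_per_g_diet / (body_weight_g / 1000.0)


if __name__ == "__main__":
    # Hypothetical dam: 400 g at necropsy, 80 g gravid uterus.
    print(corrected_maternal_weight(400.0, 80.0))
    # Hypothetical intake: 20 g diet/day at 0.5 mg/g, 250 g body weight.
    print(dietary_intake_mg_per_kg_day(20.0, 0.5, 250.0))
```

The same intake calculation applies to drinking-water studies if water consumption and concentration in water are substituted for the dietary values.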
Examination of the uterus and its contents and of the ovaries of animals that are killed before parturition allows determination of the number of corpora lutea (a measure of the number of eggs released); implantations; live, dead, and resorbing fetuses; fetal weight; and sex. The number of implantation sites equals the number of live fetuses plus the number of dead embryos and dead fetuses. Preimplantation loss can be determined by subtracting the number of implantation sites from the number of corpora lutea. It is possible that the treatment can prevent implantation, and caution should be applied when interpreting the number of implantation sites and preimplantation loss. Dividing the number of resorptions (embryonic deaths) by the total number of implants gives a measure of postimplantation loss, subject to the same caution as above. It should be noted that postimplantation loss is sometimes expressed inclusive of fetal deaths. Uteri that show no signs of implantation at all can be stained with ammonium sulfide to reveal completely resorbed implantation sites (Salewski 1964).
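The arithmetic relating these uterine counts can be sketched in a few lines. The class and method names below are hypothetical and the example counts are invented for illustration; the formulas follow the definitions in the paragraph above, including the option of expressing postimplantation loss inclusive of fetal deaths.

```python
from dataclasses import dataclass


@dataclass
class UterineCounts:
    """Counts recorded for one dam at necropsy."""
    corpora_lutea: int
    live_fetuses: int
    dead_fetuses: int
    resorptions: int  # dead embryos

    @property
    def implantations(self) -> int:
        # Implantation sites = live fetuses + dead embryos + dead fetuses.
        return self.live_fetuses + self.resorptions + self.dead_fetuses

    def preimplantation_loss(self) -> int:
        # Corpora lutea minus implantation sites.
        return self.corpora_lutea - self.implantations

    def postimplantation_loss(self, include_fetal_deaths: bool = False) -> float:
        # Resorptions (optionally plus fetal deaths) over total implants.
        if self.implantations == 0:
            raise ValueError("no implantation sites; loss is undefined")
        lost = self.resorptions
        if include_fetal_deaths:
            lost += self.dead_fetuses
        return lost / self.implantations


if __name__ == "__main__":
    dam = UterineCounts(corpora_lutea=14, live_fetuses=10,
                        dead_fetuses=1, resorptions=1)
    print(dam.implantations, dam.preimplantation_loss())
    print(dam.postimplantation_loss(), dam.postimplantation_loss(True))
```

Raising an error when no implantation sites are found reflects the caution in the text: in that case the ratio is undefined and the uterus may need staining to reveal resorbed sites.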
Viable fetuses are examined for external, visceral, and skeletal malformations and variations, and the sex is determined. Individual fetal weight and identification allow external, visceral or skeletal
not provide information on effects in the second generation unless F1 pups are raised to breeding age and mated to produce a second generation as in the multigeneration study design. Other limitations are noted in
Serial Mating Study (Dominant Lethal Study)
If a single-mating trial results in an adverse effect attributable to the male, it is difficult to determine the developmental stage in which the disruption occurs. It is well known that different stages of spermatogenesis are variably sensitive to toxic effects and that each toxic substance can affect different sperm cell populations (Parvinen 1982). Spermatogonia, for example, are sensitive to cyclophosphamide in experiments conducted in mice (Toppari et al. 1990), whereas spermatocytes are disrupted by ethylene glycol monomethyl ether in experiments conducted in rats (Chapin et al. 1985). The action of a compound that primarily affects the somatic Sertoli cells of the testis, for example, m-dinitrobenzene (Foster 1989), will produce an extensive period of infertility because of adverse effects on the function of these cells at various stages of germ cell differentiation.
Serial mating makes it possible to assess the sensitive stages of spermatogenesis and susceptible cell types. This information can be obtained from a specific serial-mating trial or from a similar protocol used for dominant lethal testing (OECD 1984; EPA 1998d). Adult males (usually rats) are exposed before mating, typically for 1-5 d, with 20 males per dose group, after which they are mated to one to three females weekly for the next 8-10 wk. Adverse effects on male reproduction are manifested as decreased numbers of implantation sites in uteri (indicative of failure of fertilization or preimplantation loss) and increased early fetal mortality (indicative of postimplantation loss or dominant lethality). To examine the uterine contents, dams are killed before parturition (e.g., on days 13-18 of pregnancy).
Any adverse effects can then be attributed to specific cell populations by back-calculation on the basis of the well-known kinetics of spermatogenesis (Chapin et al. 1985). The test was originally designed for detection of germ cell mutagenicity, and it requires a large number of female animals (e.g., an 8-wk trial would use 160-480 females), which is a disadvantage.
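The quoted range of 160-480 females is consistent with a group of 20 males, each mated to one to three females per week for 8 weeks. A short sketch of that arithmetic follows; the function name is hypothetical, and the assumption that each male receives fresh females every week is inferred from the figures rather than stated in the protocol.

```python
def females_required(males_per_group: int,
                     females_per_male_per_week: int,
                     weeks: int) -> int:
    """Females needed when each male is mated to new females every week."""
    return males_per_group * females_per_male_per_week * weeks


if __name__ == "__main__":
    low = females_required(20, 1, 8)
    high = females_required(20, 3, 8)
    print(f"8-wk trial, 20 males: {low}-{high} females")
```

Extending the trial to 10 weeks, or running several dose groups, scales the requirement proportionally.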
The limitations of serial-mating trials are similar to those shown in Table D-1 for other reproduction studies, except for the identification of stage of spermatogenesis affected. Additional endpoints of male reproductive toxicity and effects other than death of the offspring are not evaluated unless included in the protocol.
Total Reproductive Capacity
The total reproductive capacity study, a variant of the continuous-breeding study, is designed to assess ovarian toxicity. Female fetuses are particularly susceptible to agents that can adversely affect germ cells because development of the oocyte occurs prenatally; no new germ cells develop after birth. Female animals are exposed to a test substance for a short period in utero (i.e., days 9-16 of gestation) (McLachlan et al. 1981) or postnatally (Generoso et al. 1971) and allowed to mate with a single male as long as the females remain fertile. The numbers of litters and offspring are compared with those of control animals to estimate the loss of oocytes resulting from the exposure.
Total reproductive capacity studies have been designed with the specific purpose of evaluating female reproductive capacity and are not tests of general reproduction function.
Well-conducted multigeneration and continuous-breeding studies can provide data that demonstrate changes in the key parameters of male and female fertility and reproduction. Statistically significant, dose-related changes in the indices listed in Table D-2 provide sufficient evidence of reproductive toxicity but by themselves do not identify the affected sex. Because most multigeneration or continuous-breeding studies place test males with females treated at the same dose, they cannot identify which sex is affected. Although such studies are the most typical way to evaluate the reproductive toxicity of an agent, most provide insufficient evidence of whether the agent causes male or female reproductive toxicity in animals. There is, therefore, a need for additional data, which, in fact, can come from the same study. For example, evidence of gonadal toxicity measured by testicular
TABLE D-2 Indices of Fertility and Reproductive Function
weight or altered morphology can provide sufficient evidence that an agent is a male reproductive toxicant or add weight to evidence that it is not a male reproductive toxicant. Likewise for females, evidence of ovarian toxicity measured by weight changes and altered morphology can provide sufficient evidence for female reproductive toxicity. Another way to provide sufficient evidence of male reproductive
toxicity would be to mate the treated animal of one sex to the untreated animal of the other sex.
A statistically significant, dose-related decrease in absolute or relative testicular weight is generally sufficient evidence that an agent can cause reproductive toxicity in animals. Most agents that cause testicular toxicity also cause decreases in testicular weight, but if they cause edema, the testicular weight increases. Decreases in testicular weight can be considered sufficient evidence of toxicity by themselves, but increases must be explained by other endpoints, such as morphology. Any changes also must be considered in light of the systemic toxicity elicited by the test chemical. Severe systemic toxicity brings into question not only the organ weight data, but also the relevance of any other reproductive effects.
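The distinction between absolute and relative organ weight matters when systemic toxicity reduces body weight. As a minimal sketch, relative weight is computed here as organ weight expressed as a percentage of body weight, which is a common convention but an assumption as far as this text is concerned; the function name and example values are hypothetical.

```python
def relative_organ_weight_percent(organ_weight_g: float,
                                  body_weight_g: float) -> float:
    """Organ weight as a percentage of terminal body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return 100.0 * organ_weight_g / body_weight_g


if __name__ == "__main__":
    # Hypothetical rat: paired testes 3.5 g, body weight 350 g.
    print(relative_organ_weight_percent(3.5, 350.0))
```

If body weight falls while absolute testicular weight is unchanged, the relative weight rises, which is one reason both measures are examined alongside signs of systemic toxicity.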
Weight changes in male accessory sex organs can indicate significant functional effects. Both the seminal vesicles and the prostate, for example, contain a large proportion of luminal fluid that can decrease rapidly when androgenic hormone concentrations decline. Epididymal weight is largely affected by the number of sperm present in the epididymis. Statistically significant, dose-related decreases in the weight of the epididymis would be sufficient evidence of male effects. Decreases in the weight of the seminal vesicles or ventral prostate can be sufficient evidence of male reproductive toxicity, but are more useful if supplemented by data on endocrine effects. Changes in pituitary weight alone would typically be insufficient evidence of male reproductive toxicity, because pituitary weight is an inaccurate indicator of changes in pituitary function, which are best measured by other parameters, such as hormone concentrations. Furthermore, only a small portion of the gland is involved with reproductive function.
Changes in testicular morphology are best observed when the tissues are preserved by optimal methods. The best evaluations can be
done on testes fixed by perfusion and embedded in a plastic, such as glycol methacrylate. More conventional, but still quite acceptable, morphologic investigations can be performed on testes fixed by immersion in Bouin's fixative, embedded in paraffin, and stained with PAS. Formalin fixation and paraffin embedding of testes is an inferior and generally inadequate method for the study of testicular pathology because it will reveal only the most severe effects. In formalin-fixed and paraffin-embedded tissues, only the most severe changes in the seminiferous epithelium of the testis could be considered sufficient evidence of male effects. The sensitivity of these evaluations can be substantially improved by more careful fixation, embedding, and observation techniques. Low-quality morphological techniques, such as formalin fixation and paraffin embedding, are never sufficient to show that an exposure did not produce testicular toxicity.
Morphological changes in accessory sex organs are less common, but clear treatment-related effects also can provide sufficient evidence of male effects.
Fertility studies do not incorporate measures of sexual behavior, but they indirectly measure endpoints that can be altered by effects on sexual behavior. These measurements include collecting vaginal smears to check for the presence of sperm or checking vaginal plugs as evidence of mating. An azoospermic male, however, might have normal sexual behavior but will not have a “sperm-positive” mating. Thus, even though a decrease in sperm-positive matings can be sufficient evidence of reproductive toxicity, it would not be sufficient evidence of abnormal sexual behavior. If a study does measure sexual behavior, mounting frequency, intromission, ejaculation number, and latency can be measured. More detailed studies of sexual behavior (Zenick and Clegg 1989) would be helpful, but are rarely done.
In mice and rats, sperm motility and count are relatively sensitive and reliable indicators of male reproductive toxicity (Morrissey et al. 1988a,b). Statistically significant, dose-related decreases in these
parameters would constitute sufficient evidence of male reproductive toxicity, even if fertility is not adversely affected. Sperm morphology changes, if statistically significant and dose-related, would be sufficient evidence of reproductive toxicity. Experience has shown, however, that sperm morphology changes in rodents are fairly insensitive indicators of reproductive toxicity (Morrissey et al. 1988a,b) even though they can be good indicators of reproductive dysfunction in humans.
Sperm evaluations in rats and mice are nearly always limited to the terminal sacrifice of the test animals because it is extremely difficult to collect semen samples from such small animals. Because investigators can collect whole semen samples from rabbits and domestic animals, however, it is possible to assess and follow progressive changes in semen in these animals over time. The potential advantages to conducting sperm assessments in rabbits include the ability to assess the same parameters (morphology, motility, sperm count) at successive points. Studies have shown that large decreases in semen parameters must occur before there are noticeable changes in fertility. Statistically significant, dose-related decreases in semen quality, however, could constitute sufficient evidence that an exposure causes reproductive effects in the test species.
If adequately designed studies detect changes in concentrations of gonadal steroid or gonadotropic pituitary hormones, these endocrine parameters do provide sufficient evidence of reproductive toxicity. Typically, adequate studies that show toxicity will have multiple samples obtained in a well-defined context that includes sex, age, reproductive state, day of cycle, and so on. Endocrine changes that indicate toxicity will include both multiple values outside the normal physiological ranges and physiologically plausible changes in direction in hormone concentrations.
Biochemical Markers of Reproductive Exposure and Effect
Various markers of exposure and effect have been investigated in male reproductive toxicology, including prostatein, androgens, and prolactin (NRC 1989). Sertoli cell enzymes or biochemical secretory
products, measured in vitro and in vivo as markers of cell function, are other examples of useful endpoints for studying target organ or cell responses. Currently, however, they cannot be considered evidence of male reproductive toxicity.
In Vitro Methods
There are methods for culturing various cells from the male reproductive system, such as pituitary cells, Sertoli cells, and germ cell-Sertoli cell cocultures. Although these investigations help elucidate mechanisms of action, they cannot by themselves generate sufficient evidence of reproductive toxicity.
Several endpoints listed in Table D-2 can provide evidence for female reproductive toxicity. For example, when a continuous-breeding study shows an adverse effect, it is desirable that the study also mate each member of a breeding pair to an untreated control to identify which member is affected by the agent. If a study has not taken this step, it cannot be said with certainty that the observed effect is the result of female reproductive toxicity; it can be equally likely that a male effect or a couple effect is involved.
Because most standard animal reproduction studies do not observe mating, they do not contain evaluations of an agent's effect on sexual behavior. If a study does report observations of mating, the failure of female rodents to assume a lordotic position and to accept mounting is evidence of abnormal sexual behavior. Additional signs include running from or fighting with the male (Uphouse and Williams 1989; Uphouse 1985).
Abnormal findings for estrous animals include persistent estrus, prolonged diestrus, or anestrus (May and Finch 1988). To characterize the estrous cycle in appropriate experimental animals, studies can use vaginal cytology or other cyclic signs in animals that menstruate,
including humans. These parameters can give information on whether cycling has discontinued or whether segments of the cycle are altered in length. Because estrous cycle length has a normal variation, it is also possible to evaluate changes in the distribution of cycle lengths. The interpretation of these data is, however, open to question. Vaginal cytology data can also be incorporated into such protocols as the continuous-breeding test, the subchronic study, and the two-generation reproduction study (Morrissey et al. 1988a,b; EPA 1998b; OECD 2000b; FDA 2000). Alterations in the distribution of estrous or menstrual cycle length alone have not been shown to be reliable predictors of reproductive toxicity. By themselves, these alterations would be insufficient to identify an agent as a reproductive toxicant.
Weight and Morphology Changes
A statistically significant decrement in ovarian or uterine weight in a study properly controlled for cyclic variation is worthy of consideration and should signal the need for additional studies. Similarly, an increase in uterine weight in an acyclic or castrate animal, or in a study that controls for cyclic variation, should raise concern about possible estrogenicity of the test agent and should suggest that additional studies are needed. Neither of these parameters, as an isolated endpoint, is sufficient to characterize an agent as a reproductive toxicant. Evaluation of the ovary often includes counts of follicles or subpopulations of follicles (Pederson and Peters 1968; Heindel 1999). A decrease in the number of ovarian follicles or a change in follicle subtype, however, is evidence of reproductive toxicity.
Secretion products of the uterus can be obtained with uterine lavage (Teng et al. 1986). Changes in uterine secretions could be useful for characterizing alterations associated with treatment. Because these changes can be cycle dependent, however, they can be difficult to interpret. To date, the characterization of normal changes in uterine secretory products is incomplete. Such changes alone are insufficient to characterize an agent as a reproductive toxicant.
Alterations in Age at Puberty or Reproductive Senescence
In animals with estrous cycles, the onset of puberty is marked by vaginal opening. Reproductive senescence may manifest as persistent vaginal estrus followed by anestrus. A change in the age at puberty or reproductive senescence is sufficient to characterize reproductive toxicity, although it is desirable to have supporting data that explain the mechanism of toxicity.
In estrous and menstrual animals, the reproductive cycle is characterized by the production of sex steroids from the ovary in response to pituitary gonadotropins, which are under hypothalamic control. It is possible to measure the relevant hormones, but evaluators must keep in mind that the hormones are produced in a pulsatile fashion, with cyclic variation in the amplitude and frequency of the pulses. For this reason, single static measures are unlikely to be informative unless a result is well outside the normal ranges (e.g., castrate concentrations of gonadotropins). Other strategies for evaluating endocrine parameters include serial measurements of hormones in blood at short intervals, and response of an endocrine measure to a stimulus. In the serial measurement strategy, frequent sampling permits the construction of a profile of the hormone change over time, which can disclose the pulse pattern. This method is difficult in animals with small blood volumes where frequent sampling may produce its own effects.
The second method, response of an endocrine measure to a stimulus, involves sampling an animal at a fixed time after administration of a releasing factor. One can, for example, measure luteinizing hormone after injecting gonadotropin-releasing hormone or measure progesterone after injecting chorionic gonadotropin (Hughes 1988). The disadvantage of this method is the possibility that the injection of the releasing agent will cause an atypical physiological situation, so that one cannot extrapolate the effect it “unmasks” to unmanipulated animals.
If changes in concentrations of gonadal steroid or gonadotropic pituitary hormones are detected in adequately designed studies, these endocrine parameters do provide sufficient evidence of reproductive
toxicity. Results that show multiple values outside the normal physiological ranges, changes in hormone concentrations in physiologically plausible directions, or failure of key hormonal events (such as luteinizing hormone surge, preovulatory estradiol rise, maintenance of luteal phase progesterone production) provide sufficient evidence of reproductive toxicity.
In Vitro and Perfusion Systems
Tissue culture methods have been used to study ovary slices in vitro, and cell culture methods have been used for studying granulosa cells and myometrial cells. In culturing ovary slices or granulosa cells, investigators often use the release of sex steroids into the medium as an outcome parameter. Under some conditions, granulosa cells will luteinize, producing a range of steroid and nonsteroid products; of these, progesterone is measured most commonly. Some studies, however, have measured other products, including nonsteroidal substances (Haney et al. 1984; Teaff et al. 1990). Some cell culture studies have made use of the contractile properties of myometrial cells for evaluating the potential of agents to alter uterine activity. In all of these test systems, the artificial nature of the in vitro setting can limit the predictive value of the results.
Ovaries perfused in vitro are useful systems for studying the mechanical aspects of ovulation. The preparations allow observations on the effects of agents in preventing rupture of the follicle and expulsion of the oocyte. The perfusion system is artificial, however, and the relocation of the ovary from peritoneal cavity to the perfusion chamber can alter the mechanical features of the system. For this reason, data from perfusion studies are not, in themselves, sufficient for drawing conclusions about an agent's reproductive toxicity.
Any change observed in an in vitro or organ perfusion system should be considered supplemental. Isolated findings of studies that use these systems are insufficient to characterize an agent as causing reproductive toxicity.
Changes in breast histopathology or in breast milk amount or
composition should signal the need for additional studies, and in particular, the need for studies that evaluate the effect of such changes on the nourishment and health of the offspring. The mere presence of xenobiotics in milk is not, by itself, evidence of toxicity; however, if a test agent is concentrated in milk, this should prompt recognition of the need for studies on the nursling. Conversely, if an agent is not transferred into the milk in rodent studies, but it is clear that exposure to critical organ systems continues in utero at the same developmental stages in humans, it may be appropriate to conduct direct dosing studies in rodents to determine any potential effects on the structural and functional development of these systems.