Read "Behavioral Measures of Neurotoxicity" at NAP.edu


Suggested Citation:"The Scope and Promise of Behavioral Toxicology." National Research Council. 1990. Behavioral Measures of Neurotoxicity. Washington, DC: The National Academies Press. doi: 10.17226/1352.



The Scope and Promise of Behavioral Toxicology

Bernard Weiss

Behavioral toxicology (BT) has almost ceased to be a term that arouses quizzical or bemused expressions in the more orthodox venues of its parent discipline. At the same time, its full scope and potential remain largely unappreciated and unexploited. The range of questions it can be used to ask and the unique perspectives it can provide on certain issues so far exceed what has been demanded of it. This chapter aims to illustrate or identify some of its unused capabilities, and to indicate those that need further development. Its special focus is how behavioral measures have expanded our previous views, based on traditional criteria, of what constitutes toxicity, and the nature of the new issues that this expanded perspective fosters. Foremost among these issues is how behavioral endpoints are to be treated in risk assessment.

Almost the entire risk assessment process is designed around cancer (National Research Council, 1983). What are called systemic toxicants, such as those acting on the nervous system, are evaluated by a wholly different set of principles. The difference stems from a presumed biological dichotomy. The induction of carcinogenesis is assumed to have no threshold. A single molecular event, such as a transcription error in DNA, can generate carcinogenesis and, it is assumed, can arise from the action of a single molecule of a carcinogenic agent. Systemic toxicants, in contrast, are presumed to exhibit thresholds, perhaps at the point at which they overwhelm compensatory mechanisms.

This doctrine of distinct biological modes of action finds expression in the current approaches to risk assessment. The first step in conventional risk assessment is hazard identification, which can be based on either epidemiological or experimental data. This first step is crucial because of the regulatory apparatus activated when a substance is classified as carcinogenic. The next step, dose-response assessment, almost always is based on high-dose animal or human data. These data are used in extrapolation modeling to compute predictions of cancer probability at low dosages. Because of the biological assumptions, only a zero dose of a carcinogen is assumed to add no increment of risk. Exposure assessment estimates the levels of the agent to which the target population is exposed. These exposure estimates, together with the dose-response model chosen for extrapolation, allow the risk of cancer to be calculated for that population in a step called risk characterization.

Systemic toxicants, such as those acting on nervous system tissue, are viewed from a totally different perspective. Instead of coupling a risk estimate with an exposure level, some arbitrary threshold is defined, then divided by a safety factor or uncertainty factor to yield an acceptable daily intake. Such thresholds are described as effect levels of various kinds (Klaassen et al., 1986). The no-observed-effect level (NOEL) refers to an exposure level producing no statistically significant increases in either frequency or severity of response in an exposed, compared to a control, sample. A lowest-observed-adverse-effect level (LOAEL) refers to the lowest exposure level producing statistically significant increases in the frequency or severity of adverse responses. Other effect levels are defined by similar standards.

The suitability of effect levels for risk assessment is now being questioned in many quarters and for many reasons. First, how are adverse effects defined?
Conventionally, they include any effects that impair function, result in lesions, or inhibit an organism's ability to respond to additional challenges, so that effect levels depend on the specific endpoint chosen as the critical one. Moreover, some critics contend that effect levels do not distinguish between reversible and irreversible effects, between immediate and delayed effects, and between agents that may be rapidly eliminated and those that remain in the body for extended periods. A second objection to the effect threshold concept is statistical. It makes use of only one point on the dose-consequence function rather than the entire function and essentially ignores the size of the experiment. Third, the uncertainty factors by which the NOEL, for example, is divided to provide a safety margin for population exposure, are also arbitrary and fail to make optimal use of experimental data. All of these objections have encouraged speculation that the highly developed cancer model for risk assessment might be adapted for systemic toxicants.

The standard risk assessment protocol derived from carcinogenesis might be modified for neurotoxicants, with extrapolation to the origin (zero dose, zero added risk) replaced by another function, such as a threshold model, if the standard protocol were adequate. Neurotoxicants, however, introduce a complication: the stage of risk characterization, instead of becoming a matter simply of finding the intersection of dose and risk probability, turns into a complex weighing of endpoints and their measures. The complications are especially difficult to resolve when the endpoints are behavioral in content. To grasp this point, it helps to begin with a review of the history of BT and some of its special properties.

ANTECEDENTS OF BEHAVIORAL TOXICOLOGY

Although BT arrived on the scene, at least in the United States (Weiss and Laties, 1975), less than two decades ago, it had a plethora of antecedents that serve to explain its unusual position in toxicology and why it is difficult to mold into the conventional risk assessment process.

Behavioral Pharmacology

With the introduction of the minor tranquilizing and antipsychotic drugs in the 1950s, and the demonstration that chemotherapy could be a legitimate option in the treatment of behavior disorders, an intense search began for new agents. It was accompanied by a swelling interest in the behavioral mechanisms underlying the clinical actions of these drugs. These two developments combined to establish a technology and a discipline hospitable to their goals. Behavioral pharmacology grew out of the extensive literature of experimental psychology, particularly that aspect of it called the experimental analysis of behavior and associated with applications of what is known as schedule-controlled or operant behavior (Iversen and Iversen, 1981).
Much of the early work in behavioral pharmacology was built on the power of coupling behavior and its consequences in prescribed ways known as schedules of reinforcement. By manipulating correlations between specified behaviors, such as lever presses by rats, and the subsequent delivery of food pellets, it was possible to generate patterns of behavior that proved differentially sensitive to various kinds of drugs and that also provided the basis for analyses of such differential sensitivity.

Translating this technology into one suitable for toxicology proved a fairly easy task because most of the questions posed to behavioral pharmacology were essentially questions in selective toxicity. Toxicology did not grapple, however, with its heritage from the central theme of behavioral pharmacology: toxicant-behavior interactions. It remained centered on whether a particular agent deserves to be labeled neurotoxic. Yet, if any single principle can be identified as the dominant product of behavioral pharmacology, it is that the nature of a behavioral response to a chemical challenge depends on the characteristics of the behavioral situation. At the same time and in the same organism, a drug might elicit one kind of response pattern, such as an increase in rate, under one schedule, whereas it elicited a decrease in rate under another schedule or schedule variant. On the basis of our experience with drugs, the question ought to be how to interpret the modifications produced by exposure. Simply specifying schedule-controlled behavior as one component of a screening battery, while ignoring the interaction, is unlikely to yield significant contributions to BT as a science and could evoke considerable confusion (see MacPhail, this volume) if the resulting data were to be used to calculate some version of a threshold.

Some confusion already exists because of the different aims of BT and behavioral pharmacology (Weiss, 1984). Central nervous system (CNS) drugs are administered therapeutically at doses great enough to influence behavior, so that behavioral pharmacologists study high doses either to try to detect active agents or to differentiate their behavioral effects. A BT study might entail these aims as well, but must always consider the importance of its findings for risk estimates, which implies action at low doses.
Wood and Cox (1986), for example, measured the response rates of rats exposed to toluene vapor and performing on a reinforcement schedule that maintained fairly stable rates under control conditions. They chose to study exposure concentrations considerably lower than those used in past investigations with rodents. They observed that toluene exposure, at levels close to those used in experimental studies with humans, and even approximating the threshold limit value, elevated rates above control levels. A typical strategy for dose selection, however, would have begun with rather high levels, would have observed rate decreases, would then have lowered the concentration to a point at which no rate decreases occurred, and would have missed toluene's rate-enhancing properties at low concentrations.

We do not have to rely on neurotoxicology alone to document the futility of high doses when the aim is risk estimation. The connection between lead exposure and hypertension emerged from an analysis of the Second National Health and Nutrition Examination Survey (NHANES II) data, which showed the steepest effects at low doses. Animal data had indicated such a relationship (Victery et al., 1982) but failed to attract attention because the results conflicted with our stereotyped expectations. What we find currently in most assessments of animal behavior is an arbitrary selection of experimental parameters combined with relatively high exposure levels. We seem to have appropriated a technology without an appreciation of what that transfer of technology requires to make it work.

Workplace Exposure Criteria

The first industrial hygiene legislation on record was prompted by the manifestations of mercury poisoning in miners who worked the famous mines at Idria. Among these manifestations are tremor and a collection of psychological complaints. Behavioral disturbances were also listed in descriptions of many other workplace toxicants by pioneers in occupational hygiene, such as Ramazzini in the eighteenth century and Hamilton in the twentieth century. Formal recognition was embodied in the Threshold Limit Values (TLVs) issued by the American Conference of Governmental Industrial Hygienists (ACGIH). Note its description of the short-term exposure limit (STEL): "the maximum concentration to which workers can be exposed for a period of up to 15 minutes continuously without suffering from . . . narcosis of sufficient degree to increase accident proneness, impair self-rescue, or materially reduce work efficiency . . ." (ACGIH, 1974). Such a definition of safety implies quantitative information about performance capacity that would have to be acquired under experimental conditions. Adequate information is sparse. Anger and Johnson (1985) estimate that about 25 percent of the workplace chemicals for which TLVs exist are neurotoxic, but the volume of pertinent toxicity data is far less than warranted by such a role.
The TLVs are supposed to protect against adverse effects during a working lifetime. For substances such as organic solvents, however, they have been based on a combination of impressionistic clinical data, some epidemiology, and observations of acute effects. A chronic syndrome has also been described, most extensively in the Scandinavian literature. It comprises signs such as a slowing of responses, memory difficulties, and personality disturbances. The validity of such a syndrome has aroused robust debate, but even critics acknowledge its confirmation in workers exposed to carbon disulfide (Grasso et al., 1984). Several of the chapters in this volume describe this syndrome and the research programs and techniques designed to extend the scientific basis for exposure standards. We must acknowledge the enormous contributions of these programs to shaping our views of how the criteria for workplace safety should be formulated. Yet the entire literature suffers from an inherent conflict between eagerness to apply these views and aptness of the techniques on which these applications depend. Convenience, ease of administration, standardization, and testing time are dominant concerns and, in response, the concerns are met by collecting a series of tests into a battery.

Sometimes, however, convenient technology can be misleading. For example, several investigations have relied on a device called the Optacon to assess somesthetic sensitivity. The device itself, and the psychophysical procedures governing testing, are wholly inadequate for such a purpose, as so cogently discussed by Maurissen (1988). Another example is the various reaction time measures included in many batteries. Again, because of unfamiliarity with the psychophysical literature, authors may fail to specify stimulus values, despite the body of knowledge indicating that stimulus intensity is inversely related to response latency. In fact, reaction time can be used to plot psychophysical functions (Stebbins, 1970), so that one could question whether some of the results with solvents, for example, represent altered "cognitive" function or sensory deficits.

The conflict between convenience and sensitivity obstructs the usefulness of many of these batteries for risk estimation. Is it legitimate to argue that, because they probably yield underestimates of impaired function, safety standards such as TLVs require the application of uncertainty factors based on groups showing deficits attributed to exposure? Or can the argument about sensitivity be used to question the validity of the findings, a tactic used in toxic tort cases?
Finally, if an arbitrarily safe exposure standard is the aim of such research, what model is to provide such a standard and what are the associated risks?

Standards in the Soviet Union

Discrepancies between the workplace and community exposure standards accepted in the West, and those prevailing in the Soviet Union, which tend to be much lower, have generated considerable speculation about their sources. One source surely was the image of virtue to be gained by USSR standards that seemed more rigorous than those adopted by capitalist nations. Other sources should be recognized as well, however.

The most important was the doctrine flowing from the history of Soviet science, and the overwhelming authority of I.P. Pavlov, that measures of CNS function should play a major, even dominant, role in assessing safety and prescribing exposure standards. Tissue damage occupied the corresponding role in the West and still remains dominant. Pavlenko (1975), discussing methods for toxic assessment of the CNS, notes Pavlov's assertion that ". . . the animal organism as a system is able to survive in its natural environment only if a dynamic equilibrium is maintained between this system and its environment. This is achieved, in higher animals, chiefly through the agency of the nervous system and by means of reflexes."

Although Western scientists tend to view Soviet data with some skepticism, in part because the standards of scientific publication seem to be looser in the USSR, the Soviet approach still managed to generate enough curiosity in the West to stimulate tests of its validity. One aspect of Soviet doctrine that still separates it from Western toxicology, however, is the principle that any deviation from baseline functional parameters due to toxic exposure must be interpreted as an adverse effect (Glass, 1975). Western scientists may interpret such effects as evidence of adaptation, much like the change in vital capacity produced by physical training. Simultaneously, however, Western toxicologists must at some time resolve which behavioral endpoints denote toxicity. Some of the more conventional practitioners insist that behavioral changes of a transient nature, unaccompanied by pathology, do not warrant the label of toxicity. Such a narrow definition is probably no longer tenable, but what are its limits?

Ozone occupied such an ambiguous niche not long ago. It was recognized as a potent lower-airway irritant and as a source of pathological changes in the lung at high doses.
Questions about its toxicity at low environmental concentrations have been answered satisfactorily only recently. Inhalation toxicologists now can document adverse pulmonary effects, at least as a consequence of chronic exposure, at levels permitted by current regulations. Yet consider the problem of how to interpret findings such as those published by Weiss et al. (1981), Tepper et al. (1982), and Tepper and Weiss (1986).

Weiss et al. (1981) trained rats to respond on a fixed-interval schedule of food reinforcement and measured response rates during 6-hour exposures to ozone. They observed a reduction in lever-pressing rate at concentrations of 0.5 ppm and above. To provide a contrast with this kind of sedentary behavior, Tepper et al. (1982) allowed rats access to running wheels during a 12-hour period and exposed them to ozone during the middle 6 hours. A concentration as low as 0.12 ppm, the level deemed by the regulations issued under the Clean Air Act as what might be considered a surrogate for an effect threshold, reduced running. This reduction was largely the product of lengthened pauses between bouts of running and can be interpreted, not as toxicity per se, but as behavioral adaptation. That is, reduced motor activity reduces minute volume which, in turn, reduces pulmonary uptake of ozone and, finally, the aversive consequences of running in an ozone-enriched environment.

[Figure 1: four panels (Rat 1 to Rat 4; N = 4) showing revolutions and lever presses as a function of ozone concentration (ppm).]

FIGURE 1 Results of an experiment in which rats pressed a lever, attached to the inner wall of a running wheel, to release a brake that locked the wheel. After preliminary training, the rats were required to make five lever presses (a fixed-ratio 5 schedule of reinforcement) to release the brake for a period of 15 seconds. Rat 1 remained on fixed-ratio 1. Access to the running wheel occurred during the last hour of a six-hour period of exposure to ozone. Three of the four rats showed a reduction in both lever presses and wheel revolutions at 0.08 ppm ozone; the Environmental Protection Agency standard is 0.12 ppm. SOURCE: Tepper and Weiss (1986).

The later paper (Tepper and Weiss, 1986) described an experiment in which rats pressed a lever, while in the running wheel, to release a brake and so secure an opportunity to run. A reduction in the frequency of this behavior, which can be interpreted even more directly as avoidance of the aversive consequences of exercise, occurred at an ozone concentration of 0.08 ppm (Figure 1). Are these behavioral data to be adopted as evidence of ozone toxicity at these low levels, as they would be in the USSR? Do they simply indicate that exercise is aversive at these concentrations?
This is one arena in which behavioral observations offer a unique problem for risk assessment and regulation: Can an exposure level that elicits avoidance of the consequences of exposure be considered an adverse functional or toxic effect (Weiss, 1989)?

Public Awareness

With the stirrings of the environmentalist movement in the United States, public concern began to shift from the grosser aspects of toxic damage to the more subtle ones, especially those arising as the consequence of low-level, prolonged exposure. Although cancer was featured, it was inevitable that the public would begin to ask questions about the coupling of environmental chemicals and "mental disease," for example. Fifteen years ago, when the Environmental Protection Agency (EPA) was preparing to respond to what later became the Toxic Substances Control Act, legislators in the United States were already drafting requirements that behavioral disturbances be included among the criteria of adverse effects. At present, several legislative initiatives and federal agencies define and regulate chemical exposures and include behavior among the aspects of toxicity to be considered in determining safety. The removal of lead from gasoline can be attributed to the mounting evidence showing an inverse relationship between intelligence test scores in children and indices of lead exposure.

The public, however, remains largely uneducated about such issues. Even the media still use terms such as lead poisoning to describe the impact of lead exposure on test scores. Terms such as lead poisoning imply that risk and the associated calculation of acceptable exposure standards can be defined, like cancer, by number of cases, but the lead issue has been defined by a different metric. The paper by Bellinger et al. (1987), whose results are supported by several other groups, offers a clear example. They compared scores, on the Bayley Scales of Infant Development, of three groups of children 24 months of age.
The groups were differentiated by lead concentrations in cord blood: low (a mean of 1.6 µg/dL); medium (a mean of 6.5 µg/dL); and high (a mean of 14.5 µg/dL). As observed in other chapters, a few years ago the high-lead group would have been regarded as a low-lead group. All the groups attained scores above average on the Bayley Scales, but the differences between the high-lead group and the other two groups came to about 8 percent. Given the above-average scores in the high-lead group, none of the children could be identified as cases, for example, of mental retardation. Also, 8 percent, although statistically significant, is a degree of variation that is often encountered on retesting. The clearest appreciation of this difference is found in its implications for the community.

Figure 2 compares two distributions of intelligence test scores. The upper chart depicts a distribution with a mean of 100 (the defined average) and a standard deviation of 15 (as found, say, on the Stanford-Binet). In a population of 100 million, 2.3 million will score above 130. The lower chart depicts a distribution with a mean of 95, or a reduction of 5 percent. In such a population of 100 million, only 990 thousand individuals will score above 130 (Weiss, 1988). The implications for a society are staggering. Yet, they are impossible to convey with the standard model of risk assessment, which counts cases.

[Figure 2: two distributions of intelligence test scores plotted against IQ score (50 to 150).]

FIGURE 2 Plot describing implications of a 5 percent shift in intelligence test scores. The upper curve depicts the distribution of intelligence test scores for instruments such as the Stanford-Binet and Wechsler Intelligence Scale for Children. Its mean is 100 and its standard deviation is 15. In a population of 100 million, 2.3 million individuals will score above 130, as shown by the stippled area in the upper tail of the distribution. If the distribution is shifted by 5 percent, or one-third of a standard deviation, to a mean of 95, only 990 thousand individuals will score above 130. Bellinger et al. (1987) observed a difference of 8 percent on the Mental Development Index of the Bayley Scales of Infant Development, between children whose cord bloods fell into a group with a mean lead concentration of 14.5 µg/dL and those in groups with lower means. SOURCE: Weiss (1988).

Toxic Torts

Litigation is exerting a significant impact on the acceptance of behavioral criteria of toxicity. Courts in the United States are now evaluating suits by workers claiming injury from exposure to organic solvents, metals, pesticides, and other neurotoxic substances. Such claims take the form of impaired intellectual performance, impaired sensory function, subjective complaints, and other indications of nervous system damage. These suits are now prompting segments of industry, which previously had tended to ignore behavioral assays in chemical development and workplace safety, to inaugurate programs responsive to these newer facets of toxicology. Moreover, legislation such as the Gaydos-Metzenbaum bill is prompting further review of the impact of subtle toxicity on worker health.

Behavioral toxicology will be forced more and more, as the legal system responds to these issues, to make explicit the sources of its conclusions and to defend them. Legal arguments have a way of challenging vagueness, possibly to the disadvantage of BT and some of its practitioners, because then we will have to offer statements about probability. When workplace exposure is at issue, how convincingly can we argue, for an individual, that a particular collection of signs and symptoms was a likely or unlikely outcome of a work history? What degree or proportion of responsibility can we allocate to the work environment and what proportion to other factors, such as the personal habits of the individual? The test batteries devised to assay the neurobehavioral consequences of workplace exposure to various substances, which are the substance of the scientific arguments advanced in toxic tort cases, convey a great deal of ambiguity when used to support decisions about individuals.
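
The tail counts quoted for Figure 2 in the Public Awareness discussion can be checked with a few lines of arithmetic. The following Python sketch is not from the chapter; it assumes that test scores follow an exact normal distribution with a standard deviation of 15 and that the population is 100 million, as the text states. The names `upper_tail`, `count_above`, `POPULATION`, `CUTOFF`, and `SD` are illustrative.

```python
import math

# Values taken from the text; the exact-normal model is an assumption.
POPULATION = 100_000_000
CUTOFF = 130
SD = 15


def upper_tail(threshold, mean, sd):
    """Return P(X > threshold) for X ~ Normal(mean, sd)."""
    z = (threshold - mean) / sd
    # For a standard normal variable, P(Z > z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))


def count_above(mean):
    """Expected number of individuals scoring above CUTOFF."""
    return POPULATION * upper_tail(CUTOFF, mean, SD)


baseline = count_above(100)  # distribution with the defined mean of 100
shifted = count_above(95)    # mean lowered by 5 percent

print(f"mean 100: {baseline:,.0f} above {CUTOFF}")
print(f"mean  95: {shifted:,.0f} above {CUTOFF}")
print(f"ratio shifted/baseline: {shifted / baseline:.2f}")
```

Under these assumptions the baseline count comes out near 2.3 million and the shifted count a little under 1 million, close to the 990 thousand cited; the small difference presumably reflects rounding in the published figure. A 5 percent shift in the mean cuts the number of individuals above 130 by more than half, even though no individual would be counted as a case.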
Perhaps the ambiguity of these test batteries is inevitable, given the problem of multiple chemical exposures in many work histories, which complicates attempts to extract characteristic test profiles, but an emphasis on differential exposure histories and even rudimentary dose-response analysis would yield more effective instruments in the end.

SPECIAL CHARACTERISTICS OF BEHAVIORAL TOXICOLOGY

One message to be extracted from this list of predecessors is that BT is still in search of an identity. We inherited certain techniques and viewpoints, but still have to synthesize them into a mature discipline.

To move toward such a synthesis requires not just refinements of borrowed technology, but a technology and viewpoint uniquely our own. Viewpoints determine technology, so we will have to examine those that are special to BT and imbue it with some of the properties that make it a unique challenge for risk assessment.

Basic Themes

A review of our history distinguishes two themes, which could be termed validation and amplification, that have emerged with the development of BT. The validation theme is embodied in the process of hazard identification, the first step in the conventional risk assessment process. At this stage, in contrast to cancer, emphasis falls on establishing the spectrum of toxicity associated with a particular agent, and the aim of research directed to such questions has been to develop adequate screening methods. Sensitivity is a secondary goal in these programs because extrapolation is only an implied, and not a direct, requirement or even role for such screens (e.g., Tilson et al., 1979).

A second type of validation embodies animal models of the kind discussed in this volume by Russell and by Overstreet. These models seek to mimic neurological diseases, such as parkinsonism and Alzheimer's disease, whose etiology currently is suspected by some to result from neurotoxic processes. Here, the validation theme takes the form of chemically induced lesions and behavioral endpoints that are assumed to be analogues of human function such as short-term memory.

The second theme, which I call amplification, addresses risk estimation directly and its ultimate goal of coupling exposure levels with risk incidence or severity. This goal would be essentially the next step, after hazard identification, in the risk assessment process. It may be undertaken either as an expansion of observations where humans served in the role of sentinels or as the successor to laboratory findings that have documented the existence of a hazard.
No entirely new substance has yet passed through the defined phases of the conventional risk assessment process. Our literature is based almost exclusively on agents already defined by human exposure. Hazard identification has been pursued mainly as a validation process based on recognized toxicants. Dose-response (and dose-effect) phases have typically been conducted, in the laboratory, as programs to establish validity by demonstrating such relationships. Efforts to provide a basis for dose extrapolation to humans remain minimal.

The calculation of risk based on neurobehavioral criteria is complicated by the variety of prototypical situations in which adverse effects might appear. Exposures may be either acute or chronic; consequences may be either reversible or irreversible, progressive or stable. Some effects may remain latent, only to emerge with time, perhaps in advanced age, when the reserve capacity of the nervous system has been depleted. The anesthetic properties of volatile organic solvents, for example, represent an acute reversible situation, but consistent exposures may lead to progressive deterioration that eventually becomes irreversible. Delayed irreversible effects are associated, for example, with MPTP exposure in adults and methylmercury exposure in the fetus.

Clinical and Behavioral Criteria

For all these categories, our past evaluations of adverse effects were based largely on clinical endpoints, still the main basis for estimates of the hazards of systemic toxicants acting on organs other than the central nervous system. The adequacy of clinical criteria for risk assessment is questionable.

Clinical criteria are especially flawed when neurotoxicity is expressed by a gradual, progressive erosion of functional capacity. Consider the reasons for trying to develop and refine psychological test procedures sensitive to the early manifestations of Alzheimer's disease. By the time a patient comes to the attention of clinicians, he or she has already progressed to a stage that has captured the concerns of family members. At that point, an accurate diagnosis is not an especially formidable challenge. Even though the currently available test procedures cannot comfortably differentiate victims of the disease from controls, except in group designs, they remain vastly superior to the clinical examination in defining the areas and extent of functional deficit. Their precision is certain to improve now that the vast amount of research on the psychological deficits of Alzheimer's disease is being embodied in potential diagnostic procedures.
One of the most cogent examples of the difference between clinical standards and psychological test or experimental design standards is surely lead toxicity. I discussed earlier the novel way in which the risks of lead exposure should be formulated, an example of the way in which the amplification process works. Cory-Slechta (this volume) traces the progressive lowering, over the past four decades, of the blood levels accepted as hazardous to children and detectable in animal models. Such a progression is now evolving with methylmercury, which currently is undergoing an amplification process. Although it has many features in common with how our views of lead toxicity evolved, it has distinct features of its own that make it an appealing model. One important feature is our extensive knowledge of methylmercury neuropathology. Another is our ability to trace exposure history by the analysis of methylmercury in hair. The third is its specific effects on special systems, such as vision. The fourth is the narrow focus of its toxicity: unlike lead, which exerts significant effects on hematopoiesis and blood pressure, methylmercury exerts only minimal effects beyond the nervous system. The fifth feature is the often prolonged latency to overtly detectable effects during or following exposure.

Methylmercury as Prototype

Most current concerns about methylmercury arise from its potency in the developing human. Minamata suggested, and Iraq confirmed, that the fetus and neonate are far more sensitive than the adult. Clarkson and his colleagues, in a series of analyses based on the Iraq disaster (e.g., Clarkson et al., 1981), now suggest that the fetus may be as much as ten times more vulnerable than the adult to methylmercury. Such calculations are based on the appearance of paresthesias in adults with total body burdens of 25 mg and of retarded motor development, of a type leading to a diagnosis of cerebral palsy, in children whose mothers accumulated a body burden of 2.5 mg. These are clinical criteria based on examinations conducted in rural Iraq, not on the kind of careful neuropsychological evaluation possible in major medical centers. It is provocative to consider what kind of results might have emerged from the application of what are considered to be more sensitive and specific tests currently found in neuropsychological testing centers, such as the Bayley Scales of Infant Development used by Bellinger et al. (1987). Even such instruments are crude compared to the tools described in the current literature of child development, although they offer the virtue of standardization. Given our experience with lead, we might predict that reliance on even these imperfect instruments could amplify sensitivity by a factor of four or five.
Research now in progress suggests that, in fact, the developmental neurotoxicity of methylmercury might have been underestimated, on the basis of clinical criteria, by almost such a factor. It might be equally provocative to imagine the conclusions that would have been fostered by the kinds of schemes now envisaged for identifying neurotoxicity and then for extending it to risk estimates. Gross neurotoxicity in developing animals would surely have been identified at high doses by observations of developmental disorders. In rats, a massive study designed to evaluate reproducibility of behavioral observations between laboratories, the Collaborative Behavioral Teratology

Study (CBTS), chose 6 mg/kg, administered on gestation days 6-9 or 12-16, as the high dose on the basis of a preliminary study (Buelke-Sam et al., 1985). Most of the six participating laboratories would have selected that dose, a total of 24 mg/kg, as the LOAEL, on the basis of indices such as maternal and offspring weight gain together with a variety of behavioral indices.

Unfortunately, the protocols included neither sensitive morphological indices nor measurements of methylmercury tissue levels, so that a direct comparison with neurotoxic health risk estimates based on human data is not feasible, but crude parallels can be constructed from knowledge of levels prevailing in fish. The Food and Drug Administration action level is 1 ppm. Most swordfish exceed this level. Shark, an increasingly popular seafood, has an even higher content than swordfish. Freshwater pike from the Adirondacks, because of acid rain, typically exceed 1 ppm of methylmercury as well. Assume that a pregnant female consumes seven fish meals, within a one-month period, of a species at the FDA action level. If each meal consists of 240 grams of fish, she will consume 1,680 grams of fish and, at 1 ppm (1 µg of methylmercury per gram), accumulate a body burden of 1,680 µg. For a body weight of 70 kg, this amounts to 24 µg/kg, or 1/1,000 of the rat-based LOAEL. Such a body burden is equivalent to what is now suggested to be the human LOAEL.

However, there is another provocative feature to methylmercury that might multiply our risk estimates even more. Spyker (1975) maintained mice, after prenatal treatment, for a lifetime. In mice that, until that time, had manifested no adverse effects, neurological disorders began to appear at about 15 months of age, and even in superficially healthy mice, behavioral testing revealed functional impairment. As the mice aged, they revealed more and more disorders.
These observations are a powerful argument for longitudinal studies, but an even more powerful argument for including such possibilities in risk assessment. Spencer (this volume) offers a compelling argument that earlier cycad exposure may trigger the eruption of the amyotrophic lateral sclerosis/parkinsonism-dementia (ALS-PD) syndrome even decades later.

Individual Differences

More than other areas of toxicology, BT is sensitive to individual differences. Other disciplines typically model results solely as means, or even, as in carcinogenesis, within a stochastic model that relates total exposure in a population to number of tumors irrespective of the distribution of exposure. Some of our sensitivity perhaps stems

from the historical junction of diagnosis with psychometrics; some of it may stem from our laboratory experiences with allegedly homogeneous groups of animals whose members all seem to exhibit unique experimental personalities, especially when we trace the development of a process such as learning. Despite our awareness (sometimes subliminal) of individual differences, most papers in BT, like most papers in toxicology, assume that subjects (animal and human) come from a uniform population and treat the data, as well as the design, accordingly. It is not the best approach to defining the characteristics of low-dose, chronic exposure. Physiologists are also now questioning the usefulness of group analyses without the data of individual subjects, a tradition exemplified by operant experimenters.

Even acute experiments may lead to wayward conclusions if individual differences are ignored. Ben F. Feingold was a pioneering pediatric allergist who formulated the hypothesis that some of the children labeled as hyperactive were actually responding to certain constituents of the diet (Feingold, 1975). He singled out synthetic colors and flavors, mostly because he doubted their nutritional value, but his hypothesis had roots in an extensive allergy literature, and he never held that all children with that label suffered from excessive sensitivity to additives.

I have reviewed the experimental data generated by the Feingold hypothesis on several occasions (e.g., Weiss, 1982, 1986a). Two generalities arise from those data. First, Feingold was correct in principle: some children respond adversely to colors and perhaps to other additives. Second, the prevailing view that the Feingold hypothesis has been disproved is mostly attributable to the naive statistics practiced by experimenters and reviewers alike. Some flaws stem from the assumption of a uniform population and are illustrated in Figure 3.
Assume a population composed of 70 percent nonresponders and 30 percent responders (A). Then assume a treatment that shifts the responders by one standard deviation (B). The distribution (C) shows the results for the sample as a whole; the difference in means is hardly visible.

The usual procedure for extrapolating animal data to human standards imposes a safety or uncertainty factor to compensate for wide individual differences in sensitivity. Although no one disputes that such differences exist, that recognition exercises little influence on experimental design and analysis. Even in the laboratory within a group of rats of the same age and strain, we see remarkable differences in the behavioral response to toxicants such as lead and have had to develop special statistical techniques to quantify these differences.

[Figure 3: three panels (A, B, C), each plotting a distribution against SCORE on the horizontal axis, 0 to 20.0; vertical axis ticks run from 0 to .40.]

FIGURE 3 Hypothetical distributions showing interpretive difficulties arising from studies of populations comprised of both responders and nonresponders. (A) Distribution of scores, before toxic challenge, in a population consisting of 70 percent nonresponders (taller distribution) and 30 percent responders (shorter distribution). (B) Distribution of scores, shown separately for responders and nonresponders, to a toxic challenge that displaces the responders by one standard deviation. (C) Combined distribution, shown by heavy line, of nonresponders and responders. The difference in means, enclosed by the vertical dotted lines, indicates that even a significant displacement of the responders alters the mean of the distribution only slightly under these circumstances; only with large samples could such an effect be detected consistently.
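The Figure 3 scenario can be put into numbers with a short sketch. What follows is an illustration under stated assumptions, not an analysis taken from the chapter: scores are treated as normally distributed with unit standard deviation in both subgroups, a treated group is compared with an untreated control group of equal size, and the sample size comes from the standard normal-approximation formula for a two-sided, two-sample comparison of means. The significance level (0.05), the power (0.80), and all function names are choices made here for the example.

```python
import math
from statistics import NormalDist


def mixture_mean_shift(responder_fraction: float, shift_sd: float) -> float:
    """Shift of the whole-sample mean (in SD units) when only responders move."""
    return responder_fraction * shift_sd


def mixture_sd(responder_fraction: float, shift_sd: float) -> float:
    """SD of the treated sample: unit within-subgroup variance plus the
    extra spread created by displacing one subgroup."""
    p = responder_fraction
    return math.sqrt(1.0 + p * (1.0 - p) * shift_sd ** 2)


def n_per_group(mean_diff: float, sd_control: float, sd_treated: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects per group for a two-sided, two-sample z-test.
    alpha and power are illustrative defaults, not values from the text."""
    z = NormalDist()
    z_sum = z.inv_cdf(1.0 - alpha / 2.0) + z.inv_cdf(power)
    variance_sum = sd_control ** 2 + sd_treated ** 2
    return math.ceil(z_sum ** 2 * variance_sum / mean_diff ** 2)


if __name__ == "__main__":
    p, shift = 0.30, 1.0  # 30 percent responders, displaced by one SD
    delta = mixture_mean_shift(p, shift)
    n_mixed = n_per_group(delta, 1.0, mixture_sd(p, shift))
    n_uniform = n_per_group(shift, 1.0, 1.0)  # every subject responds
    print(f"mean shift of mixed sample: {delta:.2f} SD")
    print(f"subjects per group, 30% responders: {n_mixed}")
    print(f"subjects per group, all responders: {n_uniform}")
```

Under these assumptions the one-standard-deviation displacement of the responders moves the sample mean by only 0.3 standard deviation, and detecting it reliably takes roughly ten times as many subjects per group as would be needed if every subject responded. The comparison is one way to read the caption's remark that only large samples could detect such an effect consistently.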

THE REMOTE FUTURE

Behavioral toxicology first emerged as an alternative to traditional markers of toxicity such as tissue damage and as a potential reservoir of more sensitive methods for measuring toxicity. Impelled, perhaps, by regulatory questions, it veered from the sensitivity issue toward the development of techniques for the detection of neurotoxicity. Much of it remains clasped in the identification phase of risk assessment, a status that tends to isolate it from new advances in behavioral science and neuroscience and that negates much of its early promise.

Risk assessment is viewed as the critical coupling of toxicological science and public policy. Behavioral toxicology surely has more to offer this process, and much more to extract from it, than a list of procedures. What other discipline is in the unique position of access to a technology for tracing a progression of toxicity from early, subtle effects to clear impairment? What other perspective on toxicology can integrate such a rich configuration of endpoints (Weiss, 1986b)? If BT abandons its early promise by confining itself to narrow questions of techniques for identification, it could rupture its close relationship with the major concerns of public health. The science will be the greater victim.

ACKNOWLEDGMENT

Preparation supported in part by grants ES01247, ES01248, and ES044929 from the National Institute of Environmental Health Sciences.

REFERENCES

American Conference of Governmental Industrial Hygienists. 1974. Documentation of the Threshold Limit for Substances in Workroom Air, third edition. Cincinnati, Ohio: ACGIH.
Anger, W. K., and B. L. Johnson. 1985. Chemicals affecting behavior. Pp. 51-148 in Neurotoxicity of Industrial and Commercial Chemicals, J. O'Donoghue, ed. Boca Raton, Fla.: CRC Press.
Bellinger, D., A. Leviton, C. Waternaux, H. Needleman, and M. Rabinowitz. 1987. Longitudinal analysis of prenatal and postnatal lead exposure and early cognitive development. New England Journal of Medicine 316:1037-1043.
Buelke-Sam, J., C. A. Kimmel, and J. Adams, eds. 1985. Design considerations in screening for behavioral teratogens: Results of the collaborative behavior teratology study. Neurobehavioral Toxicology and Teratology 7:537~73.
Clarkson, T. W., C. Cox, D. O. Marsh, G. J. Myers, S. K. Al-Tikriti, L. Amin-Zaki, and A. R. Dabbagh. 1981. Dose-response relationships for adult and prenatal exposures to methylmercury. Pp. 111-130 in Measurement of Risks, G. G. Berg and H. D. Maillie, eds. New York: Plenum.
Feingold, B. F. 1975. Why Your Child Is Hyperactive. New York: Random House.
Glass, R. I. 1975. A perspective on environmental health in the USSR. Archives of Environmental Health 30:391-395.
Grasso, P., M. Sharratt, D. M. Davies, and D. Irvine. 1984. Neurophysiological and psychological disorders and occupational exposure to organic solvents. Food and Cosmetic Toxicology 22:819-852.
Iversen, S. D., and L. L. Iversen. 1981. Behavioral Pharmacology, second edition. New York: Oxford University Press.
Klaassen, C. D., M. O. Amdur, and J. Doull, eds. 1986. Casarett and Doull's Toxicology, third edition. New York: Macmillan.
Maurissen, J. P. J. 1988. Quantitative sensory assessment in toxicology and occupational medicine: Applications, theory and critical appraisal. Toxicology Letters 43:321-343.
National Research Council. 1983. Risk Assessment in the Federal Government: Managing the Process. Washington, D.C.: National Academy Press.
Pavlenko, S. M. 1975. Methods for the study of the central nervous system in toxicological tests. Pp. 86-108 in Methods Used in the USSR for Establishing Biologically Safe Levels of Toxic Substances. Geneva: World Health Organization.
Spyker, J. M. 1975. Behavioral teratology and toxicology. Pp. 311-344 in Behavioral Toxicology, B. Weiss and V. G. Laties, eds. New York: Plenum.
Stebbins, W., ed. 1970. Animal Psychophysics. New York: Appleton-Century-Crofts.
Tepper, J. L., and B. Weiss. 1986. Determinants of behavioral response with ozone exposure. Journal of Applied Physiology 60:868-875.
Tepper, J. L., B. Weiss, and C. Cox. 1982. Microanalysis of ozone depression of motor activity. Toxicology and Applied Pharmacology 64:317-326.
Tilson, H. A., C. L. Mitchell, and P. A. Cabe. 1979. Screening for neurobehavioral toxicity: The need for and examples of validation of testing procedures. Neurobehavioral Toxicology 1(Suppl.):137-148.
Victory, W., A. J. Vander, J. M. Sherlock, P. Schoeps, and S. Julius. 1982. Lead, hypertension, and the renin-angiotensin system. Journal of Laboratory and Clinical Medicine 99:354-363.
Weiss, B. 1982. Food additives and environmental chemicals as sources of childhood behavior disorders. Journal of the American Academy of Child Psychiatry 21:144-152.
Weiss, B. 1984. Behavior as a measure of adverse response to environmental contaminants. Pp. 1-57 in Handbook of Psychopharmacology, Vol. 18, L. L. Iversen, S. D. Iversen, and S. H. Snyder, eds. New York: Plenum.
Weiss, B. 1986a. Food additives as a source of behavioral disturbances in children. Neurotoxicology 7:197-208.
Weiss, B. 1986b. Emerging challenges to behavioral toxicology. Pp. 1-20 in Neurobehavioral Toxicology, Z. Annau, ed. Baltimore: Johns Hopkins University Press.
Weiss, B. 1988. Neurobehavioral toxicity as a basis for risk assessment. Trends in Pharmacological Science 9:52-62.
Weiss, B. 1989. Behavior as an endpoint for inhaled toxicants. Pp. 492-512 in Concepts in Inhalation Toxicology, R. O. McClellan and R. F. Henderson, eds. New York: Hemisphere.
Weiss, B., and V. G. Laties, eds. 1975. Behavioral Toxicology. New York: Plenum.
Weiss, B., J. Ferin, W. Merigan, S. Stern, and C. Cox. 1981. Modification of rat operant behavior by ozone exposure. Toxicology and Applied Pharmacology 58:244-251.
Wood, R. W., and C. Cox. 1986. A repeated-measures approach to the detection of the minimal acute effects of toluene. Toxicologist 6:221.


