4
DATA EVALUATION
THE DOSSIER CONCEPT
The available information on each of the 100 substances was organized
in a working document or dossier with a standardized format, content, and
method of reporting. These dossiers were the focal point for the
committees' operating policies, all document control efforts, and all
evaluations of data adequacy. Each dossier was the unit of record for
all committee decisions and actions. Each dossier contained a synopsis
of the substance's physicochemical properties, manufacturing processes,
production and consumption volumes, chemical fate, intended and other
uses, and exposure potential; a summary of the toxicity data base; and a
statement of adequacy of the complete data base. As the evaluation of
the toxicity information progressed, additional documentation was added
to each dossier until, in addition to the synopsis, it contained:
· A summary of adequacy ratings for tests that were required for a
substance's intended uses according to the standards adopted by the
Committee on Toxicity Data Elements, as well as the tests required for
occupational and environmental exposure, indicating which tests had been
performed, the documents in which they were reported, and judgments of
adequacy of the test protocols.
· A summary of the amount and quality of all information in the
dossier for assessment of the substance's potential hazard to human
health.
· Each complete document (or identifying first pages or English
summaries thereof) dealing with the substance's human toxicity and
exposure and evaluations of the toxicologic information contained in
confidential files and, for each toxicity study:
-- An annotated comparison of the study's protocol with the
appropriate reference protocol guidelines (described later in this
chapter).
-- A cover sheet preceding each study and its protocol
comparison identifying the type of testing reported in the paper, the
committee's judgments of adequacy, and reasons for judgments.
· A data sheet detailing chemical and physical properties, chemical
reactivity in nonbiologic systems, bioavailability, analytic methods
available for detection, and known uses and exposure.
· A list of synonyms for the compound found in the Chemline, CIS,
Chemname, RTECS, and TDB automated data bases and/or the Merck Index,
Hawley's Condensed Chemical Dictionary, and the Cosmetic Ingredient
Dictionary.
· The names of manufacturers listed in the TSCA Inventory.
The major components of a dossier and their contents are presented in
Appendix L.
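The dossier contents listed above can be pictured as a simple record type. The following is a minimal sketch, not the committee's actual format: the class names, field names, and the example substance are hypothetical and chosen only to mirror the components described in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class StudyRecord:
    """One toxicity study filed in a dossier (field names are illustrative)."""
    document_id: str
    test_type: str                           # type of testing reported
    protocol_comparison: str = ""            # annotated comparison with reference protocol
    adequacy_judgment: Optional[str] = None  # committee's judgment, once made
    reasons: List[str] = field(default_factory=list)


@dataclass
class Dossier:
    """Unit of record for one substance (a hypothetical layout)."""
    substance: str
    synopsis: str = ""
    synonyms: List[str] = field(default_factory=list)
    manufacturers: List[str] = field(default_factory=list)
    data_sheet: Dict[str, str] = field(default_factory=dict)
    studies: List[StudyRecord] = field(default_factory=list)

    def tests_performed(self) -> Set[str]:
        """Test types for which at least one study is on file."""
        return {s.test_type for s in self.studies}

    def unjudged_studies(self) -> List[StudyRecord]:
        """Studies that still lack a cover-sheet adequacy judgment."""
        return [s for s in self.studies if s.adequacy_judgment is None]


# Usage with made-up entries
dossier = Dossier(substance="Substance X")
dossier.studies.append(StudyRecord("doc-1", "rodent micronucleus assay"))
dossier.studies.append(
    StudyRecord("doc-2", "rodent micronucleus assay", adequacy_judgment="adequate")
)
print(dossier.tests_performed())
print([s.document_id for s in dossier.unjudged_studies()])
```

Keeping each study's judgment and reasons next to its protocol comparison reflects the cover-sheet arrangement described above, in which every study carries its own record of adequacy.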
GENERAL PRINCIPLES FOR EVALUATION OF TOXICITY-TESTING PROTOCOLS
An ideal data base on the toxicity of a chemical would contain enough
information to permit the assessment of hazards and safety associated
with anticipated use and other exposure. Toxicity information obtained
from the experience of exposed humans usually is not available, and it is
common practice to use information obtained from tests on laboratory
animals. Deficiencies in a toxicity data base do not always invalidate
the use of the information to predict at least some human health effects,
but may reduce the certainty of a health-hazard estimate for that
substance.
The Committee on Toxicity Data Elements used three steps to develop a
suitable approach to the determination of toxicity-testing needs. First,
the committee reached an agreement on a strategy for judging the adequacy
of toxicity data. Second, it established guidelines for assessing the
quality of individual toxicity studies. Third, it created a
decision-making system to review and evaluate the total data base on the
toxicity of a substance. These three steps were used to determine the
extent of needed additional toxicity testing for the subsample of 100
substances. Results were used to estimate testing needs for the select
universe. Answers to three fundamental questions describe the adequacy
of the toxicity data base on a substance:
· What toxicity tests are needed for the substance?
· What tests have been performed and how well have they been done?
· Does the quality of the information permit assessment of the human
health hazard?
Although these three questions are fundamental to the overall procedure
for evaluating the adequacy of a data base, several additional, more
detailed or specific questions may be asked as each substance is examined:
· Is there at least a minimal amount of toxicity information on the
substance?
· Is there exposure information on the substance?
· Have all the tests identified as necessary been conducted?
· Has each required toxicity test been conducted in a manner
conforming to reference protocols or, if not, did its quality satisfy
basic criteria of scientific methods?
· If so, are the nature and quality of test data adequate for the
assessment of health hazard?
· What documentation supports the conclusion that available data are
of sufficient quality for a health-hazard assessment or that more tests
are required?
The committee developed a procedure for determining the adequacy of
available toxicity information on a substance (see Figure 3). First, a
substance was chosen from the select universe on the basis of the
availability of minimal toxicity data, as described earlier. The next
step was a search for pertinent information, as listed in Table 5,
followed by a determination of intended uses. Next, on the basis of the
category to which the substance belonged and the exposure settings,
specific tests required to define the toxicity of the substance for each
exposure setting were identified (see Appendixes B through G). After
establishing which tests were required, the committee examined the
available information to identify both the availability and the quality
of the required tests. To estimate quality, the report of each test was
compared with a set of reference protocol guidelines. Finally, the
information was judged to be sufficient to assess the health hazard, in
which case further testing would not be needed, or insufficient, in which
case further testing would be needed.
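The procedure just described can be sketched as a small decision function for one exposure setting. This is an illustration under stated assumptions, not the committee's algorithm: the names `StudyReport` and `needs_further_testing` are hypothetical, the order of the checks is assumed, and the final judgment is reduced to a mechanical rule, whereas the committee relied on expert judgment throughout.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class StudyReport:
    """Quality flags for one reported test (hypothetical fields)."""
    follows_reference_protocol: bool
    meets_basic_criteria: bool = False  # judged only if the protocol was not followed


def needs_further_testing(
    required_tests: Set[str],
    reports: Dict[str, StudyReport],
    precluding_factors: bool = False,
) -> bool:
    """Return True if, under this simplified rule, more testing is needed.

    Assumption: every required test must be on file and must either follow
    a reference protocol or satisfy the basic criteria of scientific methods.
    """
    if precluding_factors:
        return True
    for test in required_tests:
        report = reports.get(test)
        if report is None:
            return True  # required test was never performed
        if not (report.follows_reference_protocol or report.meets_basic_criteria):
            return True  # performed, but of inadequate quality
    return False


# Usage with made-up test names
required = {"acute", "subchronic"}
on_file = {
    "acute": StudyReport(follows_reference_protocol=True),
    "subchronic": StudyReport(False, meets_basic_criteria=True),
}
print(needs_further_testing(required, on_file))
```

In practice the function would be applied once per exposure setting (intended use, occupational, environmental), since the set of required tests differs by setting.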
The committee not only used data from laboratory studies for hazard
assessment, but also examined any epidemiologic studies and information
on the extent of exposure to a substance. The committee felt not only
that the results of animal experiments may provide guidance for planning
epidemiologic investigation, but, more importantly, that animal data can
be most valuable when epidemiologic evidence is weak, nonspecific, or
relatively insensitive. Conversely, good epidemiologic data minimize the
need for animal data.
CONSIDERATION OF EXPOSURE
Three exposure situations largely determine the type of potential
hazard and hence the spectrum of data appropriate to evaluate a hazard:
exposure via intended use, occupational exposure, and ambient
environmental exposure. For example, food additives are meant to be
ingested, cosmetics are applied to the skin, and drugs are administered
in several forms by several appropriate routes. Humans can also be
exposed to food additives, cosmetics, and drugs unintentionally during
their manufacture and purification; during packaging, transportation, and
storage before their intended use; and during disposal of residues and
wastes. There are few intentional exposures of people to most pesticides
[Figure 3 flowchart. Box labels: Select a substance for evaluation.
Committee determines whether all the tests identified as necessary for
the following use or exposure situations have been done: exposure by
intended use, occupational exposure, environmental exposure. Did the
tests that were done follow reference protocols? Are there factors that
preclude a health-hazard assessment? Is the information sufficient to
allow a health-hazard assessment? No further testing needed: document
and evaluate adequacy of information for specific use or exposure
situations and types of tests. Further testing needed: document and
evaluate specific inadequacies of information for specific use or
exposure situations and types of tests.]
FIGURE 3 Outline of procedure for decision-making in evaluating adequacy of
toxicity information on specific substance
TABLE 5 Information Sought in Exhaustive Literature Search for Each Substance
in Subsample of 100

Information Category   Information

Chemistry              Synonyms, trade names, structural formula,
                       molecular formula, CAS Registry number, purity,
                       identification and quantity of contaminants,
                       melting and boiling points, specific gravity, vapor
                       pressure, particle size, water solubility,
                       solubility in organic solvent, complexity of the
                       chemical species, partition coefficient, pH,
                       dissociation constant, shelf-life, stability,
                       potential for undergoing oxidation and reduction,
                       potential for undergoing hydrolysis under various
                       pH conditions, photolytic reactivity, absorptivity,
                       desorptivity

Process                Synthetic pathways (chemical origin, starting
                       materials, stage of appearance in pathways, final
                       product in pathways)

Production             Companies that produce substance, sites of
                       production, quantity volume (per site total);
                       percent imported, volume trend

Use                    Percent produced for commercial use and for
                       consumer uses, percent degraded, number and kinds
                       of uses, unintentional release (during storage,
                       transport, disposal, packaging, manufacture,
                       industrial use)

Chemical fate          Demographic and geographic distribution,
                       environmental pathway, environmental stability,
                       turnover (half-life), degradation, persistence,
                       partition (in soil, water, air), bioaccumulation,
                       environmental transport, environmental
                       bioavailability

Human exposure         Routes, form, mode (occupational, consumer, etc.),
                       number exposed, frequency of exposure, extent of
                       contact (each episode, total), dose and duration of
                       dose (each episode, total), rate of absorption

Toxicity               Summary of all available toxicity information (see
                       Appendixes B through G)
and many other chemicals in commerce, but exposures do occur during
production, distribution, use, and disposal. The term "environmental
exposure" is used to include all potential human exposures other than
those related to the workplace or inherent in the intended use.
The tests that the committee selected to support health-hazard
assessments for substances in various classes of use are listed in
Appendixes B through G. Batteries of required tests from among the 33
test types listed are identified for direct and indirect food additives
(including colors), drugs and excipients in drug formulations (oral,
parenteral, dermal, inhalation, ophthalmic, vaginal-rectal,
over-the-counter, and veterinary), pesticides and inert ingredients of
pesticide formulations, cosmetic ingredients, and other chemicals in
commerce. To the extent feasible, the committee selected tests with
routes of exposure similar to routes of exposure of humans under various
circumstances.
The Committee on Toxicity Data Elements recognizes that duration of
exposure, as well as route, is intrinsically important in the
manifestation and intensity of toxicity in test species and in the
prediction of hazards to humans. It therefore incorporated duration of
exposure--acute, subchronic, and chronic--into its selection of toxicity
tests for predicting hazard. For example, if a substance is believed to
be present consistently in common foods and lifetime exposure of humans
is highly likely, data from chronic-feeding studies are appropriate for
the substance. Similarly, if a substance is likely to be in the
environment of women of child-bearing age, laboratory studies that
investigate possible reproductive/developmental injury are appropriate
for assessing hazards to humans.
During the construction of the dossiers, the quality of a given
toxicity-testing protocol was evaluated without regard for the different
potential uses and different exposure settings of the substance. These
two factors were taken into consideration during later judgments as to
toxicity-testing needs for each substance. In the test summaries,
different measures of quality might have been used for different exposure
settings (intended-use, occupational, and environmental) because the
adequacy of a protocol might vary with the setting (e.g., a protocol
might be considered adequate for low-level environmental exposure, but
inadequate for high-level occupational exposure).
PURITY OF SELECTED SUBSTANCES
Chemical purity is a nonquantifiable variable that must be considered
in each evaluation, and some impurities might have toxicity very
different from that of the selected substance. There are three reasons
for such variability: (1) the names of some substances were not
clearly stated in the lists or by investigators studying them; (2)
impurities might vary in composition or concentration with different
methods of production or from lot to lot; and (3) some of the substances
selected (e.g., vegetable oils) may contain other compounds (e.g.,
pesticide residues). Although this variability impedes attempts to
attain consistency in judgments of adequacy, it would affect any other
judgments of toxicity equally and might be useful to the extent that it
reflects exposure of humans to similarly contaminated or undefined
substances.
The committee also recognized that exposure is often to mixtures of
substances, rather than to single chemical entities. Mixtures have the
potential for synergistic interactions that potentiate or antagonize the
toxic effects of individual components. Whether special studies of
toxic interactions are necessary for adequate evaluation of health
hazards to humans is a matter of scientific judgment.
In this report, the terms "chemical" and "substance" refer to any
item that appears on any of the lists that constitute the select
universe, although many of these items are not single chemical
compounds. Undefined substances drawn from the select universe
presented problems early in the review of the subsample of 100. Some of
them were chemically so undefined (e.g., "solvent dewaxed, light
paraffinic petroleum distillates") or were so variable (e.g., "zeolites
containing calcium, iron, magnesium, or vanadium") that they could not
be evaluated according to the established procedure. The statistical
analyses and estimates reflect this procedure and inferences from the
subsample apply strictly only to better-defined substances. This
limitation does not apply to inferences from the sample.
GUIDELINES FOR ASSESSING THE QUALITY OF
INDIVIDUAL STUDIES
BASIC CRITERIA FOR SCIENTIFIC METHODS
The Committee on Toxicity Data Elements believes that it is not
appropriate to judge the adequacy of past and future studies solely by
matching them against protocols that are considered acceptable today.
The committee suggests that a study be considered adequate for use in a
health-hazard assessment if it meets the following basic criteria:
· All elements of exposure are clearly described, including
characteristics of the substance's purity and stability, and dose,
route, and duration of administration.
· Results in test subjects are predictive of human responses and
test subjects are sensitive to the effects of the substance. In
toxicity tests of a substance involving several species, data obtained
with the most sensitive species are often used for making health-hazard
estimates. This is often a conservative approach. When metabolic
activation is necessary to produce toxicity and there is evidence that
the metabolic pathway in the most sensitive species is different from
that in man or the target species, results in a species with metabolic
pathways similar to those of man should be given particular
consideration.
· Controls are comparable with test subjects in all respects except
the treatment variable. Depending on the study, appropriate controls
may be positive, negative, or historical. Historical controls, however,
rarely meet this criterion.
· End points answer the specific question addressed in the study
and observed effects are sufficient in number or degree to establish a
dose-response relationship that can be used in estimating the hazard to
the target species.
· Due consideration in both the design and the interpretation of
studies must be given for appropriate statistical analysis of the data.
Although these criteria do not capture all potentially critical
aspects of scientific judgment, the available data on a given substance
may be considered of adequate quality if tests have been performed and
reported according to these basic scientific principles. Several
additional factors, although not often critical in deciding whether a
given test is adequate, are highly desirable and should be taken into
account:
· Subjective elements in scoring should be minimized. Quantitative
grading of an effect should be used whenever possible. Sometimes, this
is not feasible, as when pathologists attempt to judge the nature and
extent of a malignant neoplasm. Such evaluations depend on the
experience and training of the pathologists.
· Peer review of scientific papers and of reports is desirable and
increases confidence in the adequacy of the work.
· Reported results have increased credibility if they are supported
by findings in other investigations.
· Similarity of results to those of tests conducted on structurally
related compounds increases credibility.
· Evidence of adherence to good laboratory practices improves
confidence in the results.
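The distinction drawn above between the basic criteria, which determine adequacy, and the additional desirable factors, which raise confidence, can be sketched as a checklist. This is only an illustration: the identifiers below are hypothetical shorthand for the items in the two lists, and counting the desirable factors is an assumption made for the example, not a scoring scheme used by the committee.

```python
from typing import Iterable, List, Tuple

# Shorthand labels for the five basic criteria (hypothetical names)
BASIC_CRITERIA = (
    "exposure_described",
    "subjects_predictive_and_sensitive",
    "controls_comparable",
    "end_points_support_dose_response",
    "statistics_considered",
)

# Shorthand labels for the additional desirable factors (hypothetical names)
DESIRABLE_FACTORS = (
    "objective_scoring",
    "peer_reviewed",
    "corroborated_by_other_studies",
    "consistent_with_related_compounds",
    "good_laboratory_practices",
)


def assess_study(satisfied: Iterable[str]) -> Tuple[bool, List[str], int]:
    """Return (adequate, missing basic criteria, number of desirable factors met)."""
    satisfied = set(satisfied)
    unknown = satisfied - set(BASIC_CRITERIA) - set(DESIRABLE_FACTORS)
    if unknown:
        raise ValueError(f"unrecognized items: {sorted(unknown)}")
    missing = [c for c in BASIC_CRITERIA if c not in satisfied]
    supporting = sum(1 for f in DESIRABLE_FACTORS if f in satisfied)
    return (not missing, missing, supporting)


# Usage: a study that meets every basic criterion and was peer reviewed
adequate, missing, supporting = assess_study(BASIC_CRITERIA + ("peer_reviewed",))
print(adequate, missing, supporting)  # prints: True [] 1
```

Returning the list of unmet criteria alongside the verdict mirrors the requirement that reasons for each judgment be documented.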
SELECTION OF REFERENCE PROTOCOLS
The quality of individual toxicity tests may be assessed by
answering the question: Does the quality of the information permit a
health-hazard assessment that is acceptable? In recognition of the need
for accepted and reproducible standards, the committee chose as its
first step in the qualitative evaluation of toxicity data on a given
substance a comparison of the study with a reference protocol. Because
a requirement for inclusion of the substances in the subsample was the
existence of minimal toxicity information, there were no selected
substances without some information for the assessment of the quality of
testing protocols. However, for each substance in the subsample, some
toxicity information was missing or some data were derived from studies
that did not meet the reference protocol guidelines. A comparison of
available tests with reference protocols, combined with the judgment of
the committee relative to the basic criteria of scientific methods,
enabled the categorization of substances with respect to the quality of
toxicity-testing protocols.
In selecting reference protocols for judging the quality of
individual studies, the committee used various resource documents on
short-term and long-term toxicity testing, with emphasis on those
constructed through national and international collaborative efforts.
The committee identified the reference protocols of the Organisation for
Economic Co-operation and Development (1979, 1981), the Interagency
Regulatory Liaison Group (1981a, 1981b, 1981c, 1981d, 1981e), and the
National Research Council (1975, 1977a, 1977b, 1980) as the most
appropriate in this regard (see Appendix H). It should be understood
that it was not the committee's intent to endorse any particular test
protocol. Rather, on a pragmatic basis, particular tests were selected
as appropriate for judging the adequacy of testing of chemicals.
Although over-rigid protocols are impractical, the reference protocols
provide descriptions of standard test methods with sufficient detail to
establish a basis for sound study design while permitting flexibility
where scientific judgment was advantageous. The committee used the most
current documents, sometimes with changes or additions based on its own
judgment, as presented in Appendixes I through K. The committee
believes that these modifications and additions will be useful for
future development of a data base for health-hazard assessment. A
published document describing each modified test system is cited in
Appendix H.
Not every toxicologist might agree on every detail in the
guidelines, but only reference protocols widely reviewed and generally
accepted were used in this study. The list is not intended to reflect
the attitudes or practices of regulatory agencies.
Because some toxicity reports did not contain terminology directly
compatible with the specifications of the reference protocol guidelines,
it was often necessary to make judgments on whether the study adequately
followed the guidelines. In general, these judgments were relatively
easy to make and engendered little or no controversy within the
committee.
Reference Protocol Guidelines for Neurobehavioral-Toxicity Tests
The committee recognized that the neurotoxicity-testing protocols
developed by the OECD (Organisation for Economic Co-operation and
Development, 1981) are appropriate only for evaluating the neurotoxicity
of organophosphorus compounds. These protocols cannot be used to
evaluate mammalian neurotoxicity for other substances, nor are they
appropriate for studying functional behavioral changes produced by
substances other than organophosphates for which no specific neural
lesion has been identified. The OECD expert group on neurotoxicity also
recognized this matter and, at its meeting in April 1982, took two
actions: it changed the titles and scopes of the neurotoxicity tests
proposed in the OECD guidelines to reflect their applicability only to
organophosphorus compounds, and it recommended the development of
guidelines for more general neurotoxicity testing. There was a
consensus in the OECD group that neurotoxicity testing should include
behavioral assessment outside the laboratory holding
facility and neuropathologic examination of various neural tissues after
in situ perfusion. The Committee on Toxicity Data Elements agrees.
For delayed-neurotoxicity tests of organophosphorus compounds, the
committee used previously established reference protocols. For other
classes of compounds, a detailed protocol for neurobehavioral-toxicity
testing has not been completed and approved by OECD. Therefore, the
committee adopted for its own interim use an alternative set of protocol
guidelines that have attained some degree of general acceptance in the
scientific community (Appendix I).
Reference Protocol Guidelines for Genetic-Toxicity Tests
After the start of this study, the OECD drafted guidelines for 10
genetic-toxicity tests. These were later adopted by the committee. The
10 tests were the Ames Salmonella/liver microsome reverse-mutation
assay, Escherichia coli reverse-mutation assay, rodent micronucleus
assay, in vitro chromosomal-aberration assay in mammalian cells,
sex-linked recessive-lethal assay in Drosophila melanogaster, forward
gene-mutation assay in mouse lymphoma L5178Y (TK+/-) cells, forward
gene-mutation assay in Chinese hamster ovary (HGPRT) cells, forward
gene-mutation assay in Chinese hamster V79 (HGPRT) cells, in vivo
chromosomal-aberration analysis in rodent bone marrow, and rodent
dominant-lethal assay. The committee also adopted a policy of judging
the testing protocol of each genetic-toxicity study for its adequacy and
then of judging the overall adequacy of all genetic-toxicity protocols
for a given substance according to the requirements described in
Appendix J.
PROCEDURES FOR EVALUATION OF THE DATA BASE
INITIAL CONSIDERATIONS
Existing information was evaluated against two sets of criteria to
judge its quality and completeness. The first set was a series of
reference protocol guidelines that have received widespread review and
general acceptance. This array of protocols was selected not as the
most reliable and efficient group of tests, but rather, by convention,
as the best available for chemical-safety assessments. The second set
of criteria was based on the accumulated experience and expertise of
committee members, whose combined judgment was used to determine the
adequacy of an individual study if it did not meet the reference
protocol guidelines.
The second set of criteria was established by the committee in the
expectation that the data bases of only a few substances would meet all
the requirements of the reference protocol guidelines, partly because
much toxicity information was generated before the guidelines were
developed. The committee expected that sufficient data might often be
available for evaluation, even though some toxicity information would be
missing and some data would be derived from experimental designs other
than those prescribed in the reference protocols. Therefore, the
committee intended that its determination of the adequacy of
toxicity-testing data for conducting a health-hazard assessment would be
based sometimes on information derived from experiments that followed
the reference protocols and sometimes on other information that met the
committee's own subjective criteria for evaluating scientific methods.
Using this combination, the committee assessed the adequacy of the
toxicity-testing protocols for all chemicals in the sample.
The committee felt that the evaluation of toxicity data bases to
predict hazard to human health must be approached with caution and
flexibility. In general, data from properly conducted animal studies
are often predictive of the degree of hazard to humans; however, for
individual substances, such laboratory investigations may be misleading
with regard to target organ, potency, or type of effect. Thus, expert
judgment to ensure the proper use of all available data is an essential
part of each analysis. For example, the metabolism of a toxicant may
differ between test species and humans in ways that produce
false-negative or false-positive results with regard to possible human
hazard. The appropriate test battery may be incompletely performed, but
there may be other data, such as extensive information on the mechanisms
of action in several species, to obviate a need for additional tests.
And data from human studies, both epidemiologic and clinical, may be
essential in deciding whether to conduct a test on a substance merely
for the purpose of completing the recommended battery of tests for that
substance. For example, there may already have been human studies and
exposure of sufficient breadth and sensitivity to reduce the need for
toxicity studies in laboratory animals, or clinical studies may have
[...]
time. Once the data were obtained, the method and depth of their
evaluation by the committee and the consistency of judgment had to be
determined. The judgments made by the committee were coded and recorded
in preparation for subsample analysis and extrapolation to the select
universe.
The committee recognized that the quality of its comprehensive
literature search and its detailed evaluations of the data bases would be
the most important determinants in estimating testing needs in the select
universe. Recognizing that this was a large and important task, the
committee established five working panels, each with a designated leader
and two or three other committee members. Assignments were not
considered effectively random, because several substances were selected
by panel leaders who were familiar with the substances' toxicity data
bases. Remaining substances in the subsample were assigned in rotation
among the five panels (20 each), including substances from each of the
seven categories. Each group then had responsibility for reviewing the
data bases. At a series of planning meetings, the panel leaders
collectively established standard procedures for data review and
evaluation and developed practices to ensure consistency in
decision-making.
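The assignment scheme described above, in which panel leaders first selected some substances and the remainder were assigned in rotation until each of the five panels held 20, can be sketched as follows. The function name, the substance labels, and the rule of skipping a panel once it is full are assumptions made for illustration; the sketch also ignores the balancing across the seven categories mentioned in the text.

```python
from itertools import cycle
from typing import Dict, List


def assign_to_panels(
    substances: List[str],
    preselected: Dict[str, int],
    n_panels: int = 5,
    capacity: int = 20,
) -> Dict[int, List[str]]:
    """Assign substances to panels: leader selections first, then rotation."""
    panels: Dict[int, List[str]] = {p: [] for p in range(n_panels)}

    # Substances chosen by panel leaders familiar with their data bases
    for substance, panel in preselected.items():
        panels[panel].append(substance)

    # Remaining substances go to the panels in rotation, skipping full panels
    rotation = cycle(range(n_panels))
    for substance in substances:
        if substance in preselected:
            continue
        for _ in range(n_panels):
            panel = next(rotation)
            if len(panels[panel]) < capacity:
                panels[panel].append(substance)
                break
        else:
            raise ValueError("all panels are full")
    return panels


# Usage with made-up labels: 100 substances, three chosen by leaders
subsample = [f"S{i}" for i in range(100)]
panels = assign_to_panels(subsample, {"S0": 0, "S1": 0, "S2": 3})
print({panel: len(members) for panel, members in panels.items()})
```

Because the leader selections are placed first and full panels are skipped, every panel ends with the same workload regardless of how many substances were chosen up front.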
The review process was time-consuming. The committee recognized
early in the second year that it could not carry out the entire review
itself on a volunteer basis. Therefore, to expedite the process, the
initial phases of the review were carried out by NRC staff and
consultants. It is estimated that these initial phases required about
1.5 scientific person-years of effort.
The procedure for the review of the data base on a substance
consisted of the following steps:
· Each document was individually reviewed and compared with the
appropriate reference protocol.
· A summary sheet was prepared for each document, outlining the
pertinent details of the protocol, assigning a preliminary ranking for
the quality of the study and reasons for this judgment, stating the
nature of the document (abstract, review, etc.), and stating which of the
prescribed 33 test types was (were) reported in the document.
· In many cases, the quality of individual study protocols was
determined by individual panel members who had applicable expertise. All
reviewers were required to document their findings and to provide reasons
for them based on the criteria established by the committee. Such
documentation was especially important when a reviewer had intimate
knowledge of a substance.
The dossier prepared by NRC staff and consultants, including these
judgments on individual reports, was reviewed by the appropriate panel
leader, who then presented it to the panel members for review and
modifications deemed necessary. Twelve panel-approved dossiers were
discussed by the entire committee to ensure that there was
concurrence in the approaches used. The other 88, after review by a
panel's leader and members, were reviewed by a subcommittee of at least
five designated committee members. Important issues concerning any
dossier were placed before the entire committee. Otherwise, judgments of
the subcommittee concerning review of dossiers from the panels were
regarded as final. The above process ensured that each decision with
regard to the quality of every study was reviewed at three levels: by a
panel chairman, the panel's other members, and either the entire
committee or its designated review subcommittee. The relevant data base
was present or easily accessible at each step of this multistage review
process; that allowed the panel leader, the members of each panel, the
subcommittee, or the committee to conduct an independent review of the
original material when any person deemed it necessary.
The committee recognized the need to maintain uniformity and to
ensure quality in the review of documents. Standardized procedures for
documenting decisions regarding data adequacy provided quality control
for decision-making. Variations in the consistency of decisions were
reduced first by judging a study's adequacy against the uniformly applied
set of reference protocol guidelines. These standards were used for
studies of the same test type across all substances and by all persons
making the judgments. In effect, all reviewers were making measurements
with the same yardstick. Deviations from the guidelines were then noted
according to the scheme presented in Table 6, so that one person's
reasons for judgment on a chemical could be examined by others making
similar judgments on other chemicals. To ensure consistency, the five
panel leaders often compared their reasons for judgments on the quality
of protocols.
Because committee members often had to exercise scientific judgment
when information was inconclusive, it was necessary to provide mechanisms
to document their judgment and to ensure that they remained consistent
and that testing protocols and other information were always judged as
uniformly as possible. The system of multilevel review described above
was designed to reduce the errors and differences involved in the
committee's use of scientific judgment.
The process led to decisions of whether further testing of a given
substance was needed. Before such decisions were reached, the committee
considered the types of exposures to substances likely to be encountered,
their chemical and physical characteristics, their manufacturing
processes, their production volumes, their uses, their chemical fates,
their toxicity in animals, and their potential or known toxicity in
humans. The committee's detailed evaluation of the data base for each
substance in the subsample included determination of the adequacy of each
required toxicity test specified in Appendixes B through G. The
completed dossiers collectively were used as the committee's record to
characterize the subsample.
The committee analyzed the decisions about the quality, quantity, and
extent of the subsample's toxicity data base to assess the toxicity-
testing needs related to the larger select universe. This extrapolation
was a joint effort of the Committee on Toxicity Data Elements and the
Committee on Sampling Strategies. The tabulations and interpretations of
the evaluated data bases were used as a bridge for applying statistical
inferences derived from the subsample to the select universe from which
it was drawn.
LIMITATIONS OF THE DATA GATHERING PROCESS
The approach developed to collect data on each substance included
searches of the open literature through automated, on-line data files,
such secondary sources as reference manuals and textbooks, government
technical reports, files of the regulatory agencies (where available),
and files provided by some chemical manufacturers and trade
associations. The data obtained from the searches of the primary and
secondary open literature accounted for the bulk of the information in
the dossiers. Search strategies were carefully developed to ensure the
most efficient screening of the selected data bases. However, some of
the data bases failed to include the most recent research.
The degree of accessibility of government agency files to the
committee varied. Some information was obtained from the regulatory
agencies, and several research reports from military sources yielded
useful information, especially on exposure of humans. At times,
confidential data were made available to selected NRC staff members or to
specific committee members. However, nonconfidential health and safety
data embedded in commercial confidential files possessed by the FDA
Bureau of Drugs were unavailable (see Appendix M for further detail).
Responses from manufacturers and trade associations were also mixed.
A few manufacturers were extremely cooperative in providing information
that supplemented the open literature; however, the total amount of
information from this source was relatively small. The committee
believes that some relevant but unreviewed toxicity information,
especially of a confidential nature, exists in the files of
manufacturers.
Evaluations of the 100 substances in the subsample therefore were
based largely, but not exclusively, on published data or other publicly
available information, which may be somewhat short of the amount and
diversity of data contained in the confidential files. The absence of
specific information from the dossiers reflected both the inaccessibility
of some data bases and the lack of relevant testing. Data not available
for the committee's confidential review are presumably not available for
legitimate review by other interested parties; hence, in an operational
sense, they do not exist.
Most of the exposure estimates were based on intended uses and
knowledge about products that contain the materials of interest. Some of
the occupational- and environmental-exposure estimates were based on
production volumes, environmental fate, and disposal data, but few data
of these types were available.
Other kinds of information that were rarely encountered concerned
production trends, production processes, and percentage of total
production allocated to each intended end use. More information of this
type would have contributed greatly to estimates of exposure. Again, it
was assumed that much of this information exists, but access to it was
limited or restricted. The committee found little or no epidemiologic
information or information on environmental fate (e.g., biodegradation
and bioaccumulation) for most compounds in the sample.
The data base was limited by the paucity of information on toxic
effects in humans. Because observational studies on humans are expensive
and involve special difficulties, they cannot be undertaken routinely.
Even if extensive resources became available, it would be impossible to
acquire conclusive data on many possible outcomes under all different
conditions of exposure. Epidemiologic studies involve factors that are
different from animal toxicologic protocols. Investigators must know not
only the chemical and physical properties of substances and the
quantitative and qualitative toxicity data from studies in animals, but
also the extent of human exposure, its intensity, and other qualities of
exposure that are needed to conduct an adequate study. It is necessary
to define pathologic end points or effects, define a control population,
conduct followup or retrospective studies, ensure that there is a
suitable exposed population with enough exposed subjects to provide
reasonable statistical power, and develop mechanisms to minimize or
quantify sources of confounding or bias. Sometimes, epidemiologic
studies in different settings have each contributed information to
increase the credibility of a cause-and-effect relationship, but are
flawed because exposure to the substances under study and exposure to
some other possible cause of the same end point have occurred
simultaneously. For these reasons, the committee did not assess the
adequacy of most observational studies of humans, but it did consider
information from case reports citing adverse effects in humans and, for
some classes of substances, data from human sensitivity tests and
available epidemiologic information.
Extensive data might exist on human exposure to some substances
(e.g., drugs) with intended uses limited to a few exposures in a
lifetime. Limited toxicity testing might be adequate for such end uses,
but insufficient for developing safety standards to protect industrial
workers producing or using the materials or medical personnel who might
handle them frequently in the course of their professional activities.
Very few data on potential occupational or environmental exposure were
accessible.
For a substance to be reviewed, it had to be well defined, readily
characterizable, and identifiable. Thus, some large classes of
substances (such as plant products, minerals or ores, and unidentified
mixtures) were excluded from consideration in the subsample. They were,
however, included in the sample.
Three deficiencies of the TSCA Inventory as a source of chemicals in
commerce were most apparent during sample selection and evaluation:
· Poorly defined chemical mixtures in the sample were not
sufficiently uniform in composition or could not be sufficiently
characterized to determine the extent of toxicity testing performed, much
less its adequacy.
· Some substances, according to manufacturers, were no longer in
production in 1977 and therefore were no longer "chemicals in commerce."
· Many companies listed in the Inventory as manufacturers of
chemicals in 1977 claimed never to have made those chemicals, although in
some cases they had made related substances.
Therefore, there was little information on many chemicals in commerce,
and it was often impossible to determine whether a substance had minimal
toxicity information.
Substances are often selected for toxicity tests because there is a
particular interest in them (e.g., because some toxic effect has been
observed). Thus, selection of substances that have already had some
testing must not be considered a random sampling of all substances in the
select universe.
INTERPRETATION OF DATA ON TESTING QUALITY
Although the presence of toxicity information on each of the 100
substances in the subsample and the reviews of that information may have
some intrinsic value, the reviews were not intended to provide specific
information concerning the need for additional testing of these specific
compounds. They served only as a basis for inference about the select
universe and the seven categories of substances in it. The substances in
the select universe are themselves a nonrandom sample of all substances
in the entire universe of chemicals known or used at a specific time
(mid-1981). Because of the manner in which toxicity testing can be
conducted and has been in the past, some groups of substances were not
included as specific classes among the substances in the select
universe. Examples are some natural products of largely undefined nature
or chemical structure, some mixtures of chemicals, and some industrially
used chemicals of variable composition.
Furthermore, the review reported herein was not undertaken as an
"audit" of the adequacy of past regulatory policies or procedures, and
the data developed in this study are not likely to be immediately useful
for regulatory purposes. For example, some regulatory decisions might
have been based on information developed from long-standing exposure data
not available to the committee during its evaluation of the toxicity data
base. It would be inappropriate to judge past decisions against current
standards of toxicity evaluation. The committee also recognizes that
regulatory standards must be set in accordance with law and federal
agency policies. These standards are based on more than toxicity data,
and regulations, once set, cannot be lightly or easily changed on the
basis of a modest increase in information about effects. In some cases,
regulatory decisions are based on proprietary information that is
available only to the concerned agencies.
CHARACTERIZING THE SAMPLE AND OPTIONS FOR DRAWING INFERENCES
TO THE SELECT UNIVERSE
In the second year of operation, the Committee on Toxicity Data
Elements worked with the Committee on Sampling Strategies to identify
encoded data in the dossiers that were critical to analysis of the sample
and to extrapolation of the analysis to the select universe. Some of the
basic critical data in the dossiers described the toxicity tests
conducted and their quality. Descriptions of this nature--combined with
information on intended use, physical and chemical properties, and
potential exposures--were tabulated by the two committees in the third
year of the study. The tables serve as springboards to more detailed
analyses of the categories in the sample. Algorithms for these analyses
were developed by the Committee on Sampling Strategies.
The select universe was sampled in two phases. First, independent
systematic random samples were drawn from each of the seven categories.
The components of each sample were arranged in random order and then
examined one by one for the existence of minimal toxicity information (as
defined for each of the five intended-use classes, described earlier)
until a specified number of substances with at least minimal toxicity
information were identified. The substances were not reviewed further if
the literature search uncovered less than the prescribed minimal toxicity
information or if the substances were so ill-defined as to preclude
evaluation. The latter group included substances in the select universe
whose names referred to sets of substances (e.g., "alkyl derivatives of
dimethylbenzylammonium chloride") with possibly different toxic
properties, on which it would be impossible to characterize the quality
of toxicity information. The substances set aside for this reason
remained in the sample and are an important part of the data base for
evaluating the extent to which substances meet the minimal-toxicity-data
criteria defined by the Committee on Toxicity Data Elements.
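
The second-phase screening described above can be sketched in Python. This is only an illustration under stated assumptions, not the committee's actual procedure: the function name `screen_until_quota`, the two predicate arguments, and the seed are hypothetical, and the first-phase systematic sampling is not modeled.

```python
import random


def screen_until_quota(candidates, has_minimal_info, is_well_defined, quota, seed=0):
    """Examine sampled substances in random order until `quota` of them pass.

    Returns (selected, set_aside). Substances that are set aside stay part
    of the sample; they are simply not reviewed further.
    """
    rng = random.Random(seed)      # fixed seed keeps the sketch reproducible
    order = list(candidates)
    rng.shuffle(order)             # arrange the sample in random order

    selected, set_aside = [], []
    for substance in order:
        if len(selected) >= quota:
            break                  # quota met: stop examining
        if is_well_defined(substance) and has_minimal_info(substance):
            selected.append(substance)
        else:
            # ill-defined, or less than the prescribed minimal information
            set_aside.append(substance)
    return selected, set_aside
```

Substances that were never examined because the quota was already met appear in neither list.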
CONSTRUCTION OF TABLES FOR ANALYSIS
By examining data in both the sample and the subsample, the Committee
on Sampling Strategies could obtain an estimate of the amount and quality
of information on toxicity testing of substances in the select universe.
The process of examining these substances generated other information of
interest in the examination of toxicity testing. Some of these data,
such as the frequency with which a given toxicity-test type could be
found for each of the 100 substances in the subsample, were available
from machine-readable files. Other information, such as the quality of
reporting of toxicity tests or the ways in which the Committee on
Toxicity Data Elements determined the adequacy of a given toxicity test,
was not suitable for statistical analysis, but is presented in a
qualitative form in the conclusions and recommendations.
STATISTICAL ANALYSIS OF DATA
The dossiers on substances in the subsample provided sets of
tabulations, measures, and various descriptive items of information that
the Committee on Sampling Strategies used to estimate the testing
adequacy for substances in each of the seven intended-use categories of
the select universe or in other well-defined sets of substances in that
universe.
ESTIMATES BASED ON THE SAMPLE ALONE
To estimate the properties of substances in a well-defined category
of the select universe and the variances of those estimates for the
sample, the Committee on Sampling Strategies adopted procedures that
allowed for the use of all available information on substances in any
well-defined subset of the select universe. All substances in the select
universe could be placed in different combinations of the original seven
categories. For any substance, j, it is straightforward to determine the
probability, pj, of selection into the sample. Because samples were
drawn independently from each of the seven categories, i, the probability
1 - pj, of not being selected is precisely the product of the
probabilities of not being selected from any category, so that if Pij
is the probability that substance j is selected from category i, then

$$p_j = 1 - \prod_{i=1}^{7} (1 - P_{ij}). \qquad (1)$$

Note that, if substance j is not a member of category i, then Pij = 0,
so the value of pj is unaffected by that category.
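
As a small numerical illustration of Equation 1, the following Python sketch computes the overall selection probability from per-category probabilities. The function name and the example probabilities are hypothetical and are not taken from the report.

```python
from math import prod


def selection_probability(per_category_probs):
    """Return p_j = 1 - prod_i (1 - P_ij) for a single substance j.

    `per_category_probs` holds P_ij for each of the seven categories;
    an entry of 0 means the substance is not a member of that category.
    """
    return 1.0 - prod(1.0 - p for p in per_category_probs)


# Hypothetical substance appearing in two of the seven categories.
P_ij = [0.10, 0.0, 0.0, 0.05, 0.0, 0.0, 0.0]
print(round(selection_probability(P_ij), 4))  # 0.145
```

Categories with Pij = 0 contribute a factor of 1 to the product, which is why they leave pj unchanged.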
The subcategory to which any substance belongs is defined by the set
of categories of which it is a member. There are 64 possible
combinations of categories (2 x 2 x 2 x 2 x 4), i.e., 63 possible
subcategories with the exclusion of the one classification of chemicals
that are not in the select universe because they do not fall into any of
the categories. Some of the 63 possible subcategories may, of course,
contain no substances from the select universe, and other subcategories
may include substances from the select universe but none from the sample.
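
The count of 64 combinations can be checked by enumeration. The sketch below assumes one reading of the factorization 2 x 2 x 2 x 2 x 4: four lists on which a substance is either present or absent, plus a chemicals-in-commerce factor with four states (absent, or in exactly one of three commerce categories). The list names and state labels are placeholders chosen for illustration.

```python
from itertools import product

# Placeholder labels; each of these four lists is a yes/no membership.
FOUR_LISTS = ("pesticides", "cosmetics", "drugs", "food_additives")
# Assumed: absent from chemicals in commerce, or in exactly one of three categories.
COMMERCE_STATES = (None, "commerce_1", "commerce_2", "commerce_3")


def enumerate_combinations():
    """List every membership pattern across the seven categories."""
    combos = []
    for flags in product((False, True), repeat=len(FOUR_LISTS)):
        for commerce in COMMERCE_STATES:
            membership = dict(zip(FOUR_LISTS, flags))
            membership["commerce"] = commerce
            combos.append(membership)
    return combos


def in_select_universe(membership):
    """A pattern belongs to the select universe if it is in at least one category."""
    on_a_list = any(membership[name] for name in FOUR_LISTS)
    return on_a_list or membership["commerce"] is not None


all_combos = enumerate_combinations()
subcategories = [m for m in all_combos if in_select_universe(m)]
drug_subcategories = [m for m in subcategories if m["drugs"]]
print(len(all_combos), len(subcategories), len(drug_subcategories))  # 64 63 32
```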
For any specific analysis of the sample, the subcategories of
interest are first determined. For example, an analysis of all
substances on the list of drugs and excipients in drug formulations
could include as many as 32 subcategories defined by being on that list
but on or off the lists of pesticides and inert ingredients of pesticide
formulations, food additives, cosmetic ingredients, or chemicals in
commerce. Similarly, an analysis of the entire select universe from
which the sample was drawn would include up to 63 subcategories.
In this discussion,
h = subcategory (h = 1,2, . . ., 63)
and Nh = number of substances in subcategory h.
Let H be the collection of subcategory members, h, that are of interest
for a particular analysis. Let $\bar{x}_h$ be the mean (here, the proportion) of
some property of the sample substances in subcategory h. Then an unbiased
estimate of that proportion over the whole set of subcategories, H, is

$$\bar{x} = \sum_{h \in H} \frac{N_h}{N}\, \bar{x}_h, \qquad (2)$$

where $N = \sum_{h \in H} N_h$, and the variance of this estimate is

$$\sigma^2_{\bar{x}} = \sum_{h \in H} \left(\frac{N_h}{N}\right)^2 \sigma^2_{\bar{x}_h}. \qquad (3)$$

Unfortunately, this ideal formula is not usable in practice, because
limited resources precluded exhaustive searches for duplication of substances
among categories, so the values for Nh are not precisely known. Information
on category-to-category duplication is, however, available for all items in
the sample and, for all compounds, j, that were actually found in subcategory
h, can be used to estimate Nh as

$$\hat{N}_h = \sum_{j \in h} \frac{1}{p_j}. \qquad (4)$$

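
The estimate in Equation 4 sums the reciprocals of the selection probabilities of the sampled substances found in each subcategory. A minimal Python sketch follows; the function name, the subcategory labels, and the probabilities are hypothetical.

```python
from collections import defaultdict


def estimate_subcategory_sizes(sampled):
    """Estimate N_h as the sum of 1/p_j over sampled substances j in subcategory h.

    `sampled` is an iterable of (subcategory_label, p_j) pairs, one per
    substance actually drawn into the sample.
    """
    sizes = defaultdict(float)
    for subcategory, p_j in sampled:
        if not 0.0 < p_j <= 1.0:
            raise ValueError("selection probability must lie in (0, 1]")
        sizes[subcategory] += 1.0 / p_j
    return dict(sizes)


# Hypothetical sample: two substances in subcategory "A", one in "B".
print(estimate_subcategory_sizes([("A", 0.1), ("A", 0.2), ("B", 0.5)]))
```

Each sampled substance "stands for" 1/pj substances in the select universe, so substances with a low chance of selection carry more weight.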
Replacement of Nh/N by the estimate $\hat{N}_h / \sum_H \hat{N}_h$ introduces an additional
source of variation in $\bar{x}$, in that, where E is the mathematical expectation
value,

$$\mathrm{Var}(\bar{x}) = E\,\mathrm{Var}(\bar{x} \mid \{\hat{N}_h\}) + \mathrm{Var}\,E(\bar{x} \mid \{\hat{N}_h\}). \qquad (5)$$

However, the committee believes that the second term in Equation 5 is likely
to be small. Because, in all cases presented in the tables of this report,
$\bar{x}_h$ (and hence $\bar{x}$) is the estimated proportion of substances with a specific
property, when nh is the sample size in subcategory h and $\hat{p}_h = \bar{x}_h$ is the
proportion observed in subcategory h,

$$\sigma^2_{\bar{x}_h} = \hat{p}_h (1 - \hat{p}_h) / n_h. \qquad (6)$$

Equation 6 assumes that the selection was essentially equivalent to a simple
random sample, as discussed previously. Thus, the computing formulas that the
Committee on Sampling Strategies used for estimates based on the sample are
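
$$\hat{p} = \frac{\sum_{h \in H} \hat{N}_h\, \hat{p}_h}{\sum_{h \in H} \hat{N}_h} \qquad (7)$$

and

$$\hat{\sigma}^2_{\hat{p}} = \sum_{h \in H} \left(\frac{\hat{N}_h}{\sum_{h \in H} \hat{N}_h}\right)^2 \frac{\hat{p}_h (1 - \hat{p}_h)}{n_h}. \qquad (8)$$
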
The sample sizes for any category are sufficiently large, with the
estimated proportions not too near 0 or 1, that the sample distribution for a
category is approximately normal. Therefore, 90% confidence intervals are
presented, that is, intervals with at least 90% probability of including the
true value of the proportion, estimated as $\hat{p} - 1.65\,\hat{\sigma}_{\hat{p}}$ and $\hat{p} + 1.65\,\hat{\sigma}_{\hat{p}}$.
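
The weighted estimate and its normal-approximation interval can be sketched as follows. The sketch assumes the computing formulas are the weighted mean and weighted variance described above, with the subcategory sizes replaced by their estimates and each within-subcategory variance taken as p(1 - p)/n. The function name and the example figures are hypothetical.

```python
from math import sqrt


def pooled_proportion(strata, z=1.65):
    """Combine subcategory proportions into one estimate with a 90% interval.

    `strata` is a list of (N_hat_h, n_h, p_hat_h) tuples:
      N_hat_h  estimated number of substances in subcategory h
      n_h      sample size in subcategory h
      p_hat_h  proportion observed in subcategory h
    Returns (p_hat, variance, (lower, upper)).
    """
    total = sum(N for N, _, _ in strata)
    p_hat = sum(N * p for N, _, p in strata) / total
    variance = sum((N / total) ** 2 * p * (1.0 - p) / n for N, n, p in strata)
    half_width = z * sqrt(variance)
    return p_hat, variance, (p_hat - half_width, p_hat + half_width)


# Hypothetical subcategories: sizes 300 and 100, samples of 30 and 20.
estimate, variance, interval = pooled_proportion([(300, 30, 0.5), (100, 20, 0.2)])
print(estimate, variance, interval)
```

The multiplier 1.65 is the one quoted in the text for a two-sided 90% interval under the normal approximation.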
ESTIMATES BASED ON THE SUBSAMPLE ALONE
Most of the tables presented in this report give estimates of the
proportions of substances in a category that have specified characteristics.
Such estimates are based on the sample of substances selected from the
corresponding segment of the select universe. Some substances belonging to a
category do occur in the subsample selected from other lists and satisfy the
minimal-toxicity-information screening criterion for the specified category.
It would have been possible to use these additional sample substances in
making the estimates; that would probably have resulted in somewhat smaller
sampling errors. However, because the screening procedure was not identical
for each intended-use category, it would have introduced biases into the
results. The Committee on Sampling Strategies therefore decided to base
estimates for each category solely on the sample selected for that category.
This limitation of analysis further implies that it is not valid to use the
results given here to derive estimates across categories in the final sample.
Such combined results can be calculated, but would require the preparation of
special tabulations.
The probability of selection within a category is constant, so an unbiased
estimate of the proportion of screened substances in the ith category that
could be evaluated and that have a given characteristic is $\hat{p}_i = x_i/n_i$,
where ni is the sample size and xi denotes the number in the sample that
have the characteristic. If Pi denotes the true proportion, the
statistic xi has the Bernoulli distribution with parameters Pi, ni. It is
then possible to calculate two numbers, Li and Ui, as functions of xi and
ni, so that the probability that Li ≤ Pi < Ui is at least 90%. These
are the confidence limits shown in the tables.
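
The text does not say how Li and Ui were computed. One common way to obtain limits with at least the stated coverage is the Clopper-Pearson construction, which treats the count xi as binomial with parameters ni and Pi. The Python sketch below assumes that method purely for illustration; the function names are hypothetical, and the limits are found by bisection using only the standard library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X binomial with n trials and success probability p."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _bisect_increasing(func, target, iterations=60):
    """Solve func(p) = target on [0, 1] for a function that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_limits(x, n, confidence=0.90):
    """Clopper-Pearson limits (assumed method) for x successes in n trials."""
    alpha = 1.0 - confidence
    # Lower limit: P(X >= x | p = L) = alpha / 2, or 0 when x = 0.
    lower = 0.0 if x == 0 else _bisect_increasing(
        lambda p: 1.0 - binom_cdf(x - 1, n, p), alpha / 2.0)
    # Upper limit: P(X <= x | p = U) = alpha / 2, or 1 when x = n.
    upper = 1.0 if x == n else _bisect_increasing(
        lambda p: 1.0 - binom_cdf(x, n, p), 1.0 - alpha / 2.0)
    return lower, upper


print(exact_limits(5, 20))
```

Limits built this way are generally asymmetric about xi/ni, which matches the later remark that the distribution of the second factor is asymmetric.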
ESTIMATES BASED ON BOTH THE SAMPLE AND THE SUBSAMPLE
Some estimates in this report make use of information provided by both the
sample and the subsample. Such estimates are for the proportions of some
category in the select universe that both satisfied the screening criteria and
have one or more additional specified characteristics. The point estimate of
such a proportion is the product of two factors that are statistically
independent as a consequence of the sampling procedure. The first factor is
the estimated proportion that satisfies the screening criteria for the
category and is based on the sample alone. As stated above, its distribution
is approximately normal, and its variance has been estimated from the sample
results. The second factor is the estimated proportion, among substances that
satisfy the screen, that also have the specified characteristic; it has a
Bernoulli distribution, and lower and upper confidence limits have been
calculated.
Confidence limits for the product of the two factors were approximated as
follows. Let x denote the first of the two factors referred to in the
preceding paragraph, and y the second. Where X and Y denote the mathematical
expectations of x and y, respectively, note that the variance of the product
of independent variates x and y is given by

$$\sigma^2_{xy} = \sigma^2_x \sigma^2_y + X^2 \sigma^2_y + Y^2 \sigma^2_x. \qquad (9)$$

Because the distribution of y is asymmetric in general, estimates of the lower
and upper confidence limits are computed separately. To calculate the lower
confidence limit, $\sigma_{xy}$ in Equation 9 is calculated by replacing $\sigma_y$ with
$(y - L_y)/1.65$, where $L_y$ denotes the lower limit for Y, and X and Y in
Equation 9 are replaced with their estimates x and y. The lower limit for the
product is then calculated as $xy - 1.65\,\sigma_{xy}$. For the upper limit, $\sigma_y$ in
Equation 9 is replaced with $(U_y - y)/1.65$, where $U_y$ denotes the upper
limit for Y. The upper limit for the product is then calculated with this new
estimate of $\sigma_{xy}$ as $xy + 1.65\,\sigma_{xy}$.
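
The approximation for the product limits can be sketched in Python. It follows the steps described above: the standard deviation of y is replaced by a one-sided value derived from the lower or upper limit, and the variance of the product of independent variates is then evaluated at the estimates x and y. Function and argument names are hypothetical, as are the example values.

```python
from math import sqrt


def product_variance(x, y, var_x, var_y):
    """Variance of the product of independent variates, evaluated at x and y."""
    return var_x * var_y + x**2 * var_y + y**2 * var_x


def product_limits(x, sd_x, y, lower_y, upper_y, z=1.65):
    """Approximate 90% limits for x*y when y has asymmetric limits.

    x, sd_x           first factor and its estimated standard deviation
    y                 second factor (a proportion)
    lower_y, upper_y  confidence limits already computed for Y
    """
    sd_y_low = (y - lower_y) / z   # one-sided stand-in used for the lower limit
    sd_y_up = (upper_y - y) / z    # one-sided stand-in used for the upper limit
    sd_low = sqrt(product_variance(x, y, sd_x**2, sd_y_low**2))
    sd_up = sqrt(product_variance(x, y, sd_x**2, sd_y_up**2))
    return x * y - z * sd_low, x * y + z * sd_up


# Hypothetical inputs: 50% pass the screen (sd 0.04); 40% of those have the
# characteristic, with limits 0.30 and 0.55.
print(product_limits(0.5, 0.04, 0.4, 0.30, 0.55))
```

When the first factor is known exactly (sd_x = 0), the limits reduce to x times the limits for Y, which is a convenient sanity check.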
The very substantial costs and demands required to amass and analyze data
on 100 substances in a short time permitted a subsample of only 100 substances
for the seven categories of the select universe. As a result, confidence
intervals are large. Any analysis of the data should be based on an awareness
of the limited statistical precision of results from the subsample.
MACHINE-READABLE FILES
To facilitate data analysis, information on the substances in the sample
and subsample was assembled in machine-readable files. The presence or
absence of the five types of toxicity tests in the minimal-toxicity-
information screen (acute, subchronic, chronic, reproductive/developmental,
and mutagenicity) was tallied. In addition, the entire list of substances in
the sample was scanned to determine which of the seven intended-use categories
contained each substance. These 12 items and a numeric identifier for each
item in the sample were entered into the computer for analysis.
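
A record of this kind, with five test-presence flags, seven category flags, and a numeric identifier, could be laid out as in the Python sketch below. The report does not describe the actual file layout, so the class name, field names, and row encoding are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# The five test types in the minimal-toxicity-information screen.
TEST_TYPES = ("acute", "subchronic", "chronic",
              "reproductive_developmental", "mutagenicity")
N_CATEGORIES = 7  # the seven intended-use categories


@dataclass(frozen=True)
class SampleRecord:
    identifier: int                  # numeric identifier for the substance
    tests_present: Tuple[bool, ...]  # one flag per entry in TEST_TYPES
    in_category: Tuple[bool, ...]    # one flag per intended-use category

    def __post_init__(self):
        if len(self.tests_present) != len(TEST_TYPES):
            raise ValueError("expected one flag per test type")
        if len(self.in_category) != N_CATEGORIES:
            raise ValueError("expected one flag per category")

    def as_row(self) -> List[int]:
        """Flatten to the identifier followed by the 12 items as 0/1 values."""
        return [self.identifier,
                *map(int, self.tests_present),
                *map(int, self.in_category)]


record = SampleRecord(17, (True, False, False, False, True),
                      (False, False, True, False, False, False, False))
print(record.as_row())
```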
Dossiers compiled on substances in the subsample contained substantially
more items of information than were available on the sample. These items were
tallied to provide measures of the amount and adequacy of data available on
substances in the subsample.
The seven intended-use categories were expanded into partially overlapping
subcategories, as listed in Appendixes B through G. A more complete roster of
test types was available, and the test protocols for each test type deemed
necessary for a substance's designated subcategories of intended use were
evaluated for adequacy. Chemical and physical properties of each substance
were sought, as well as its manufacturing process or processes, production
volume, potential for exposure, and environmental fate. Overall judgments,
such as the ability to assess the potential hazard to human health, were
determined. Although much of this information was already in tabular form,
some items in the dossier were descriptive. This type of information,
primarily nontoxicologic, was intended to assist in assessment of potential
for hazard. Although the presence or absence of this information was recorded
for later numeric analysis, no judgment of the quality of the nontoxicologic
data was made.