5
Evaluation of Statistical Data Analysis
MODE OF PRESENTATION OF RESULTS
The basic objective of the statistical analysis was to
determine whether thyroid disease has increased among persons
exposed to 131I released from Hanford in the period under
consideration. That objective was appropriately addressed by
modeling the relationship between dose and the probability of
occurrence of a thyroid disease. In particular, the relationship was
modeled as a linear function (that is, a regression equation) by
using the median of the 100 dose estimates for each person as his
or her assumed dose. The HTDS investigators used these linear
models in dose to model probabilities of disease and the numeric
values of blood concentrations of various biomarkers. The use of
linear models (or linear-quadratic models) for probabilities is
common in radiation epidemiology and is generally based on
biologic or radiation-protection considerations. Such models,
however, can present difficulties in estimation, in that negative
probabilities are not allowed to occur but can appear during the
iterative procedure used to produce maximum likelihood estimates,
especially if a generally negative dose-response relationship is
evident in the data. In such cases, the models are said to have had
problems in "converging". The HTDS investigators also used
another approach: logistic regression. Logistic regression is a
nonlinear model in which probabilities are never allowed to reach
zero, so convergence problems are less common during maximum
likelihood fitting. However, the parameter estimates from logistic
regression are arguably less easily interpretable in the radiation-
biology or radiation-epidemiology setting because log odds ratios,
not disease probabilities, are being modeled as linear in dose. The
HTDS investigators used the linear models as their primary
method of analysis but also gave the logistic-regression results,
especially when the linear models failed to converge. In their
analyses, convergence apparently was always achieved with the
logistic models, but many of the linear models had convergence
problems. The HTDS investigators clearly considered the linear
models as the primary method of analysis; for example, power
calculations were given only for linear models (section V). We
focus most of our comments on their use of the linear models, but
much of our critique is also applicable to logistic regression.
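The contrast between the two model forms can be illustrated with a minimal sketch. The coefficients and doses below are hypothetical, chosen only to show that a linear model with a negative slope can return a negative "probability" while a logistic model cannot; they are not HTDS estimates.

```python
import math

def linear_probability(baseline, slope, dose):
    """Linear model: P = baseline + slope * dose (not bounded to [0, 1])."""
    return baseline + slope * dose

def logistic_probability(intercept, log_odds_slope, dose):
    """Logistic model: the log odds are linear in dose, so 0 < P < 1 always."""
    return 1.0 / (1.0 + math.exp(-(intercept + log_odds_slope * dose)))

# Hypothetical values for illustration only (dose in mGy).
doses = [0.0, 100.0, 200.0, 400.0]
linear_fit = [linear_probability(0.05, -0.0002, d) for d in doses]
logistic_fit = [
    logistic_probability(math.log(0.05 / 0.95), -0.004, d) for d in doses
]

for d, p_lin, p_log in zip(doses, linear_fit, logistic_fit):
    print(f"dose={d:6.1f}  linear={p_lin:+.4f}  logistic={p_log:.4f}")
```

Both models start from the same 5% baseline at zero dose. At the highest dose the linear prediction falls below zero, which is the situation that can halt an iterative maximum likelihood fit.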
A zero slope in the regression equation indicates no
association between dose and the probability of occurrence of a
thyroid disease. Standard statistical tests were used to determine
whether the slope was significantly positive. Because the
investigators assumed that the association, if any, would be in a
positive direction, they appropriately used a one-sided statistical
test. For most of the thyroid diseases considered in the HTDS, the
conclusion was that the null hypothesis of zero slope could not be
rejected, that is, there was no clear evidence of a thyroid-disease
effect due to exposure.
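As a rough illustration of the one-sided test, the sketch below converts a slope estimate and its standard error into one-sided and two-sided p values. It assumes a normal (Wald-type) approximation, which is only one of the standard tests the investigators might have used, and the numeric inputs are hypothetical.

```python
from statistics import NormalDist

def slope_p_values(slope, standard_error):
    """Return (z, one-sided p for slope > 0, two-sided p), normal approximation."""
    z = slope / standard_error
    upper_tail = 1.0 - NormalDist().cdf(z)
    one_sided = upper_tail
    two_sided = 2.0 * min(upper_tail, 1.0 - upper_tail)
    return z, one_sided, two_sided

# Hypothetical slope estimate and standard error (risk per unit dose).
z, p_one, p_two = slope_p_values(0.0010, 0.0008)
print(f"z={z:.2f}  one-sided p={p_one:.3f}  two-sided p={p_two:.3f}")
```

For a positive estimate the one-sided p value is half the two-sided value, which is why the choice of a one-sided test matters when the evidence is borderline.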
In the HTDS Draft Final Report, there was an
overreliance on the maximum likelihood fitting of the linear dose-
response model. For several of the important outcome variables,
such as thyroid carcinoma, the model calculations failed to
converge. A better organization of the results would have been
achieved by expanding the tables of high- versus low-dose results
(section VIII, pages 107-124) into quartiles or quintiles, so that
disease and abnormality rates would be given for four or five
categories of dose, and incorporating them into the presentation of
the earlier results. In cases where the results of the linear model
calculations as described failed to converge, it would then be
Review of the HTDS Draft Final Report
possible to rerun the analysis, using the average value of dose in
each category as the predictor variable. That would probably have
resulted in successful convergence and would retain reasonable
power to detect an effect. Another approach would be to replace
the maximum likelihood fitting with an ordinary least-squares
analysis when the model failed to converge with maximum
likelihood. Unless a large proportion of the estimated probabilities
lie outside the unit interval (0,1), the slope test statistic is
reasonable for testing for the presence of a dose-response
relationship. For models in which convergence was still not
obtained, it would be reasonable to report the value of the slope
parameter at the point where the constraint that all outcome
probabilities be positive was first violated. A confidence limit
based on the profile likelihood for the slope parameter could have
been calculated and would have been helpful, especially for
comparison with other studies.
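The suggested fallback of grouping subjects into dose categories and regressing on the category mean dose can be sketched as follows. The data are invented for illustration, and the size-weighted least-squares slope is an assumption about how such a grouped analysis might be run, not the method used in the HTDS.

```python
def dose_categories(doses, outcomes, n_groups=4):
    """Sort subjects by dose, split into n_groups nearly equal groups, and
    return (mean dose, disease rate, group size) for each group.
    Assumes at least n_groups subjects so that no group is empty."""
    pairs = sorted(zip(doses, outcomes))
    n = len(pairs)
    summary = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        mean_dose = sum(d for d, _ in chunk) / len(chunk)
        rate = sum(y for _, y in chunk) / len(chunk)
        summary.append((mean_dose, rate, len(chunk)))
    return summary

def weighted_slope(summary):
    """Least-squares slope of group rate on group mean dose, weighted by size."""
    total = sum(w for _, _, w in summary)
    x_bar = sum(x * w for x, _, w in summary) / total
    y_bar = sum(y * w for _, y, w in summary) / total
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in summary)
    sxx = sum(w * (x - x_bar) ** 2 for x, _, w in summary)
    return sxy / sxx

# Invented data: 20 subjects, dose in arbitrary units, outcome 1 = diseased.
doses = list(range(1, 21))
outcomes = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0]
groups = dose_categories(doses, outcomes)
slope = weighted_slope(groups)
for mean_dose, rate, size in groups:
    print(f"mean dose={mean_dose:5.1f}  rate={rate:.2f}  n={size}")
print(f"slope of rate on mean dose = {slope:.4f}")
```

The per-category table is itself the kind of descriptive display the paragraph asks for. The slope is a secondary summary that remains computable when the individual-level fit does not converge.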
The report should not rely exclusively on analyses that used
the putative individual doses; an additional set of confirmatory
analyses would be valuable. The basic parameters that define a
person's dose are geographic location, source of milk (backyard
cow, commercial milk, and so on), and amount of milk consumed.
Analyses of thyroid-disease rates according to those basic dose-
related variables would provide assurance that doses were not
seriously misestimated and could further confirm (or contradict)
the principal negative results.
Some results were presented in an abstract, rather
uninformative manner. For example, there was a scatter plot of
individual thyroid doses on a logarithmic scale but no table of
frequency distribution of doses, which would have been much
more useful. Similarly, one expects radiation-epidemiology reports
to include tables that show observed and expected numbers of
disease outcomes according to dose groups; these key tables were
absent from the Draft Final Report.
Too little descriptive material supporting the results of
the analysis was presented. A description of the estimated dose
distribution (distribution of median doses for people in the study)
and disease frequencies and prevalence according to such
important categories as sex, geostratum, year of birth, and amount
of milk consumption in childhood would be helpful, especially in
interpreting the finding that people in the least exposed geostratum
appeared to have the highest rates of many of the thyroid diseases
or abnormalities.
TYPES OF ANALYSES AND RELATED DOSIMETRY-ERROR ISSUES
The model of the dose-response relationship was given
in the HTDS Final Draft Report (section VII, page 10) as
$P_j(d) = A_j + B d,$
where
$j$ = sex,
$d$ = cumulative dose to thyroid,
$P_j(d)$ = probability that person of sex $j$ and dose $d$ has disease or
condition in question,
$A_j$ = baseline risk (risk without radiation, which can depend on
sex), and
$B$ = regression coefficient on dose (the slope of dose-response
regression line).
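A minimal sketch of the Bernoulli log-likelihood for this linear model shows where the convergence problems come from. The function name, record layout, and numeric values are hypothetical. The point is only that the likelihood is undefined as soon as any fitted probability leaves the interval (0, 1).

```python
import math

def log_likelihood(baseline, slope, records):
    """Log-likelihood of P_j(d) = A_j + B * d.

    baseline: dict mapping sex j to A_j; slope: B;
    records: iterable of (sex, dose, diseased) with diseased True or False.
    Returns -inf when any fitted probability falls outside (0, 1)."""
    total = 0.0
    for sex, dose, diseased in records:
        p = baseline[sex] + slope * dose
        if not 0.0 < p < 1.0:
            return float("-inf")
        total += math.log(p) if diseased else math.log(1.0 - p)
    return total

# Hypothetical records and parameter values.
records = [("F", 0.0, True), ("M", 100.0, False)]
baseline = {"F": 0.10, "M": 0.05}
print(log_likelihood(baseline, 0.001, records))    # finite
print(log_likelihood(baseline, -0.001, records))   # -inf: P for the male is -0.05
```

An optimizer that steps to a negative slope large enough to push one high-dose subject below zero has no finite likelihood to work with. That is one way the "failure to converge" described in this chapter can arise.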
There is sizable uncertainty in the doses reconstructed
for individuals based on residential and especially dietary histories,
and variations related to source term, meteorologic uncertainties,
pasture deposition, milk concentrations of 131I, source of milk, and
iodine metabolism need to be taken into account. It seems clear
that analyses need to address those uncertainties explicitly and that
the confidence intervals and the strength of the conclusions have to
reflect them. That implies assumptions pertaining to the
distributions of two error terms, $E_1$ and $E_2$:
$P_j(d) = A_j + B (d + E_1) + E_2,$
where
$E_1$ = error in estimating doses, and
$E_2$ = error in response to given dose.
The statistical-methods section (section VII) of the
Draft Final Report described a model incorporating uncertainties in
dosimetry, but the analyses (section VIII) used a simpler model
that did not include dose uncertainties. Furthermore, no
assumptions were stated for E2, and little attention was given to the
uncertainty represented by it.
As described in chapter 6, below, it appears that
dosimetry-error issues were not fully treated in the analysis of the
power of the HTDS. Ignoring dosimetry errors can lead to
unrealistically narrow estimates of the confidence limits that are
applied to the estimated parameter values. The statistical-methods
section does describe a method for computing likelihood-ratio
(LR) statistics when (as is true for the HEDR doses) errors in the
dose estimates for each individual are correlated (section VII.C.3);
however, this method is not used in the results section. The
suitability of the LR method that the investigators presented to
account for dosimetry errors depends on the validity of the
Berkson model for errors (see chapter 6) and on accepting that the
correlations between dose estimates are fully specified by the
HEDR simulations. Despite those cautions, results from the LR
approach could be useful. Although it is very unlikely that the
estimated dose-response relationships would change in an
important way, confidence intervals that take dosimetry errors into
account would provide more appropriate information about the
uncertainty of estimates. It is recommended that confidence
intervals be calculated.
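Why the error model matters can be illustrated with a small simulation. It is a sketch under strong assumptions: a continuous, biomarker-like outcome, independent normal errors, and invented parameter values. It ignores the correlated errors of the HEDR doses, and the normal error term can push a simulated dose below zero. Under Berkson error the true dose scatters around the assigned dose and the least-squares slope stays close to its true value; under classical error the measured dose scatters around the true dose and the slope is attenuated.

```python
import random

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

def simulate(n=20000, true_slope=0.01, error_sd=30.0, seed=1):
    """Return (slope under Berkson error, slope under classical error)."""
    rng = random.Random(seed)
    # Berkson error: true dose = assigned dose + error; regress on assigned.
    assigned = [rng.uniform(0.0, 100.0) for _ in range(n)]
    true_b = [z + rng.gauss(0.0, error_sd) for z in assigned]
    y_b = [1.0 + true_slope * x + rng.gauss(0.0, 0.5) for x in true_b]
    # Classical error: measured dose = true dose + error; regress on measured.
    true_c = [rng.uniform(0.0, 100.0) for _ in range(n)]
    measured = [x + rng.gauss(0.0, error_sd) for x in true_c]
    y_c = [1.0 + true_slope * x + rng.gauss(0.0, 0.5) for x in true_c]
    return ols_slope(assigned, y_b), ols_slope(measured, y_c)

berkson_slope, classical_slope = simulate()
print(f"true slope 0.0100  Berkson {berkson_slope:.4f}  classical {classical_slope:.4f}")
```

Even in the Berkson case the extra scatter enlarges the residual variance, so confidence intervals that ignore dose error are narrower than they should be.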
ANALYSIS OF POTENTIAL CONFOUNDING OR EFFECT-MODIFYING
VARIABLES
It was not clear from the Draft Final Report how
confounders of dose-response relationships were treated, and
results adjusted for possible confounders were rarely given. The
HTDS investigators conducted analyses of the various thyroid-
disease end points to evaluate a number of possible risk factors for
confounding effects or effect modification (section VIII.D.20) but
presented no tables to show a summary of the results of their
analyses for any of the end points. Several such tables should be
added to the report.
ASSUMPTION OF EQUIVALENT RADIATION EFFECT FOR MALES
AND FEMALES
An important assumption in the main analyses was that
rates of thyroid disease might differ between sexes in the absence
of 131I but that the radiation effect (which was calculated as an
excess absolute risk) would be comparable for males and females.
A number of studies have found that excess absolute risks posed
by external radiation are greater for females than for males with
respect to thyroid cancer (Ron and others, 1995) or thyroid nodules
(Nagasaki and others, 1994; Ron and others, 1989; Wong and
others, 1996), so the assumption of comparability between sexes is
a key assumption to be tested. The investigators stated that they
tested it but presented no results for the reader to examine with
respect to this assumption for any of the disease outcomes. An
analysis that allows for differences between sexes in dose-response
slopes should be presented.
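One simple way to present such an analysis is to fit the slope separately for each sex and test the difference. The sketch below assumes the two estimates are independent and approximately normal, and the slopes and standard errors are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def slope_difference_test(slope_f, se_f, slope_m, se_m):
    """Compare two independent slope estimates with a normal approximation.

    Returns (difference, standard error, z, two-sided p value)."""
    diff = slope_f - slope_m
    se_diff = sqrt(se_f ** 2 + se_m ** 2)
    z = diff / se_diff
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return diff, se_diff, z, p_two_sided

# Hypothetical sex-specific excess absolute risk slopes and standard errors.
diff, se_diff, z, p = slope_difference_test(0.004, 0.0015, 0.001, 0.002)
print(f"difference={diff:.4f}  se={se_diff:.4f}  z={z:.2f}  p={p:.3f}")
```

An equivalent approach is to add a sex-by-dose interaction term to a single model. Either way, the reader needs the sex-specific slopes and their uncertainties, not only a statement that the test was done.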
ANALYSES BY AGE AT EXPOSURE
The prevalence of thyroid cancer induced by radiation
depends heavily on age at exposure, so it would have been helpful
to see a table showing dose-response analyses for those who were
younger versus older in 1945, the time of the greatest 131I
irradiation, to examine whether there were indications of a
radiation effect among those exposed at the lowest ages. In
particular, it is recommended that results be presented for those
exposed in utero and during the first 2 years of life. Likewise,
because the magnitude of thyroid doses from 131I fallout from the
NTS and from global fallout was not greatly different from the
Hanford doses in many study subjects, tables showing the results
of analyses stratified by magnitude of NTS or global fallout are
potentially important.
OUT-OF-AREA ANALYSES
The HTDS investigators took care to examine the
results for the out-of-area participants, those who proved never to
have been in the dosimetry area during the time of 131I exposure.
They performed sensitivity analyses in which the out-of-area
participants with disease were assigned either the minimal (zero)
or maximal (at the dose-assessment area boundary) likely dose and
those without disease were assigned the converse. The two
contrasting analyses test the minimal and maximal contributions,
respectively, that the out-of-area subjects could make to the dose-
response analyses. Either way, the overall results were essentially
unaffected, and this indicates that their deletion from the main
analyses did not produce a substantial bias. The effect of these
cases was small probably because only about 7% of the subjects
were out-of-area and the assigned doses for these subjects in the
sensitivity analyses were relatively small (~-5 ~ mGy).
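The bounding logic of the two sensitivity analyses can be sketched as a dose-assignment rule. The function name, scenario labels, and boundary dose are hypothetical; the sketch only encodes the rule described above, in which diseased and non-diseased out-of-area subjects receive opposite extreme doses.

```python
def bounding_doses(diseased_flags, boundary_dose, scenario):
    """Assign extreme doses to out-of-area subjects.

    scenario "max_positive": diseased get the boundary dose, others get zero,
    which pushes the dose-response slope upward as far as possible.
    scenario "max_negative": the converse assignment."""
    if scenario not in ("max_positive", "max_negative"):
        raise ValueError("unknown scenario: " + scenario)
    high_if_diseased = scenario == "max_positive"
    return [
        boundary_dose if bool(diseased) == high_if_diseased else 0.0
        for diseased in diseased_flags
    ]

# Hypothetical out-of-area subjects: True = diseased.
flags = [True, False, False, True]
print(bounding_doses(flags, 5.0, "max_positive"))
print(bounding_doses(flags, 5.0, "max_negative"))
```

Rerunning the main dose-response fit with each assignment brackets the influence these subjects could have had on the slope.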
However, the HTDS investigators made no attempt to
model the out-of-area doses for persons who were included in the
main analyses. That is, if persons were in their dose-assessment
area for only part of the time when there were 131I releases, their
doses were calculated only for the time when they were in the area.
The investigators implicitly assumed that the dose was zero for any
time when a person did not reside in the area. That assumption
might or might not have been valid for some individuals, but no
attempt was made to improve on the approach or to conduct a
sensitivity analysis to evaluate how the assumption could have
affected the results. That approach could have led to attenuated or
biased results in that it estimated the total Hanford fallout doses for
some people and only partial doses for others.
There was not even a tabulation of the fractions of the
dose-modeled persons that were partly in and partly out of the
dose-assessment area during the exposure period or, what would
have been better, what fractions of them were out-of-area during
the period of heaviest exposures (1944-1947), out-of-area only
during other exposure periods, and entirely in-area. The committee
cannot evaluate the potential for attenuation or bias by this factor
without at least some information on its frequency, and we
recommend that the issue of partial out-of-area HTDS subjects be
examined.
GEOSTRATUM VERSUS DISEASE
The HTDS investigators examined thyroid morbidity
according to geographic areas, which they called "geostrata".
Given that outcomes (disease or abnormalities) appeared to differ
by geostratum, an alternative analysis that stratified by geostratum
would be natural to consider. It would be difficult for thyroid
carcinoma (owing to the few cases detected), but many of the other
outcomes could be analyzed so that the dose-response relationships
were estimated for the individual geostrata and then combined to
yield a pooled dose-response estimate. Additional analyses are
presented that are based on excluding the Okanogan and Ferry-
Stevens geostrata; this could well have effects on the dose-
response estimates similar to those of a stratified analysis, but one
cannot be sure from the writeup. A set of analyses stratifying on
geostrata seems needed because the tabulations show that the
disease rates tended to be higher in areas with low fallout; this
means that the geostratum differences would induce a negative
association between 131I and thyroid-nodule rates. It is recognized
that it can be tricky to conduct an analysis controlling for a
variable that is correlated with dose, because one does not want to
control (remove) a large fraction of the variability in dose; but in
this case, when it appears that geostratum is a potent confounder of
the dose-response association, it seems necessary. Perhaps a
judicious collapsing of similar geostrata can minimize the potential
for "overadjustment" (Day and others, 1980) of the exposure
variable.
Faced with a similar problem in the study of Utah NTS
fallout and thyroid disease, Kerber and others (1993) conducted
their primary analysis with stratification on coarse geostrata (by
state), examining the association of thyroid neoplasms and 131I
dose within geostrata. It is recommended that the Hanford
investigators perform a similar type of analysis to examine the
possible association of thyroid nodules and other thyroid diseases
with 131I dose. This would provide assurance that a possible
confounding variable had been sufficiently evaluated, either to
ensure that a positive association was not masked by the
geostratum variations or to detect a masked association.
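A stratified analysis of this kind ends with a step that combines the within-geostratum slopes. The sketch below uses inverse-variance weighting, which is one common choice and an assumption here; the text does not say how the estimates should be pooled. The slopes and standard errors are hypothetical.

```python
from math import sqrt

def pool_slopes(estimates):
    """Inverse-variance pooled slope.

    estimates: list of (slope, standard_error), one per geostratum.
    Returns (pooled slope, pooled standard error)."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    return pooled, sqrt(1.0 / total)

# Hypothetical within-geostratum slopes and standard errors.
pooled, pooled_se = pool_slopes([(0.002, 0.001), (0.004, 0.002)])
print(f"pooled slope={pooled:.4f}  se={pooled_se:.6f}")
```

Because each slope is estimated within a geostratum, differences in baseline disease rates between geostrata cannot masquerade as a dose effect or mask one.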
GENERAL-POPULATION COMPARISON AND SCREENING ISSUES
When one takes into account the different contributions
of 131I from Hanford, NTS, and global fallout from weapons
testing, everyone was exposed, so it was not possible to identify an
unexposed control group. Concern has been expressed that a study
in which everyone is exposed is not valid, that is, that an unexposed
group is needed to assess the risk posed by Hanford 131I fallout.
However, under the weak assumption of a monotonic dose-
response relationship (that is, other things being equal, the larger
the dose the greater the thyroid-cancer risk), it is not necessary to
have an unexposed control group to estimate the risk. The slope of
the dose-response curve would provide a valid index of the risk
even without an unexposed control group, provided that there is a
sufficient range of doses and that the doses are estimated with
reasonable accuracy. Problems in trying to define and use an
unexposed control group are discussed below.
The primary analyses of the cumulative incidence of
thyroid cancer or other thyroid conditions were dose-response
analyses of the subjects in the study. These analyses are
appropriate to address the scientific questions regarding the
association between 131I and thyroid conditions, the magnitude of
risk per unit dose, and the public-health question of how much risk
was associated with 131I in the population of children who were
downwind from Hanford. Another potential way to address the
public-health question is to compare the incidence of thyroid
cancer or other thyroid conditions with the incidence in unexposed
populations. However, comparisons with an external, general
population are fraught with problems. Persons living in various
geographic areas might vary in their baseline risk of thyroid
diseases because of differences in dietary iodine intake and other
unknown factors. Perhaps more important, the rates of detected
disease are based on examinations and depend on the methods and
criteria of the examinations; this produces screening effects that
cannot be readily disentangled to make meaningful comparisons
with disease rates from other geographic regions that did not have
comparable screening.
The HTDS investigators attempted to compare the
number of thyroid cancers that they detected with the number
expected in the general population. They reported that the observed
number of thyroid cancers and the number expected in the general
population were almost identical. To do that, they had to introduce
a factor to account for their study group's having received a
thorough thyroid screening, whereas the general population by and
large has not received one. They chose a screening factor of 3,
which had been reported in a 1985 monograph on radiation-
induced thyroid cancer (NCRP, 1985). But that factor was based
on only indirect evidence: specifically, the prevalence of nodules
found by screening in two studies was multiplied by 0.1 or 0.12 at
various ages because a third study found that about 10-12% of
nodules were malignant, and this result was compared with the
incidence reported in a national survey, which proved to be one-
third as high as the prevalence found in the two screening studies.
That is a weak and questionable basis for choosing a multiplier of
3: there could have been unaccounted-for differences among the
studies, and the screenings involved only palpation and not
ultrasonography, as in the HTDS.
More recent studies, which were available but not cited
by the HTDS investigators, have produced different values for a
screening factor. For example, the study of atomic-bomb survivors,
which at different times involved only palpation or palpation plus
ultrasonography, produced a screening factor of 2.5 (Thompson
and others, 1994). A study in Chicago with a sensitive screening
technique produced screening factors of about 7 for thyroid cancer
and 17 for thyroid nodules (Ron and others, 1992). The
discrepancies in those values indicate that there is a great deal of
uncertainty in the appropriate size of the screening factor, and the
different values could allow one to conclude that those residing
near Hanford had anywhere from a large deficit to a small excess
of thyroid cancer. Hence, there is no unambiguous answer.
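The sensitivity of the comparison to the screening factor is easy to see numerically. In the sketch below the observed count and the unscreened expected count are hypothetical, chosen so that a factor of 3 gives a ratio of 1; only the three screening factors for thyroid cancer come from the studies cited above.

```python
def observed_to_expected(observed, expected_unscreened, screening_factor):
    """Ratio of observed cases to the screening-adjusted expected number."""
    return observed / (expected_unscreened * screening_factor)

# Hypothetical counts; screening factors 2.5, 3, and 7 are from the text.
observed, expected_unscreened = 18, 6.0
ratios = {
    factor: observed_to_expected(observed, expected_unscreened, factor)
    for factor in (2.5, 3.0, 7.0)
}
for factor, ratio in ratios.items():
    print(f"screening factor {factor:>3}: observed/expected = {ratio:.2f}")
```

With these illustrative counts the same observed number reads as a 20% excess, an exact match, or a deficit of more than half, depending solely on which published factor is adopted.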
The HTDS Draft Final Report does not indicate any
attempt to compare the HTDS thyroid-nodule prevalence with that
found in unirradiated populations. Reported prevalence rates in
unirradiated groups are available from about a dozen studies in the
literature, so, in principle, it is possible to do, although again there
would be a question about comparability with respect to screening
intensity.
In summary, in the subcommittee's view, conclusions drawn
from comparisons with general-population prevalence would
probably have more uncertainty than those drawn from dose-
response comparisons in the study population, so the HTDS
investigators rightly chose to emphasize the internal comparisons
rather than general-population comparisons.
ANALYSES OF SOURCE OF PERSONAL-EXPOSURE INFORMATION
One major component of the determination of
individual 131I exposures was the milk-drinking habits of the study
subjects. An attempt was made to interview a parent of each
subject or other knowledgeable surrogate to obtain recollections of
the milk consumption of the subject in childhood in terms of
quantity and sources of milk at various ages. However, for 38% of
the subjects it was not possible to interview a parent or surrogate,
in which cases default assumptions were used in calculating
thyroid dose. The defaults that the CIDER model used proved to
result in considerable overestimates of the average dose derived
from the reported milk consumption and sources. Specifically, the
doses using default values were 40% higher than the average dose
of those interviewed. For the critical group who were infants in
1945-1946, the discrepancy was even greater: the doses using
default values were 77% higher than the average of interview-
estimated ones.
A table showing mean doses by amount of milk
consumption in a given geostratum would be illuminating in
indicating the degree to which dose variations were driven by milk
consumption versus geographic location. That is important for
understanding the degree to which the study's negative results
might have occurred because of lack of reliability or validity in the
reported milk-consumption rate. If a large fraction of the variation
in dose is attributable to milk-consumption variation, the random-
error component of the dose estimates is probably large,
considering that Dwyer and others (1989) found a correlation of
only 0.3 between contemporaneous reports and long-term recall of
milk-drinking habits; this implies that one would not be likely to
detect a dose-response relationship. A similar table giving mean
doses by source of milk information (interview versus defaults)
and geostratum would also be informative.
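The kind of table being requested can be produced with a simple cross-tabulation. The sketch below is illustrative only: the geostratum labels, milk-consumption categories, and doses are invented, and the record layout is an assumption.

```python
from collections import defaultdict

def mean_dose_table(records):
    """Mean dose for each (geostratum, milk category) cell.

    records: iterable of (geostratum, milk_category, dose)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for geostratum, milk_category, dose in records:
        key = (geostratum, milk_category)
        sums[key] += dose
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Invented records: (geostratum, milk consumption category, dose in mGy).
records = [
    ("A", "low", 10.0), ("A", "low", 20.0), ("A", "high", 60.0),
    ("B", "low", 2.0), ("B", "high", 4.0), ("B", "high", 8.0),
]
table = mean_dose_table(records)
for (geostratum, milk), mean_dose in sorted(table.items()):
    print(f"geostratum {geostratum}  milk {milk:<4}  mean dose {mean_dose:5.1f}")
```

Replacing the milk category with the source of the milk information (interview versus default) gives the second table suggested above.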
Analyses that take into account the source of milk
information are needed. The HTDS investigators did perform
secondary analyses that used defaults for those without interviews
on the basis of average reported quantity and sources of milk, and
they indicated no association, but actual results were not presented.
A useful analysis would examine associations using only those
with interview information so as to yield results that minimize dose
misclassification. Chapter 6 of this report describes the effect of
milk-consumption measurement error on the statistical power of
the study.