Appendix C
Primer on Bayesian Method
The Bayesian approach to statistical inference allows scientists to use prior information about the probability of a given hypothesis or other pieces of a model and to combine it with observed data to arrive at a “posterior”—the probability of the hypothesis given the observed data and our prior information. A simple illustration is HIV testing. Suppose that the hypothesis is about whether John Smith is infected with HIV. And suppose that the evidence is whether a new blood test comes out positive or negative. The abbreviations are as follows:
H = John Smith has HIV
~H = John Smith does not have HIV
E = blood test for John Smith is positive
~E = blood test for John Smith is negative
One wants to determine the probability that John Smith has HIV after receiving the results of a blood test. Suppose that the test is 95% reliable; this means that among those who have HIV, the test will read positive 95% of the time, which can be represented as P(E | H) = 0.95. And suppose that the false-positive rate is tiny (only 1%). That is, among those who do not have HIV, the test will read positive 1% of the time, which can be represented as P(E | ~H) = 0.01.
Now suppose that John Smith is routinely screened for HIV with the new blood test and that the test comes back positive. After being informed of the result, he panics because he imagines that he has a 95% or 99% chance of having HIV. That conclusion is not correct. The fundamental theorem, attributed to the Reverend Bayes in the 18th century, is simple in this case to state as follows:

P(H | E) = P(E | H) × P(H) / P(E)

The equation states that the probability of the hypothesis H given the evidence E (the posterior) is equal to the product of the probability of the evidence E given that H is true (the likelihood) and the probability of H before any evidence is provided (the prior), divided by the probability of E. To avoid computing P(E), scientists sometimes consider the ratio of the posteriors of H and ~H after E is seen, which in this case is as follows:

P(H | E) / P(~H | E) = [P(E | H) × P(H)] / [P(E | ~H) × P(~H)] = [0.95 × P(H)] / [0.01 × P(~H)]

Suppose that, before seeing a blood test, one had no idea whether John Smith had HIV and translated one's ignorance into a 50–50 probability by saying P(H) = P(~H) = 0.5. Then the ratio
above would equal 95, so P(H | E) = 0.9896, which means that John Smith probably has HIV.1 But suppose that instead of saying that John Smith has a 50% chance of having HIV before one sees a test, one assesses his prior probability of having HIV as the frequency of HIV in people of his age, sex, sexual habits, and drug habits. If John Smith is 30 years old, a middle-class American, heterosexual, and monogamous and does not use any illicit drugs that require needles, his prior might be the frequency of HIV in that group, which might be as low as 1 in 10,000. In this problem, that frequency is referred to as the base rate. If we use P(H) = 0.0001, the posterior looks much different:
P(H | E) / P(~H | E) = (0.95 × 0.0001) / (0.01 × 0.9999) = 0.0095
in which case P(H | E) = 0.0094. Thus, with a base rate of 1 in 10,000, John Smith has less than a 1% chance of having HIV, even though his blood test was positive and the test is a highly reliable one. In that case, the Bayesian approach allows one to incorporate base rates and test reliability easily into a calculation of what one actually cares about: the probability of having HIV after getting a test result.
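The two calculations above can be reproduced with a few lines of arithmetic. The sketch below is illustrative (the function name and structure are my own, not from the text); it applies Bayes' theorem directly rather than the odds-ratio shortcut, so P(E) is computed by the law of total probability.

```python
def posterior(prior, sensitivity=0.95, false_positive=0.01):
    """P(H | E) for a positive test, via Bayes' theorem.

    P(H | E) = P(E | H) P(H) / [P(E | H) P(H) + P(E | ~H) P(~H)]
    """
    p_e = sensitivity * prior + false_positive * (1 - prior)  # total probability of a positive test
    return sensitivity * prior / p_e

# With an ignorance prior of 0.5, a positive test is almost conclusive.
print(round(posterior(0.5), 4))     # 0.9896
# With a base rate of 1 in 10,000, the same positive test leaves P(H | E) below 1%.
print(round(posterior(0.0001), 4))  # 0.0094
```

The second call makes the base-rate point concrete: the likelihood terms are identical in both calls, and only the prior changes.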
In more general settings, the Bayesian approach can be used to transfer prior knowledge in
one part of a model effectively into posterior knowledge in another part of the model of interest.
For example, suppose that the basic causal model of the effect of exposure to lead on a child’s
developing brain is as follows:
[Causal diagram: Parental Resourcefulness → Lead Exposure; Parental Resourcefulness → Cognitive Function; Lead Exposure →(β) Cognitive Function]
In this model, β, the parameter of interest, represents the size of the effect of lead on cognitive function.2 β can be estimated from the observed association between lead exposure and cognitive function after adjusting for parental resourcefulness. One problem, especially if one needs to be able to detect even a fairly small β statistically, is that one must be able to measure parental resourcefulness precisely and reliably.
Suppose that socioeconomic status (SES), such as the mother’s education and income, is
used to measure parental resourcefulness.
1
This is because P(H | E) + P(~H | E) = 1 and P(H | E)/P(~H | E) = 95, which entail that P(H | E) = 0.9896 and P(~H | E) = 0.0104.
2
Prior beliefs about β can be incorporated directly into a Bayesian model that is used to compute one’s
degree of belief about β after seeing data. Prior beliefs about other parts of the model will influence the
posterior degree of belief about β indirectly.
152 Review of EPA’s Integrated Risk Information System (IRIS) Process
[Causal diagram: Parental Resourcefulness → SES; Parental Resourcefulness → Lead Exposure; Parental Resourcefulness → Cognitive Function; Lead Exposure →(β) Cognitive Function]
Then the estimate of β will be biased in proportion to how poorly SES measures the parental resourcefulness relevant to preventing a child from being exposed to lead and to stimulating the child's developing brain. The worse SES is as a measure of parental resourcefulness, the more biased the estimate of β. On a scale of 0–100, where 0 means that SES is just random noise and 100 means that it is a perfect measure of parental resourcefulness, is SES a 95? A 55? A sensitivity analysis would build a table in which the estimate of β is displayed for each possible level of the quality-of-measure scale of SES, making no judgment about which level is more likely. That can be extremely useful because it might reveal, for example, that as long as one assumes that SES is above a 30 on the quality-of-measure scale, the bias in the estimate of β is below 50%.
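A sensitivity table of this kind can be sketched with a small simulation. Everything in the sketch below is a hypothetical choice of mine, not taken from the text: parental resourcefulness R is treated as a latent confounder, SES is a noisy proxy whose squared correlation with R equals the quality q, the effect sizes are arbitrary, and the β estimate comes from least-squares regression of cognitive function on lead exposure and SES.

```python
import random

random.seed(1)

def estimate_beta(quality, n=50_000, beta=-0.2):
    """Estimate the lead effect while adjusting for SES, a proxy with
    Corr(SES, R)^2 = quality for the true confounder R (toy model)."""
    g = random.gauss
    R = [g(0, 1) for _ in range(n)]                       # latent parental resourcefulness
    lead = [-0.5 * r + g(0, 1) for r in R]                # resourcefulness lowers exposure
    cog = [beta * l + 0.8 * r + g(0, 1) for l, r in zip(lead, R)]
    ses = [quality ** 0.5 * r + (1 - quality) ** 0.5 * g(0, 1) for r in R]

    # Two-predictor least squares on centered data (solve the 2x2 normal equations).
    def center(x):
        m = sum(x) / n
        return [v - m for v in x]
    L, S, C = center(lead), center(ses), center(cog)
    sll = sum(l * l for l in L); sss = sum(s * s for s in S)
    sls = sum(l * s for l, s in zip(L, S))
    slc = sum(l * c for l, c in zip(L, C)); ssc = sum(s * c for s, c in zip(S, C))
    det = sll * sss - sls * sls
    return (sss * slc - sls * ssc) / det                  # coefficient on lead exposure

# Sensitivity table: residual confounding shrinks as SES measures R better.
for q in (0.1, 0.3, 0.5, 0.7, 0.9):
    est = estimate_beta(q)
    print(f"quality={q:.1f}  beta_hat={est:+.3f}  bias={est - (-0.2):+.3f}")
```

The table this prints shows the qualitative pattern described above: at low quality the adjusted estimate stays close to the confounded, unadjusted association, and it approaches the true β only as the proxy quality rises.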
Perhaps it is not known precisely where SES sits on a quality-of-measure scale, but one's best guess is that it is 70, and one is pretty sure that it is between 50 and 90. Then a Bayesian analysis can incorporate this prior information into a posterior over β. For example, after eliciting information on the amount of measurement error in SES, one can conduct a Bayesian analysis of the size of β that might produce the plot below. The x-axis shows the size of β (which in a simple linear model is the size of the IQ drop that one would expect in a 6-year-old after an exposure to enough lead to increase blood lead by 1 µg/dL), and the y-axis is the posterior probability of β, given our prior knowledge and the data that have been measured in 6-year-olds.
[Figure: posterior distribution P(β) over the size of the lead effect β]
As can be seen, the modal value of β in the posterior is somewhere around 0.2. The spread of the distribution expresses uncertainty about β. Roughly, it shows that the bulk of the posterior distribution over β lies between about 0.04 and 0.4. If β is in fact 0.2, increasing a child's exposure to lead by an amount that would produce a 20-µg/dL increase in blood lead concentration would cause an expected drop in IQ of 4 points.3
In an IRIS assessment, the analogue of β is any parameter that expresses something about the dose–response relationship in humans. Prior knowledge that a Bayesian analysis might incorporate includes
• The degree to which animal data on rodents are relevant to humans.
• The degree to which mechanistic information informs the dose–response relationship in humans.
• The amount of confounding that might still be unmeasured in epidemiologic studies.
• The quality of the measures of exposure in epidemiologic studies.
Although prior elicitation is important for choosing good informative priors, in some situations, particularly if data are sufficient, moderately informative or even noninformative priors might suffice. The major danger with Bayesian models for meta-analysis comes in specifying the prior distribution for the between-study variance, because information about this parameter is limited by the number of studies available, not by the size of each study. Typical noninformative priors do not work well, and some care must be taken to choose one that is sufficiently informative. Enough is often known to establish reasonably loose bounds that enable estimation, although sensitivity analyses that check how much the final answer is affected by the choice of prior are still necessary.
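The point about the between-study variance can be illustrated with a toy random-effects meta-analysis. Everything below is a hypothetical sketch of my own (the three study results, the grid bounds, and the half-normal scale are illustrative choices, not from the text): each study estimate y_i is modeled as N(mu, s_i² + tau²), and the posterior over the between-study standard deviation tau is computed on a grid under a flat prior and under a weakly informative half-normal prior.

```python
import math

# Hypothetical study estimates and their within-study standard errors.
y = [0.30, 0.10, 0.45]
s = [0.15, 0.20, 0.10]

MUS  = [i / 100 for i in range(-50, 101)]  # grid over the mean effect mu
TAUS = [i / 100 for i in range(0, 101)]    # grid over between-study sd tau

def tau_posterior(tau_prior):
    """Posterior over tau on the grid, marginalizing mu (flat prior on mu)."""
    post = []
    for tau in TAUS:
        like = 0.0
        for mu in MUS:  # sum the likelihood over the mu grid at this tau
            ll = 0.0
            for yi, si in zip(y, s):
                var = si * si + tau * tau
                ll += -0.5 * math.log(2 * math.pi * var) - (yi - mu) ** 2 / (2 * var)
            like += math.exp(ll)
        post.append(like * tau_prior(tau))
    z = sum(post)
    return [p / z for p in post]

flat = tau_posterior(lambda tau: 1.0)                                # "noninformative" prior
half_normal = tau_posterior(lambda tau: math.exp(-tau * tau / (2 * 0.25 ** 2)))  # scale 0.25

mean = lambda post: sum(t * p for t, p in zip(TAUS, post))
print(f"posterior mean tau, flat prior:        {mean(flat):.3f}")
print(f"posterior mean tau, half-normal prior: {mean(half_normal):.3f}")
```

With only three studies, the data say little about tau, so the posterior follows the prior closely in the tail: the flat prior leaves substantial mass at large tau, while the half-normal prior pulls the posterior toward smaller values. That prior sensitivity is exactly why the choice deserves care and a sensitivity analysis.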
3
Children now often test at around 3–5 µg/dL, but children in the 1970s, who were often exposed to lead paint and to air with a lot of lead from leaded gasoline, often tested at 20–30 µg/dL.