| Copyright © 2009. National Academy of Sciences. All rights reserved. Terms of Use and Privacy Statement |
On the Accuracy of Estimates of Numbers of
Intravenous Drug Users
Bruce D. Spencer
The purpose of this paper is to assess the accuracy of estimates of the
numbers of intravenous drug users (IVDUs) that were published in
the November 30, 1987 report "A Review of Current Knowledge and
Plans for Expansion of HIV Surveillance Activities" (hereafter, the
"Review"), submitted by the Centers for Disease Control (CDC) to
the Domestic Policy Council.1 The purpose is not to develop
alternative estimates but rather to see whether the published estimates are
accurate enough to be relied upon for estimating and forecasting the
extent of human immunodeficiency virus (HIV) infection. As will be
made clear, the published estimates are fraught with problems. The
accuracy of the estimates is not ascertainable by objective means
because the estimates are based largely on guesses. In the author's
judgment, the estimates could well be off by a factor of 2, in either
direction.
Although this paper will appear to be highly critical of the
ways that various researchers and agency personnel have estimated
numbers of IVDUs, appreciation and respect are owed to researchers
for producing the estimates that we have of numbers of IVDUs.
Considerable ingenuity has been used by a number of researchers
The author is from the Department of Statistics, Northwestern University, and the
Methodology Research Center, NORC. He is grateful for helpful comments from Lincoln
Moses, John Newmeyer, I. Richard Savage, and James Schmeidler.
1The report later appeared as the supplement to the December 18, 1987 Morbidity and
Mortality Weekly Report, 36, No. S-6.
in attacking this difficult estimation problem. This paper does not
say that the estimates should have been produced by alternative
procedures; indeed, few constructive suggestions are made. Rather,
it concludes that the estimates are simply highly inaccurate and form
a weak basis for any policy or program decisions.
Before proceeding to discuss the estimates themselves, the reader
should understand the statistical concept of accuracy. The following
discussion is intended to elucidate, rather than confuse; the reader
who wishes to skip it should proceed to the next section.
Let X refer to the published estimate of the number of IVDUs
and let θ refer to the true number of IVDUs, that is, the number to
be estimated. One may think of X as fixed and known, since it is
published, and of θ as fixed but unknown. The error in X, namely ε,
is the difference between X and θ; ε = X − θ. When the accuracy of
X is referred to, a statement is being made about beliefs concerning
the magnitude of ε. Except on rare occasions when θ is known
with a much smaller error than ε, the value of ε is uncertain. This
uncertainty will be represented by a probability distribution for ε.
The key features of the distribution are its mean, which measures the
bias or systematic error in X, and its variance, which measures the
unreliability of X. The bias may be decomposed into the sum of two
component biases, one arising because X is designed to measure the
wrong concept (i.e., the wrong definition of IVDU underlies X) and
the other arising from systematic error in the measurement process.
Summary measures of accuracy may be developed in various ways,
but the most common is root-mean-square error (RMSE), the square
root of the sum of the variance and the squared bias.
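As a minimal illustration of how RMSE combines the two components, the following Python sketch computes it from a bias and a variance. The function name and the numbers are hypothetical and are not taken from the "Review."

```python
import math


def rmse(bias: float, variance: float) -> float:
    """Root-mean-square error: square root of (variance + bias squared)."""
    if variance < 0:
        raise ValueError("variance must be non-negative")
    return math.sqrt(variance + bias ** 2)


# Hypothetical numbers, for illustration only: an estimate whose error has
# a bias of 300,000 and a standard deviation of 400,000.
example = rmse(bias=300_000, variance=400_000 ** 2)
print(round(example))  # 500000
```

In this hypothetical case neither the bias nor the standard deviation alone conveys the size of the typical error; the RMSE of 500,000 reflects both.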
Two alternative interpretations of probability will now be discussed.
Under one interpretation, the "relative frequency" (RF) interpretation,
the estimate X is viewed as a realization of a stochastic
process; for example, if X is based on the results of randomly selected
samples, then the randomization inherent in the sampling gives rise
to a probability distribution for X (i.e., different samples would give
rise to different estimates of X). Other, nonsampling bases for error
in X can also be endowed with a randomization distribution; for
example, response errors in surveys can be viewed as occurring at
random (perhaps with bias as well as variance). For another example
(see "Newmeyer's Ratio Estimate"), consider ratio estimators X of
the form X = ÂY, where Y is the number of reports of drug-related
emergency room episodes reported in the Drug Abuse Warning Network
(DAWN) and Â is an estimate of A, with A defined as the ratio
of θ to the expected value of Y. Here Y is random and Â has error
when viewed as an estimate of A, and the latter error and Y may be
correlated.
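The two sources of error in such a ratio estimate can be separated algebraically: since θ equals A times the expected value of Y, the error X − θ equals (Â − A)Y + A(Y − E(Y)). The Python sketch below checks this identity numerically. All values are hypothetical and the function names are illustrative only.

```python
def ratio_estimate(a_hat, y):
    """Ratio estimate X = a_hat * y."""
    return a_hat * y


def error_components(a_hat, a_true, y, expected_y):
    """Split X - theta into a multiplier-error part and a count-fluctuation part.

    theta is taken to be a_true * expected_y, following the definition of A
    as the ratio of theta to the expected value of Y.
    """
    multiplier_part = (a_hat - a_true) * y
    count_part = a_true * (y - expected_y)
    return multiplier_part, count_part


# Hypothetical values, for illustration only.
a_hat, a_true = 12, 10          # estimated and true multipliers
y, expected_y = 1_100, 1_000    # observed and expected DAWN episode counts

x = ratio_estimate(a_hat, y)    # 13,200
theta = a_true * expected_y     # 10,000
parts = error_components(a_hat, a_true, y, expected_y)
print(x - theta, parts)  # 3200 (2200, 1000)
```

The first component depends on both the multiplier error and the realized count Y, which is one way to see why the two may be correlated.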
Not all estimates X are conducive to the RF interpretation; for
example, some estimates of X are called "informed guesstimates"
(never uninformed guesstimates, though) and are not produced by a
repeatable process. In such cases, a relevant notion of probability is
based on the view that a person's beliefs about θ may be represented
by a probability distribution; this interpretation will be referred to
as a "personalistic Bayesian" (PB) perspective. The PB viewpoint
explicitly incorporates subjective opinion, whereas the RF viewpoint
does not. However, application of the RF model inevitably uses
subjective opinion in the specification of the randomization model
to be used to assess the mean and variance of the distribution. In
either case, RF or PB, the problem of whose opinion arises. In this
paper, the author's opinion is represented. Coming from a nonexpert
in the area of drug abuse, these opinions represent those of experts
whose views have been encountered as well as a sufficient degree
of skepticism acquired from experience in assessing the accuracy of
demographic statistics and from watching experts be overconfident
about their expertise (see also Mosteller, 1977, in this regard).
WHAT IS THE DEFINITION OF INTRAVENOUS
DRUG USER?
The first problem with estimates of the number of intravenous drug
users is the definition of IVDU. Many IVDUs shoot opioids, principally
heroin, and many do not. W. R. Lange (personal communication,
1988) believes that, at least in some communities such as
Baltimore, "approximately 20% of IVDUs are primary cocaine and
amphetamine users who do not on a regular basis abuse opioids
intravenously." Many varieties of patterns of needle use exist. Some
persons inject drugs subcutaneously ("skin poppers") and do not
even use needles; although they are not IVDUs if they do not use
needles, they are at risk of HIV infection to the extent that they
share injection equipment with drug users who are infected with
HIV. Some IVDUs are hard-core addicts who inject drugs multiple
times each day; other IVDUs are successful upper middle class users,
who inject less frequently. Other IVDUs inject multiple times daily
for a few months and then stop; still others inject only a few times
per year. Some IVDUs have been shooting for a long time; others,
only recently. Whether an IVDU did most injecting prior to 1975 or
in a later year will greatly affect the risk of HIV infection.
Gerstein (1976) distinguishes four types of heroin-using IVDUs.
The first two types are hard-core users who are "strung out"; the
third type consists of "people with dual identities . . . who may be
strung out, but often at low levels.... These are people whose status
as members of the heroin community is often secret, hidden from a
whole set of close associations: family and friends. These may or
may not be identified on the street as addicts, depending on their
mode of scoring and the stage of their careers." The fourth type is
IVDUs who are not usually strung out on heroin. "These may be
'chippies', 'weekenders', or 'sandbox users' as opposed to 'boot-and-shoe
addicts', 'stone junkies', 'dope fiends', and 'real hypes'. They
may or may not score directly on the street.... Sometimes these
people are situational heroin users, the current lover of a strung-out
user, for example." Gerstein believes, based on field experience
and study of the epidemiological literature on heroin, that there are
about two or three type 4 IVDUs for every type 1-3 IVDU (Gerstein,
1976).
What is the right choice of definition? For estimating the number
of seropositive persons in the United States, the precise definition is
largely irrelevant as long as it is consistent with other definitions
being employed.
For example, Table 14 in the "Review" estimates the total number
of infected people in eight (presumably nonoverlapping) population
subgroups:
1. exclusively homosexual,
2. other homosexual,
3. regular IVDU,
4. occasional IVDU,
5. hemophiliac A,
6. hemophiliac B,
7. heterosexuals without specific identified risks, and
8. others.
The estimates of total infected persons in each group are derived by
multiplying the number of persons in the group by the seropositivity
rate and summing. Thus, if $R_j$ is the seropositivity rate for group $j$
and if $N_j$ is the size of group $j$, the total number of infected persons
in the population is
$$R_1 N_1 + R_2 N_2 + R_3 N_3 + R_4 N_4 + \cdots + R_8 N_8.$$
If group 4 were split into two groups, say 4a and 4b, with rates $R_{4a}$
and $R_{4b}$ and sizes $N_{4a}$ and $N_{4b}$, then the total number of infected
persons in the population may be expressed as
$$R_1 N_1 + R_2 N_2 + R_3 N_3 + R_{4a} N_{4a} + R_{4b} N_{4b} + \cdots + R_8 N_8,$$
which equals the previous total because $R_4 = (R_{4a} N_{4a} + R_{4b} N_{4b})/N_4$
and $N_4 = N_{4a} + N_{4b}$. Thus, changing the definitions or compositions of
the groups does not affect the estimate of the total number infected as
long as the group sizes $N_j$ and seropositivity rates $R_j$ are appropriate
to the group as defined. (Of course the groups should comprise the
whole population and be nonoverlapping.)
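The invariance argument can be checked numerically. The following Python sketch uses exact fractions and hypothetical group sizes and rates (none are taken from Table 14 of the "Review") to show that splitting one group leaves the total unchanged, provided the rate for the merged group is the size-weighted average of the rates for its parts.

```python
from fractions import Fraction


def total_infected(groups):
    """Sum of rate * size over (rate, size) pairs."""
    return sum(rate * size for rate, size in groups)


# Hypothetical subgroups 4a and 4b of an "occasional IVDU" group.
r4a, n4a = Fraction(11, 100), 50_000
r4b, n4b = Fraction(3, 100), 150_000

# Size and rate of the merged group, as in the text.
n4 = n4a + n4b
r4 = (r4a * n4a + r4b * n4b) / n4

# One other hypothetical group stands in for the remaining groups.
others = [(Fraction(25, 100), 900_000)]

merged = total_infected(others + [(r4, n4)])
split = total_infected(others + [(r4a, n4a), (r4b, n4b)])
print(merged == split)  # True
```

The equality fails if the merged rate is taken from a differently defined population than the merged size, which is the practical concern raised in the sections that follow.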
This is not to say that for other purposes the definition is
irrelevant. For example, for studying transmission of the infection,
one would wish to estimate seroprevalence for IVDUs classified by
frequency of injection, how long they had been shooting, whether
they shared needles and with whom, if and how long they had been
infected, the stage of the infection, and so on. In this case, the
definition should be focused so that the groups of IVDUs are homogeneous.
Thus, the definition of IVDU might well exclude or otherwise
distinguish chippies from other users. The definition implicitly used by
Newmeyer (1988) in his method (described later) has this feature.
WHAT DEFINITIONS ARE USED IN
THE "REVIEW"?
The "Review" distinguishes two types of IVDUs: regular users who
inject at least weekly and less frequent or occasional users who inject
less often than regular users but have used drugs more than once
or twice. Two sets of national estimates were produced for each
type of IVDU, the original Public Health Service estimates for 1986
("Review," Table 13) and the revised estimates ("Review," Table 14)
for 1987. These are listed in Table 1.
TABLE 1 Public Health Service (1986, 1987) Estimates of HIV
Prevalence Among IV Drug Users

Type of                    Estimated    Approximate          Total Number
IV Drug User               Number       Seroprevalence (%)   Infected
Original estimate (1986)
  Regular                    750,000    30                   225,000
  Occasional                 750,000    10                    75,000
  Total                    1,500,000                         300,000
Revised estimate (1987)
  Regular                    900,000    25                   225,000
  Occasional                 200,000     5                    10,000
  Total                    1,100,000                         235,000
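The arithmetic in Table 1 can be checked directly. The Python sketch below, with names chosen only for illustration, recomputes the number infected in each row as the estimated number times the seroprevalence and then sums the rows.

```python
# (type of user, estimated number, seroprevalence in percent, published number infected)
TABLE_1 = {
    "original (1986)": [("regular", 750_000, 30, 225_000),
                        ("occasional", 750_000, 10, 75_000)],
    "revised (1987)": [("regular", 900_000, 25, 225_000),
                       ("occasional", 200_000, 5, 10_000)],
}


def infected(number, percent):
    """Number infected implied by a group size and a whole-percent rate."""
    return number * percent // 100


for label, rows in TABLE_1.items():
    total_users = sum(n for _, n, _, _ in rows)
    total_infected = sum(infected(n, p) for _, n, p, _ in rows)
    print(label, total_users, total_infected)
```

Each published row agrees with the product of its two inputs, so any inaccuracy in the totals comes from the inputs themselves rather than from the arithmetic.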
In assessing these estimates, critical questions concern what
population definitions were actually used in estimating the total number
of each type of IVDU and in estimating the seroprevalence rates.
How different are the operational definitions? How accurate are the
estimates for the populations as operationally defined?
WHAT POPULATION DEFINITION WAS USED
FOR ESTIMATING SEROPREVALENCE?
Seroprevalence rates were estimated by a weighted average of
seroprevalence rates for each state, with the weights proportional to the
estimated number of IVDUs in the state (discussed later). Seroprevalence
estimates for a state were based on rates observed in specific
locations, primarily from data obtained from intravenous (IV) drug
abuse treatment facilities in major cities (Dougherty, 1987); presumably
these rates were not simply averaged across the entire state. The
data were mostly gathered in 1986 and 1987, but some were gathered
in 1984 and 1985. Because seroprevalence rates may change rapidly
upward in only a year (as seems to have occurred in New York and
Edinburgh), even the 1986 data are out of date for estimating
seroprevalence in 1987, let alone 1988. The effect of outdated data on
local seroprevalence rates may be to bias the estimates downward to
an unknown but possibly appreciable extent.
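The weighting scheme described above can be sketched in a few lines of Python. The state figures below are hypothetical and the function name is illustrative; the sketch only shows the form of the calculation, not the data actually used.

```python
def weighted_seroprevalence(states):
    """Weighted average of state rates, weights proportional to IVDU counts.

    `states` is a list of (estimated_ivdus, seroprevalence_rate) pairs.
    """
    total_ivdus = sum(n for n, _ in states)
    if total_ivdus <= 0:
        raise ValueError("total number of IVDUs must be positive")
    return sum(n * rate for n, rate in states) / total_ivdus


# Three hypothetical states: (estimated number of IVDUs, seroprevalence rate).
hypothetical_states = [(200_000, 0.50), (100_000, 0.10), (100_000, 0.05)]
national_rate = weighted_seroprevalence(hypothetical_states)
print(round(national_rate, 4))  # 0.2875
```

Because the weights are the estimated numbers of IVDUs, errors in those numbers carry over into the national seroprevalence rate as well as into the totals.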
John Dougherty has suggested that the rates, which "are from
the few drug abuse treatment facilities in which planned studies have
been conducted," may be underestimates. He noted that many "drug
treatment professionals report that IV drug users more likely to be
infected with HIV (i.e., engage in more risky behaviors) tend not
to seek treatment at drug centers where HIV testing is known to
occur but prefer programs in which there is not testing." However,
the extent of the avoidance of such treatment programs depends on
how much real choice a user has of treatment program and whether
testing is optional. Dougherty (1987) also notes that "several other
lines of evidence indicate that seroprevalence rates are higher for drug
users not receiving treatment than for those in treatment programs"
(although this author has not seen such evidence). Furthermore, the
drug treatment programs include many persons who no longer use
needles to inject drugs. At first blush, these considerations would
seem to indicate that the estimated seroprevalence rates would be
low estimates (for the cities in which they were based). However
(even if the judgment of drug treatment professionals is accepted for
the moment), merely because the IV drug users more likely to be
infected tend not to seek treatment at the centers in the studies does
not mean that the rates are underestimated, since the chippies and
other occasional users (who may have very low rates of infection)
might be even more underrepresented in the studies. Whether the
rates are underestimated due to underrepresentation of the whole
IV drug-using population or not depends on who should be included
in the denominator of the rate, and that, as noted earlier, depends
on the inclusiveness of the definition of IVDU. The suggestion of
Newmeyer (1988) that the definition of IVDU exclude chippies has
merit; it is important, however, that they be excluded both from
the defined population of IVDUs and from the tested groups used to
estimate seroprevalence of IVDUs.
The kinds of settings in which the seroprevalence estimates for
IVDUs were made include methadone maintenance programs, drug-free
treatment programs, detoxification programs, the "street"
(seroprevalence testing performed in addicts not currently in treatment
programs), and other kinds of drug treatment programs. The clinics
and treatment programs were not selected at random, nor were members
in the programs selected at random. How well do the populations
"sampled" match the population definitions used for developing
numbers of regular and occasional IVDUs? This is not known, but
there is no reason why they should be close; it has been impossible
to pin down any expert with an opinion on this question. As shown
in the following section, the population definitions used for estimates
of numbers of IVDUs are exceedingly vague regarding frequency of
injection. How exactly were the seroprevalence estimates developed
for each state and how were the differences between seroprevalence
rates for regular and occasional users estimated? Again, we do not
know.
HOW WERE NUMBERS OF IVDUs
ESTIMATED?
Overview
The revised estimate of 1.1 million IVDUs was chosen by the CDC
from among several alternative estimates available at the time. The
alternative methods of estimation underlying the numbers in the
"Review" are described in the next three sections. These methods are
based on aggregation of state agencies' estimates, on national-level
estimates for different kinds of drug abusers, and on the assumption
(Lange's approach) that the proportions of the adult population in
large cities who are IVDUs are constant within classes of cities. The
Newmeyer ratio estimate represents an alternative method that was
not considered by the CDC in choosing its estimate.
Three general kinds of techniques are used for estimating the
numbers of IVDUs. Terminology varies somewhat, but the practice of
the National Association of State Alcohol and Drug Abuse Directors
(NASADAD) will be used (Butynski et al., 1987:41-42) to classify
them as "direct," "indirect," or "informed guesstimates." Direct
estimates may be based on the National Household Survey on Drug
Abuse (NHS) run by the National Institute on Drug Abuse (NIDA),
on dual-systems estimates, or on backward extrapolation. Indirect
estimates are based on a fitted regression model that attempts to
relate indicator data such as number of burglaries or heroin-related
deaths to prevalence of IVDUs. Estimates of numbers of IVDUs
are obtained by substituting observed (or predicted) values of the
indicators into the regression model. Note that the estimation of the
regression coefficients depends on the availability of direct estimates.
Finally, so-called informed guesstimates are produced by one or more
people looking at any available indicators or other correlates of IVDU
prevalence and making a loose guess about the number of IVDUs.
Both the indirect estimation and the informed guesstimation are
dependent on direct estimates somewhere along the line. Thus, the
accuracy of any estimates of the number of IVDUs can be no greater
than the accuracy of the direct estimates. A brief description of the
three direct estimation methods is now provided.
The NHS attempts to measure drug use and prevalence in the
general household population aged 12 and over. The survey excludes
by design persons living in transient households or in institutions
(including university dormitories and prisons), persons in the military,
and persons with no fixed residence. Many heroin users will,
therefore, not be covered by the survey. Also, even within the population
not excluded, the survey suffers from biases due to nonresponse and
underreporting of drug use. Thus, cocaine and amphetamine users
may be covered by the survey but may decline to report themselves
as drug abusers. The extent of bias in the survey results is unknown
but potentially enormous.2
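The illustration given in footnote 2 (30 percent of drug abusers not covered, and only one in two of those covered admitting to drug abuse) can be reproduced with the short Python sketch below. The function name and the survey count of 350,000 are hypothetical and serve only to show how coverage and underreporting compound.

```python
from fractions import Fraction


def reported_share(coverage, admission_rate):
    """Fraction of true drug abusers who end up counted by the survey."""
    return coverage * admission_rate


# Footnote 2's illustration: 70 percent covered, one in two admitting.
share = reported_share(coverage=Fraction(70, 100), admission_rate=Fraction(1, 2))
print(float(share))  # 0.35

# Inverting the same relation: a survey count divided by the share gives
# the implied true number (the count of 350,000 is hypothetical).
implied_true = 350_000 / share
print(implied_true)  # 1000000
```

The inversion is only as good as the assumed coverage and admission rates, neither of which is known for IVDUs.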
Back-extrapolation methods use data on AIDS deaths and HIV
2For example, the decennial census is believed to miss as many as 15 percent of all
black males aged 20-44; the percentage of IVDUs missed by the census is surely much
higher. If 30 percent of the drug abusers are not covered by the census, and if only one
in two drug abusers surveyed would admit being a drug abuser, then the number of drug
abusers reported by the survey will tend to be only 35 percent of the actual number.
prevalence to estimate the number of IVDUs. Based on seroprevalence
studies and on models of disease progression, estimates of the
probability that an IVDU will have had AIDS are represented, for
example, by P. Then estimates of the number of AIDS cases among
IVDUs are developed, N. The number of IVDUs is then estimated
by the ratio N/P. Such a technique was used by Newmeyer (1988)
to estimate the number of IVDUs in New York City; his method is
discussed at the end of this section.3
Difficulties with the method are attributable to errors in the
estimates P and N. The estimate P is based on nonrandom samples
subject to severe but unknown selection biases, as well as errors in
the model of disease progression. The estimate N must successfully
correct for underreporting of AIDS deaths. In- or out-migration of
IVDUs from the geographic area of interest is also a problem, but
probably of lesser magnitude. An advantage of this technique is the
consistency in the definition of IVDU, as discussed earlier.
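The sensitivity of the back-extrapolation estimate N/P to error in P can be seen in a short Python sketch. The case count and probabilities below are hypothetical, and the function name is illustrative; this is not Newmeyer's calculation.

```python
def back_extrapolate(aids_cases, prob_aids):
    """Estimate the number of IVDUs as N / P."""
    if not 0 < prob_aids <= 1:
        raise ValueError("prob_aids must be in (0, 1]")
    return aids_cases / prob_aids


# Hypothetical inputs: 3,000 AIDS cases among IVDUs and three candidate
# values for the probability that an IVDU will have had AIDS.
n_cases = 3_000
for p in (0.01, 0.015, 0.02):
    print(p, round(back_extrapolate(n_cases, p)))
```

Because P appears in the denominator, halving it doubles the estimate: in this hypothetical case, moving P from 0.02 to 0.01 moves the estimate from 150,000 to 300,000.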
Dual-systems estimates, also known as tag-recapture estimates,
are perhaps the most widely used direct estimates. One begins with
two lists of IVDUs, for example, a list of persons in heroin treatment
programs and a list of persons treated in emergency rooms for adverse
reactions to heroin abuse. If there are a people on both lists, b
people on the first list but not the second, and c people on the second
list but not the first, then one may estimate the probability P that
an IVDU is on the first list by the fraction of people on the second
list who are also on the first list, P = a/(a + c). The number of IVDUs
who are on the first list is N = a + b. Then the total number of IVDUs
is estimated by N/P, or (a + b)(a + c)/a. A number of problems with
dual-systems estimates have been recognized, both in the drug abuse
literature and in the statistical literature (Wolter, 1986). Problems
include errors in classifying individuals as to membership on one list
3Newmeyer also describes a variation on this method in his unpublished note "Four
Readily Applicable Methods to Estimate Drug Abuse Prevalence" (personal
communication, 1988):
The method is to ascertain from the medical examiner's [coroner's] toxicology
data the relative proportion of decedents with methadone metabolites to
decedents with heroin metabolites (a metabolite is simply a chemical produced
by the body's metabolism of a drug). If this ratio is, say, 1 to 5, and if it is
known that there are 2,000 persons receiving methadone maintenance in the
city, it can be hazarded that there are about 10,000 persons attempting to
maintain themselves on heroin.
Note that the validity of this method rests on an assumption of equal death rates for
methadone and heroin users. For other discussions of the back-extrapolation method,
see Brookmeyer and Gail (1986, 1988) and Medley et al. (1987).
or two, possible multiple appearances of an IVDU on a list, variabilities
in the selection probabilities for different types of individuals
(some IVDUs are more likely to appear on the list than others),
migration of IVDUs in and out of the geographic area between the
times that the two lists are created, and causal effects of being on
one list (enrollment in a treatment program may lessen the chance
of a drug-related emergency room incident). The errors arising from
variability in selection probabilities and causal effects are sometimes
jointly referred to as "correlation bias." Triple-systems estimates are
sometimes advocated (Woodward et al., 1984), but the same basic
problems remain.
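A minimal Python sketch of the dual-systems formula (a + b)(a + c)/a follows, together with a hypothetical example of how variable selection probabilities can bias it. The counts are invented expected values for two subgroups of 1,000 IVDUs each, and the function name is illustrative only.

```python
def dual_system_estimate(both, first_only, second_only):
    """Dual-systems estimate (a + b)(a + c) / a of the total population."""
    if both <= 0:
        raise ValueError("the estimate is undefined when no one is on both lists")
    return (both + first_only) * (both + second_only) / both


# Hypothetical lists: a = 200 on both, b = 800 on the first only,
# c = 300 on the second only.
print(dual_system_estimate(200, 800, 300))  # 2500.0

# Two hypothetical subgroups of 1,000 IVDUs each, with the two lists
# independent within each subgroup.
# Subgroup 1: each list captures half      -> expected a = 250, b = 250, c = 250
# Subgroup 2: each list captures one tenth -> expected a = 10,  b = 90,  c = 90
pooled = dual_system_estimate(250 + 10, 250 + 90, 250 + 90)
print(round(pooled))  # 1385
```

Applied to each subgroup separately, the formula returns 1,000 in this example; applied to the pooled counts, it returns about 1,385 rather than 2,000. This is the downward direction of bias that arises when persons likely to be on one list are also likely to be on the other.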
Aggregation of State Agencies' Estimates
State drug abuse authorities submit plans for treating drug abuse
to the Alcohol, Drug Abuse, and Mental Health Administration
(ADAMHA). These plans contain estimates of numbers of IVDUs in
each state. The numbers are developed by the states from a variety
of sources, including surveys, treatment data, and drug indicator
data. It is worth noting that the state agencies compile the numbers
for political purposes, including obtaining block grant funds
from ADAMHA (which would put upward pressure on the numbers)
and informing state legislators (which could put upward or downward
pressure on the numbers, depending on the state). NIDA staff
reviewed the states' plans and interviewed the agency directors in
California, Illinois, Michigan, New Jersey, New York, and Pennsylvania
to confirm the numbers reported in the plans. Those reviews
merely verified that the numbers were reported correctly and not
that the numbers reported were accurate. In this way, the total
number of IVDUs was estimated at 1.28 million.
Also, according to Butynski et al. (1987:41-42), NASADAD
asked each state alcohol and drug agency to:
provide estimates relating to IV drug abuse for Fiscal Year (FY) 1986 for
the total number of client admissions to treatment and total number of IV
drug abusers in the State.... Seventeen (17) states provided data on the
total number of IV drug abusers in the State.4 The highest estimates of IV
drug abusers were provided by New York, California, and Texas, in that
order.... States were also asked to report the basis for their estimates of
4The following states were excluded: Alabama, Alaska, Arizona, Arkansas, Colorado,
Delaware, District of Columbia, Florida, Georgia, Hawaii, Idaho, Indiana, Kansas, Ken-
tucky, Louisiana, Maine, Montana, Nebraska, Nevada, New Mexico, North Carolina,
North Dakota, Ohio, Oklahoma, Oregon, Puerto Rico, South Carolina, South Dakota,
Tennessee, Utah, Vermont, Virginia, Virgin Islands, Washington, West Virginia, and
Wyoming.
the total drug abuser population. The largest number of responding States
[10] reported that their estimates were based upon indirect measures or
indicators. Three States reported that only "guesstimates" were used, and
four States indicated that a combination of methods [i.e., a combination of
a "guesstimate" and another method] was used.
The total number of IVDUs reported for the 17 states was 1.067
million. If the state plan estimates for the other 33 states, Puerto
Rico, the District of Columbia, and the Virgin Islands are included,
the total increases to 1.447 million.
The national estimates described above can be no better than
the state estimates they are based on. Individual states use quite
different methods to estimate the number of IVDUs. Estimation
methods for three states (Illinois, California, and New York) and
three cities (Chicago, Los Angeles, and New York) are now described.
Illinois and Chicago
The estimate for Illinois is based on data from the Client Oriented
Data Acquisition Process (CODAP) and the National Household
Survey on Drug Abuse. CODAP collects limited demographic and
drug-use information about clients in participating federally or state-funded
drug abuse programs. Illinois uses the CODAP and the NHS
data to estimate several components. First, the Illinois Department
of Alcoholism and Substance Abuse uses the CODAP data for all
participating programs to estimate the fraction $F_1$ of narcotics abusers
in Illinois who inject drugs. Second, the proportion $F_2$ of the U.S.
population who are narcotics abusers is estimated from data collected
by the NHS. Finally, the estimate of the number of IVDUs
in Illinois is derived as $P \times F_2 \times F_1$, with $P$ equal to the total
population of Illinois. The method for estimating the number of IVDUs in
Chicago is analogous, with Chicago figures used instead of Illinois
figures (Jerome Gross, personal communication, May 20, 1988). The
downward bias in the NHS-based estimate of $F_2$ makes the estimates
very untrustworthy.
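The structure of the Illinois calculation, and the way a downward bias in the survey-based proportion passes straight through to the result, can be sketched in Python. All inputs below are hypothetical and are not the figures used by the Illinois Department of Alcoholism and Substance Abuse; the function name is illustrative.

```python
def illinois_style_estimate(population, share_narcotics_abusers, share_injecting):
    """Estimate of IVDUs as P * F2 * F1."""
    return population * share_narcotics_abusers * share_injecting


# Hypothetical inputs: P = 11,500,000, F2 = 0.005, F1 = 0.6.
base = illinois_style_estimate(11_500_000, 0.005, 0.6)

# If the survey captured only half of all narcotics abusers, the true
# proportion would be 0.010 rather than 0.005 (again hypothetical).
corrected = illinois_style_estimate(11_500_000, 0.010, 0.6)

print(round(base), round(corrected))  # 34500 69000
```

Because the estimate is a simple product, any proportional bias in the survey-based factor produces the same proportional bias in the estimated number of IVDUs.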
California and Los Angeles
The estimates of the number of IVDUs in California are formed as
the sum of the estimates of numbers of IVDUs shooting amphetamine
and cocaine and those shooting opiates. Apparently, persons who inject
more than one of these drugs are counted more than once. First,
let us consider estimation of amphetamine- and cocaine-shooting
IVDUs. NHS 1983 data from the western region of the country were
used to estimate a prevalence rate for the use of amphetamine and
cocaine. The estimated prevalence rate was multiplied by the size
of the at-risk population to estimate the number of amphetamine
and cocaine users. Data from DAWN and the California Drug Abuse
Data System (CALDADS) as well as "street data" were used to
estimate the fraction of drug abusers who administered the drugs
intravenously; the estimates were subjectively adjusted downward
slightly to account for selection bias. Then the latter figures were
multiplied by the estimated number of amphetamine and cocaine
users to estimate the number of amphetamine and cocaine shoot-
ers in 1985. Those figures were assumed constant at 97,000 for the
years following 1985. As with Illinois, the downward bias of the
NHS prevalence rate subjects the California estimate of number of
amphetamine and cocaine shooters to great downward bias.
The number of persons shooting opiates is based on extrapolation
from a nonlinear regression model fitted to data more than 10
years old. Specifically, a regression was fitted to predict estimated
prevalence rates for fiscal years 1974-1975, 1975-1976, 1976-1977,
and 1977-1978 from indicators including number of drug-induced
deaths, burglaries, new admissions to treatment programs, and number
of hepatitis B cases. The prevalence estimates were based on
work by John Newmeyer.5 A quadratic trend was fitted, which leads
one to believe that the model was overfitted (only four data points
were used to estimate at least three parameters). Rates of prevalence
for future years were predicted from values of the indicator data from
those years. The model was last run on fiscal year 1982-1983 data,
for which the estimate was 102,500. The estimate was subjectively
adjusted to 125,000 for fiscal year 1986-1987; numbers are assumed
constant since then. The extent of extrapolation, overfitting, and
possible weakness in the direct estimates used to calibrate the regres-
sion model makes these estimates untrustworthy. The total estimate
of IVDUs in California is thus 220,000. The number is taken to re-
fer to hard-core users (Susan Nisenbaum, personal communication,
1988).
The estimate of the number of IVDUs in Los Angeles County
is based on multiplication of the number of IVDUs in treatment
by a factor. The factor is about 4 or 5 and is based on examination
of drug indicator data, but explicit models were not used. The
5The manner in which Newmeyer's estimates were prepared is not known, but it is
believed that they were based on dual-systems estimation (Susan Nisenbaum, personal
communication, 1988).
estimate (about 70,000) includes regular users and occasional users;
including users in remission might push the estimate up to 120,000
or more (Donald McAllister, personal communication, 1988). As the
accuracy of the factor is unknown, so too is the accuracy of the
estimate. However, there is no reason to assume that the accuracy is
any higher than that for Chicago.
New York State and New York City
The estimate of the number of IVDUs in New York State is derived as
the ratio of the New York City estimate to the fraction F of narcotic
abusers in the state who are estimated to reside in New York City;
F is estimated from the NHS (Schmeidler and Frank, n.d.). The
current estimate for New York City is 240,000. The accuracy of F
may be reasonably good to the extent that biases in the NHS cancel
out, since F is the ratio of two NHS estimates of narcotic abusers.
The estimate of the number of IVDUs in New York City is based
on extrapolation of past estimates of prevalence. In brief, the past
estimates were based on regression estimation using drug indicators
as predictors, and the regression models were estimated by using
a combination of Narcotics Registry and clual-systems estimates.
The method is complicated because some of the predictor variables
became unavailable and had to be predicted on the basis of other,
available indicators. The models are designed to estimate the number
of heroin addicts, and those estimates are used without modification
to estimate numbers of IVDUs in New York City.
A somewhat more detailed summary of the estimation of the
number of IVDUs in New York City follows (Schmeidler et al., 1978).
The discussion is longer than that for the other cities because the
method is more complicated, more information was available on how
New York City's estimates were prepared, and the estimates of the
number of IVDUs in New York City are especially critical (see the
discussion of Newmeyer's estimate, below).
• The first principal component of 12 indicators of heroin
  use was computed and used as a predictor variable for
  regression estimates of numbers of heroin addicts.
• Values of the "dependent variable" (numbers of heroin
  addicts) for 1970-1974 were provided by the numbers
  of heroin addicts reported in the New York Narcotics
  Register, adjusted for duplication, reporting of drugs
  other than heroin, incarceration, relocation, death,
  nonaddiction, and false inclusion. These adjustments were
  largely guesses (particularly the adjustments for death,
  relocation, and incarceration) and are problematic. The
  numbers were also adjusted6 by dual-systems estimates
  of the numbers of heroin addicts not reported in the
  Register. Absence of data on the extent of drug abuse
  or demographic characteristics for persons in the Register
  meant that no poststratification could be employed
  and correlation bias could be high, leading to downward
  pressure on the estimates. Matching errors (false
  nonmatches) probably occurred, leading to upward pressure
  (see Woodward et al., 1984). Further upward pressure
  is present because the population is not closed and the
  adjustments for out-migration from the population were
  imperfect. Sampling error is also present. In this way,
  estimates of the number of heroin addicts in 1970-1974
  were prepared.
• Then regression models were fitted to predict smoothed
  successive differences of logs of 1970-1974 numbers of
  heroin addicts from the first principal component.
• Estimates for 1975 through 1979 were based directly on
  the regression model.
• The estimates for 1980 and onward are described as
  "guesstimates." For example, the estimate for 1980 was
  taken to be the same as for 1976 because the DAWN
  reports of heroin-related emergency room admissions
  were similar for the two years, and the DAWN indicator
  was believed to be a strong correlate of the number of
  heroin addicts. The estimate for 1981 was based on
  quadratic extrapolation from the estimates for 1977-
  1980.
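Because the regression described above predicts successive differences of logs rather than levels, some back-transformation is needed to arrive at yearly numbers. The sketch below shows one natural way to do that, by accumulating predicted log differences forward from a base-year level. It is an assumption about the mechanics, not a description of the New York procedure, and the base level and predicted differences are hypothetical numbers invented for illustration.

```python
import math


def levels_from_log_differences(base_level, log_diffs):
    """Accumulate predicted differences of logs into level estimates.

    If d_t = log(N_t) - log(N_{t-1}), then N_t = N_{t-1} * exp(d_t).
    """
    levels = []
    current = base_level
    for d in log_diffs:
        current = current * math.exp(d)
        levels.append(current)
    return levels


# Hypothetical inputs: a 1974 base level and predicted log differences
# for 1975-1979 (none of these numbers come from the paper).
base_1974 = 100_000
predicted_diffs = [0.05, 0.02, -0.01, -0.03, 0.00]
estimates = levels_from_log_differences(base_1974, predicted_diffs)
for year, n in zip(range(1975, 1980), estimates):
    print(year, round(n))
```

Under this reading, errors in the predicted differences compound over time, so an error in the base-year level or in an early difference affects every later year.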
Despite the commendable ingenuity shown in the methodology
for developing the estimates of the number of IVDUs in New York
City, the estimates must be regarded as highly suspect. Of most
concern are the "guesstimates" used to adjust the Registry data
and the fact that the current estimates of the numbers of IVDUs
6 The adjustments for false positives (people newly reported who had not been
previously reported to the Register) were applied to the number denoted as c in the
"Overview." An adjustment for "inactivation," which corrects for the fact that not all
previously identified persons will persist in their status (due to death, incarceration,
cessation of addiction), was applied to the number denoted as b in the "Overview."
Both adjustments are described as "guesstimates," especially the second.
are largely "guesstimates." Although New York City's application
of the dual-systems method may be better than most (due to high
coverage of the Register), the combination of errors in the dual-
systems estimates used to fit the regression is also of concern, as is
the change in the viability of the regression model over time. Of
course, the regression estimates were not used directly for making
estimates for the 1980s, but they do enter into the development of
the "guesstimates."
The estimates for New York State are even less accurate, due to
error in the factor F described above.
National Estimates
NIDA also explored some arithmetic with national estimates to derive
an estimate of 1.1 million IVDUs.7

    500,000   estimated heroin addicts in 1982
  + 250,000   heroin IV users, not addicts (NIDA estimate)
  + 475,000   cocaine heavy users
  - 150,000   overlap in cocaine/heroin use
  +  25,000   nonheroin, noncocaine IV users (NIDA estimate)
  ---------
  1,100,000   total

The author is unaware of the descriptions of the methodology
underlying these estimates.
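The NIDA arithmetic can be checked directly. The short sketch below simply re-adds the five components listed above; the variable names are introduced here for illustration, and the figures are those quoted in the text.

```python
# Components of the NIDA national figure, as listed in the text.
# The overlap in cocaine/heroin use is subtracted, so it is negative.
components = [
    ("estimated heroin addicts in 1982", 500_000),
    ("heroin IV users, not addicts", 250_000),
    ("cocaine heavy users", 475_000),
    ("overlap in cocaine/heroin use", -150_000),
    ("nonheroin, noncocaine IV users", 25_000),
]

total = sum(value for _, value in components)
print(total)  # 1100000
```

Two of the five components are NIDA estimates of unstated origin and a third is a subtracted overlap, so the total is only as reliable as its least documented term.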
Lange's Approach
Lange utilized several assumptions to develop an estimate of the to-
tal number of IVDUs (and their seroprevalence, which is interesting
but will not be discussed here). His estimate of the total number
of IVDUs is 1.6 million. His first assumption is that the propor-
tion of IVDUs in the general (adult?) population is fairly stable
among cities of comparable sizes, and he assumes ratios of 1/25 for
cities larger than one-half million and 1/30 for those between 300,000
and 500,000. (These ratios may be interpreted as number of IVDUs
in cities in a given size range divided by total population in those
7 The source for this is an unauthored, undated document "Estimated Number of IV
Drug Abusers in U.S." The estimated number of heroin addicts in 1982 is taken from
Shreckengost (1983); estimates for the number of heroin IV users who are not addicts
and the number of users of IV drugs other than heroin and cocaine are supplied by
NIDA; the estimate of the number of cocaine heavy users was reported in the 1985
National Household Survey on Drug Abuse.
cities.) The first assumption is based on "the opinion of many that
one-half of IVDU's live in New York City ... and that between 70-
75% of them reside in 24 of the largest metropolitan areas...." His
second assumption is that 95 percent of the IVDUs in the United
States reside in the 50 largest cities. The estimate of 1.6 million
follows (W. Robert Lange, personal communication, 1988). There
is probably more heterogeneity in the proportions of IVDUs in the
cities than Lange's figures imply, although that flaw need not be
critical for estimating the total number of IVDUs across the cities.
The accuracy of Lange's estimate is unknown.
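A minimal sketch of how Lange's two assumptions combine is given below. It is an interpretation, not Lange's actual calculation: the city populations are hypothetical, the function name is introduced here, and the treatment of cities under 300,000 (for which no ratio is quoted) is an assumption.

```python
def lange_total(city_populations, share_in_cities=0.95):
    """Apply size-dependent IVDU ratios to city populations, then scale
    up by the assumed share of all IVDUs living in those cities."""
    in_cities = 0.0
    for pop in city_populations:
        if pop > 500_000:
            in_cities += pop / 25   # 1/25 for cities over one-half million
        elif pop >= 300_000:
            in_cities += pop / 30   # 1/30 for cities of 300,000 to 500,000
        # Assumption: no ratio is quoted for smaller cities, so they
        # contribute nothing in this sketch.
    return in_cities / share_in_cities


# Hypothetical populations, for illustration only.
example = lange_total([1_000_000, 400_000])
print(round(example))
```

The sketch makes the structure of the estimate visible: every city in a size class is assigned the same ratio, which is the homogeneity assumption questioned above.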
Newmeyer's Ratio Estimate
John Newmeyer recently developed a method for estimating the
number of IVDUs in Standard Metropolitan Statistical Areas
(SMSAs) reporting to the DAWN system (Newmeyer, 1988). Implicit
in the method is an estimator for the number of IVDUs nationally.
Newmeyer's method is essentially as follows. Let

$Y_i$ = total number of IVDUs in SMSA $i$ (to be estimated),
$U_i$ = total reported emergency room mentions of
        heroin/morphine for SMSA $i$,
$V_i$ = total medical examiner mentions of opiates for
        SMSA $i$, except for Newark and Chicago
        (whose data were deemed problematic),
$U^{*}$ = sum of $U_i$ over all SMSAs except Newark and
        Chicago,
$V^{*}$ = sum of $V_i$ over all SMSAs except Newark and
        Chicago,
$V_i = U_i V^{*}/U^{*}$ for Newark and Chicago SMSAs,
$U^{**}$ = sum of $U_i$ over all SMSAs,
$V^{**}$ = sum of $V_i$ over all SMSAs, and
$X_i = 0.5\,(U_i/U^{**} + V_i/V^{**})$.

Newmeyer defines $X_i$ as $U_i/U^{*}$ for Newark and Chicago and as
$0.5\,(U_i/U^{*} + V_i/V^{*})$ for the other SMSAs, which is theoretically slightly
inferior but probably of no practical importance. Newmeyer then
derives an external estimate of the number of IVDUs in New York
City, $A_{NY}$, which he refers to as an anchor. The ratio $R = A_{NY}/X_{NY}$
is then computed. The estimated number of IVDUs for SMSA $i$ is
simply $Y_i = R X_i$, and the estimated number for all 17 SMSAs is
simply R. The method could be used with SMSAs other than New
York as the anchor.
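The ratio method just described can be sketched in a few lines. The indicator counts below are hypothetical and cover only four SMSAs; the function names are introduced for illustration. The shares follow the definition given in this paper (with medical examiner mentions imputed from emergency room mentions for Newark and Chicago), not Newmeyer's slightly different variant.

```python
def indicator_shares(er, me, no_me=("Newark", "Chicago")):
    """X_i = 0.5 * (U_i/U** + V_i/V**), imputing V_i = U_i * V*/U*
    for SMSAs whose medical examiner data are not used."""
    u_star = sum(u for k, u in er.items() if k not in no_me)
    v_star = sum(me[k] for k in er if k not in no_me)
    v = {k: (er[k] * v_star / u_star if k in no_me else me[k]) for k in er}
    u_tot = sum(er.values())
    v_tot = sum(v.values())
    return {k: 0.5 * (er[k] / u_tot + v[k] / v_tot) for k in er}


def ratio_estimates(shares, anchor_smsa, anchor_value):
    """R = anchor / X_anchor; the estimate for SMSA i is R * X_i."""
    r = anchor_value / shares[anchor_smsa]
    return r, {k: r * x for k, x in shares.items()}


# Hypothetical emergency room (U) and medical examiner (V) mentions.
er = {"New York": 400, "Chicago": 150, "Newark": 50, "Los Angeles": 200}
me = {"New York": 100, "Los Angeles": 60}

shares = indicator_shares(er, me)
r, est = ratio_estimates(shares, "New York", 160_000)
print(round(r), {k: round(v) for k, v in est.items()})
```

With this definition the shares sum to 1, which is why the estimate for all SMSAs combined equals R, and the anchor SMSA reproduces its own anchor value exactly.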
Suppose that $Y_i = \beta X_i + \epsilon_i$, with $\epsilon_i$ having zero mean. The
accuracy of the method depends critically on (1) the accuracy of the
anchor $A_{NY}$ and (2) the stability of the assumed relationship (i.e.,
the variance of $\epsilon_i$). Both factors are important. Even if $A_{NY}$ were
known accurately (Newmeyer uses 160,000 for $A_{NY}$), if $\epsilon_{NY}$ is very
large (small) then the overall estimate will be too large (small). One
could attempt to improve the method by using principal components
of several indicators or a multivariate ratio estimator, but the critical
dependence on items (1) and (2) would remain.
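The dependence on the New York residual can be made explicit with a short derivation. It is a sketch under stated assumptions: the slope of the assumed relationship is written as $\beta$ (a symbol chosen here), the anchor is taken to be exactly right, and the $X_i$ are taken to sum to 1 over the SMSAs, which is what makes the overall estimate equal to $R$.

```latex
% Assumed model: Y_i = \beta X_i + \epsilon_i, with E[\epsilon_i] = 0.
% Assumed: exact anchor, A_{NY} = Y_{NY}; shares sum to one, \sum_i X_i = 1.
\begin{align*}
R &= \frac{A_{NY}}{X_{NY}}
   = \frac{\beta X_{NY} + \epsilon_{NY}}{X_{NY}}
   = \beta + \frac{\epsilon_{NY}}{X_{NY}}, \\
\sum_i Y_i &= \beta \sum_i X_i + \sum_i \epsilon_i
   = \beta + \sum_i \epsilon_i, \\
R - \sum_i Y_i &= \frac{\epsilon_{NY}}{X_{NY}} - \sum_i \epsilon_i .
\end{align*}
```

The New York residual therefore enters the error of the total magnified by $1/X_{NY}$, while the residuals of the other SMSAs enter only once each, so a large positive (negative) residual for New York pushes the overall estimate up (down) even when the anchor itself is correct.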
Newmeyer's estimated anchor for New York is critical and may
be controversial. As he notes (Newmeyer, 1988),
There seems to be a consensus among experts in New York that there are
about 200,000 IVDUs in that metropolis. I disagree. I have worked hard
at modelling the AIDS epidemic among IVDUs in New York, and find that
a base population of 120,000 users is as large a figure as can adequately
account for the observed small size of their AIDS caseload. Even to make
that base number work, I have to assume (1) that the 1982-83 estimates
of their HIV seropositivity were too high, (2) that their rate of progression
from infection to AIDS diagnosis is no faster than for the gay men in
the San Francisco hepatitis study, and (3) that fully 35% of all New York
IVDUs who die of HIV-related causes are not enumerated in the AIDS
caseload. In this paper, I have used a midpoint between my 120,000 and
the 200,000 figure of New York experts. Since the whole analysis depends
on the New York "anchor," my estimate of the nation's infected IVDU
population would be 25% higher if the New York experts are right but it
would be 25% lower if my New York model is the correct one.
Lange's estimate of the number of IVDUs in the 16 SMSAs with
the largest Xi is 910,000 compared to Newmeyer's national figure of
625,000. Newmeyer estimated a total of 140,000 seropositive IVDUs
in the 16 SMSAs. Using New York City's estimate of 240,000 for
the number of IVDUs there would increase Newmeyer's figures by 50
percent.
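Because every estimate in the ratio method is proportional to the anchor, the effect of changing the anchor is simple arithmetic. The sketch below reproduces the percentages quoted above from the anchor values given in the text; the variable and function names are introduced for illustration.

```python
newmeyer_model = 120_000   # Newmeyer's own figure for New York
expert_consensus = 200_000 # figure attributed to New York experts
nyc_official = 240_000     # New York City's current estimate

# Newmeyer's anchor is the midpoint of the first two figures.
anchor = (newmeyer_model + expert_consensus) / 2


def relative_change(alternative, base=anchor):
    """Proportional change in all ratio estimates if the anchor changes."""
    return alternative / base - 1


for value in (newmeyer_model, expert_consensus, nyc_official):
    print(value, f"{relative_change(value):+.0%}")
```

The printed changes are -25%, +25%, and +50%, matching the figures in Newmeyer's quotation and in the paragraph above.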
CONCLUSIONS
The accuracy of the estimates of the number of IVDUs is not objec-
tively ascertainable, but the estimates (of about 1 million) could well
be off by a factor of 2; that is, the true number could conceivably
be smaller than 500,000 or greater than 2 million. The closeness of
several of the estimates is not persuasive because they cannot be
regarded as independent estimates. The question that persists is who
counts as an IVDU; what definition is being used? The judgment
made here about the accuracy is based on a review of the estimation
methods. Other reasonable people could read the review presented
here and come to different conclusions.
REFERENCES
Brookmeyer, R., and Gail, M. H. (1986) Minimum size of the acquired immunodefi-
ciency syndrome (AIDS) epidemic in the United States. Lancet 2:1320-1322.
Brookmeyer, R., and Gail, M. H. (1988) A method for obtaining short-term projections
and lower bounds on the size of the AIDS epidemic. Journal of the American
Statistical Association 83:301-308.
Butynski, W., Record, N., Bruhn, P., and Canova, D. (1987) State Resources
and Services Related to Alcohol and Drug Abuse Problems, Fiscal Year 1986.
Washington, D.C.: NASADAD.
Centers for Disease Control (CDC). (1987) Human immunodeficiency virus infection
in the United States: A review of current knowledge. Morbidity and Mortality
Weekly Report 36(Suppl. S-6):1-48.
Dougherty, J. (1987) Estimates of Numbers of IV Drug Users Infected with HIV:
Preliminary Data. Unpublished manuscript, December 5.
Gerstein, D. R. (1976) The structure of heroin communities in relation to methadone
maintenance. American Journal of Drug and Alcohol Abuse 3:571-587.
Medley, G. F., Anderson, R. M., Cox, D. R., and Billard, L. (1987) Incubation period
of AIDS in patients infected via blood transfusion. Nature 328:719-721.
Mosteller, F. (1977) Assessing unknown numbers: Order of magnitude estimation. In
W. Fairley and F. Mosteller (eds.), Statistics and Public Policy. Reading, Mass.:
Addison-Wesley.
Newmeyer, J. (1988) Estimating the Total of Seropositive IVDUs from SMSA Data.
Unpublished manuscript, May 1988.
Schmeidler, J., and Frank, B. (n.d.) Estimating the Number of Narcotic Abusers.
New York State Division of Substance Abuse Services, undated.
Schmeidler, J., Frank, B., Johnson, B., and Lipton, D. S. (1978) Seeking truth in
heroin indicators: The case of New York City. Drug and Alcohol Dependence
3:345-358.
Shreckengost, R. C. (1983) Heroin: A New View. Washington, D.C.: U. S. Central
Intelligence Agency.
Woodward, J. A., Retka, R. L., and Ng, L. (1984) Construct validity of heroin abuse
indicators. The International Journal of the Addictions 19:93-117.
Wolter, K. M. (1986) Some coverage error models for census data. Journal of the
American Statistical Association 81:338-346.