
Suggested Citation:"On the Accuracy of Current Estimates of the Numbers of Intravenous Drug Users." National Research Council. 1989. AIDS, Sexual Behavior, and Intravenous Drug Use. Washington, DC: The National Academies Press. doi: 10.17226/1195.


On the Accuracy of Estimates of Numbers of Intravenous Drug Users

Bruce D. Spencer

The author is from the Department of Statistics, Northwestern University, and the Methodology Research Center, NORC. He is grateful for helpful comments from Lincoln Moses, John Newmeyer, I. Richard Savage, and James Schmeidler.

The purpose of this paper is to assess the accuracy of estimates of the numbers of intravenous drug users (IVDUs) that were published in the November 30, 1987 report "A Review of Current Knowledge and Plans for Expansion of HIV Surveillance Activities" (hereafter, the "Review"), submitted by the Centers for Disease Control (CDC) to the Domestic Policy Council.[1] The purpose is not to develop alternative estimates but rather to see whether the published estimates are accurate enough to be relied upon for estimating and forecasting the extent of human immunodeficiency virus (HIV) infection. As will be made clear, the published estimates are fraught with problems. The accuracy of the estimates is not ascertainable by objective means because the estimates are based largely on guesses. In the author's judgment, the estimates could well be off by a factor of 2, in either direction.

[1] The report later appeared as the supplement to the December 18, 1987 Morbidity and Mortality Weekly Report, 36, No. S-6.

Although this paper will appear to be highly critical of the ways that various researchers and agency personnel have estimated numbers of IVDUs, appreciation and respect are owed to researchers for producing the estimates that we have of numbers of IVDUs. Considerable ingenuity has been used by a number of researchers in attacking this difficult estimation problem. This paper does not say that the estimates should have been produced by alternative procedures; indeed, few constructive suggestions are made. Rather, it concludes that the estimates are simply highly inaccurate and form a weak basis for any policy or program decisions.

Before proceeding to discuss the estimates themselves, the reader should understand the statistical concept of accuracy. The following discussion is intended to elucidate, rather than confuse; the reader who wishes to skip it should proceed to the next section.

Let $X$ refer to the published estimate of the number of IVDUs and let $\theta$ refer to the true number of IVDUs, that is, the number to be estimated. One may think of $X$ as fixed and known, since it is published, and of $\theta$ as fixed but unknown. The error in $X$, namely $\epsilon$, is the difference between $X$ and $\theta$: $\epsilon = X - \theta$. When the accuracy of $X$ is referred to, a statement is being made about beliefs concerning the magnitude of $\epsilon$. Except on rare occasions when $\theta$ is known with a much smaller error than $\epsilon$, the value of $\epsilon$ is uncertain. This uncertainty will be represented by a probability distribution for $\epsilon$. The key features of the distribution are its mean, which measures the bias or systematic error in $X$, and its variance, which measures the unreliability of $X$. The bias may be decomposed into the sum of two component biases, one arising because $X$ is designed to measure the wrong concept (i.e., the wrong definition of IVDU underlies $X$) and the other arising from systematic error in the measurement process. Summary measures of accuracy may be developed in various ways, but the most common is root-mean-square error (RMSE), the square root of the sum of the variance and the squared bias.

Two alternative interpretations of probability will now be discussed.
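As a brief numerical aside before those interpretations are taken up, the relationship among bias, variance, and RMSE described above can be sketched in a few lines of Python. The error values below are hypothetical draws from a distribution for the error in X, invented purely for illustration; they are not taken from the "Review" or from any data source, and the function name `accuracy_summary` is likewise an assumption of this sketch.

```python
import math
from statistics import mean, pvariance


def accuracy_summary(errors):
    """Return (bias, variance, rmse) for a list of errors e = X - theta."""
    bias = mean(errors)              # mean error: the systematic component
    variance = pvariance(errors)     # spread of the errors about their mean
    rmse = math.sqrt(variance + bias ** 2)
    return bias, variance, rmse


# Hypothetical errors (estimate minus true value), for illustration only.
errors = [-300_000, -100_000, 200_000, 400_000]
bias, variance, rmse = accuracy_summary(errors)

# The same RMSE computed directly as the root of the mean squared error.
rmse_direct = math.sqrt(mean(e * e for e in errors))

print(f"bias = {bias:,.0f}, RMSE = {rmse:,.0f}, direct RMSE = {rmse_direct:,.0f}")
```

The RMSE built from the bias and the variance agrees with the root of the mean squared error computed directly, which is the identity underlying the definition given in the text.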
Under one interpretation, the "relative frequency" (RF) interpretation, the estimate $X$ is viewed as a realization of a stochastic process; for example, if $X$ is based on the results of randomly selected samples, then the randomization inherent in the sampling gives rise to a probability distribution for $X$ (i.e., different samples would give rise to different values of $X$). Other, nonsampling bases for error in $X$ can also be endowed with a randomization distribution; for example, response errors in surveys can be viewed as occurring at random (perhaps with bias as well as variance). For another example (see "Newmeyer's Ratio Estimate"), consider ratio estimators of the form $X = \hat{A}Y$, where $Y$ is the number of reports of drug-related emergency room episodes reported in the Drug Abuse Warning Network (DAWN) and $\hat{A}$ is an estimate of $A$, with $A$ defined as the ratio of $\theta$ to the expected value of $Y$. Here $Y$ is random and $\hat{A}$ has error when viewed as an estimate of $A$, and the latter error and $Y$ may be correlated.

Not all estimates $X$ are conducive to the RF interpretation; for example, some estimates are called "informed guesstimates" (never uninformed guesstimates, though) and are not produced by a repeatable process. In such cases, a relevant notion of probability is based on the view that a person's beliefs about $\theta$ may be represented by a probability distribution; this interpretation will be referred to as a "personalistic Bayesian" (PB) perspective. The PB viewpoint explicitly incorporates subjective opinion, whereas the RF viewpoint does not. However, application of the RF model inevitably uses subjective opinion in the specification of the randomization model to be used to assess the mean and variance of the distribution. In either case, RF or PB, the question arises of whose opinion is to be used. In this paper, the author's opinion is represented. Coming from a nonexpert in the area of drug abuse, these opinions represent those of experts whose views have been encountered as well as a sufficient degree of skepticism acquired from experience in assessing the accuracy of demographic statistics and from watching experts be overconfident about their expertise (see also Mosteller, 1977, in this regard).

WHAT IS THE DEFINITION OF INTRAVENOUS DRUG USER?

The first problem with estimates of the number of intravenous drug users is the definition of IVDU. Many IVDUs shoot opioids, principally heroin, and many do not. W. R. Lange (personal communication, 1988) believes that, at least in some communities such as Baltimore, "approximately 20% of IVDUs are primary cocaine and amphetamine users who do not on a regular basis abuse opioids intravenously." Many varieties of patterns of needle use exist.
Some persons inject drugs subcutaneously ("skin poppers") and do not even use needles; although they are not IVDUs if they do not use needles, they are at risk of HIV infection to the extent that they share injection equipment with drug users who are infected with HIV. Some IVDUs are hard-core addicts who inject drugs multiple times each day; other IVDUs are successful upper middle class users, who inject less frequently. Other IVDUs inject multiple times daily for a few months and then stop; still others inject only a few times per year. Some IVDUs have been shooting for a long time; others, only recently. Whether an IVDU did most injecting prior to 1975 or in a later year will greatly affect the risk of HIV infection.

Gerstein (1976) distinguishes four types of heroin-using IVDUs. The first two types are hard-core users who are "strung out"; the third type consists of "people with dual identities . . . who may be strung out, but often at low levels.... These are people whose status as members of the heroin community is often secret, hidden from a whole set of close associations: family and friends. These may or may not be identified on the street as addicts, depending on their mode of scoring and the stage of their careers." The fourth type is IVDUs who are not usually strung out on heroin. "These may be 'chippies', 'weekenders', or 'sandbox users' as opposed to 'boot-and-shoe addicts', 'stone junkies', 'dope fiends', and 'real hypes'. They may or may not score directly on the street.... Sometimes these people are situational heroin users, the current lover of a strung-out user, for example." Gerstein believes, based on field experience and study of the epidemiological literature on heroin, that there are about two or three type 4 IVDUs for every type 1-3 IVDU (Gerstein, 1976).

What is the right choice of definition? For estimating the number of seropositive persons in the United States, the precise definition is largely irrelevant as long as it is consistent with other definitions being employed. For example, Table 14 in the "Review" estimates the total number of infected people in eight (presumably nonoverlapping) population subgroups:

1. exclusively homosexual,
2. other homosexual,
3. regular IVDU,
4. occasional IVDU,
5. hemophiliac A,
6. hemophiliac B,
7. heterosexuals without specific identified risks, and
8. others.

The estimates of total infected persons in each group are derived by multiplying the number of persons in the group by the seropositivity rate and summing.
Thus, if $R_j$ is the seropositivity rate for group $j$ and if $N_j$ is the size of group $j$, the total number of infected persons in the population is

$$R_1 N_1 + R_2 N_2 + R_3 N_3 + R_4 N_4 + \cdots + R_8 N_8.$$

If group 4 were split into two groups, say 4a and 4b, with rates $R_{4a}$ and $R_{4b}$ and sizes $N_{4a}$ and $N_{4b}$, then the total number of infected persons in the population may be expressed as

$$R_1 N_1 + R_2 N_2 + R_3 N_3 + R_{4a} N_{4a} + R_{4b} N_{4b} + \cdots + R_8 N_8,$$

which equals the previous total because $R_4 = (R_{4a} N_{4a} + R_{4b} N_{4b})/N_4$ and $N_4 = N_{4a} + N_{4b}$. Thus, changing the definitions or compositions of the groups does not affect the estimate of the total number infected as long as the group sizes $N_j$ and seropositivity rates $R_j$ are appropriate to the group as defined. (Of course the groups should comprise the whole population and be nonoverlapping.)

This is not to say that for other purposes the definition is irrelevant. For example, for studying transmission of the infection, one would wish to estimate seroprevalence for IVDUs classified by frequency of injection, how long they had been shooting, whether they shared needles and with whom, if and how long they had been infected, the stage of the infection, and so on. In this case, the definition should be focused so that the groups of IVDUs are homogeneous. Thus, the definition of IVDU might well exclude or otherwise distinguish chippies from other users. The definition implicitly used by Newmeyer (1988) in his method (described later) has this feature.

WHAT DEFINITIONS ARE USED IN THE "REVIEW"?

The "Review" distinguishes two types of IVDUs: regular users who inject at least weekly and less frequent or occasional users who inject less often than regular users but have used drugs more than once or twice. Two sets of national estimates were produced for each type of IVDU, the original Public Health Service estimates for 1986 ("Review," Table 13) and the revised estimates ("Review," Table 14) for 1987. These are listed in Table 1.
TABLE 1 Public Health Service (1986, 1987) Estimates of HIV Prevalence Among IV Drug Users

Type of IV Drug User         Estimated Number    Approximate Seroprevalence (%)    Total Number Infected
Original estimate (1986)
  Regular                         750,000                     30                          225,000
  Occasional                      750,000                     10                           75,000
  Total                         1,500,000                                                 300,000
Revised estimate (1987)
  Regular                         900,000                     25                          225,000
  Occasional                      200,000                      5                           10,000
  Total                         1,100,000                                                 235,000
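The arithmetic behind Table 1, and the earlier argument that splitting a group leaves the total number infected unchanged, can be checked with a short Python sketch. The group sizes and seroprevalence percentages are copied from Table 1; the split of the revised "occasional" group into a subgroup of 50,000 at 2 percent (which forces the remaining 150,000 to 6 percent) is hypothetical and is chosen only so that the subgroup rates average back to 5 percent. The function names are assumptions of this sketch.

```python
import math

# (estimated number, seroprevalence in percent), copied from Table 1.
ORIGINAL_1986 = {"regular": (750_000, 30), "occasional": (750_000, 10)}
REVISED_1987 = {"regular": (900_000, 25), "occasional": (200_000, 5)}


def total_infected(groups):
    """Sum of group size times seropositivity rate over all groups."""
    return sum(n * pct / 100 for n, pct in groups.values())


def split_group(groups, name, n_a, pct_a):
    """Split one group into subgroups a and b, keeping its size and its
    number infected fixed; the rate for subgroup b is then determined."""
    n, pct = groups[name]
    n_b = n - n_a
    pct_b = (pct * n - pct_a * n_a) / n_b
    result = {k: v for k, v in groups.items() if k != name}
    result[name + "_a"] = (n_a, pct_a)
    result[name + "_b"] = (n_b, pct_b)
    return result


# Hypothetical split of the revised "occasional" group (illustration only).
split_1987 = split_group(REVISED_1987, "occasional", n_a=50_000, pct_a=2)

print(total_infected(ORIGINAL_1986))   # total for the 1986 estimates
print(total_infected(REVISED_1987))    # total for the 1987 estimates
print(total_infected(split_1987))      # same 1987 total after the split
```

The totals reproduce the 300,000 and 235,000 shown in Table 1, and the split leaves the 1987 total unchanged, provided the subgroup sizes and rates are consistent with the group being split.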

In assessing these estimates, critical questions concern what population definitions were actually used in estimating the total number of each type of IVDU and in estimating the seroprevalence rates. How different are the operational definitions? How accurate are the estimates for the populations as operationally defined?

WHAT POPULATION DEFINITION WAS USED FOR ESTIMATING SEROPREVALENCE?

Seroprevalence rates were estimated by a weighted average of seroprevalence rates for each state, with the weights proportional to the estimated number of IVDUs in the state (discussed later). Seroprevalence estimates for a state were based on rates observed in specific locations, primarily from data obtained from intravenous (IV) drug abuse treatment facilities in major cities (Dougherty, 1987); presumably these rates were not simply averaged across the entire state. The data were mostly gathered in 1986 and 1987, but some were gathered in 1984 and 1985. Because seroprevalence rates may change rapidly upward in only a year (as seems to have occurred in New York and Edinburgh), even the 1986 data are out of date for estimating seroprevalence in 1987, let alone 1988. The effect of outdated data on local seroprevalence rates may be to bias the estimates downward to an unknown but possibly appreciable extent. John Dougherty has suggested that the rates, which "are from the few drug abuse treatment facilities in which planned studies have been conducted," may be underestimates. He noted that many "drug treatment professionals report that IV drug users more likely to be infected with HIV (i.e., engage in more risky behaviors) tend not to seek treatment at drug centers where HIV testing is known to occur but prefer programs in which there is not testing." However, the extent of the avoidance of such treatment programs depends on how much real choice a user has of treatment program and whether testing is optional.
Dougherty (1987) also notes that "several other lines of evidence indicate that seroprevalence rates are higher for drug users not receiving treatment than for those in treatment programs" (although this author has not seen such evidence). Furthermore, the drug treatment programs include many persons who no longer use needles to inject drugs. At first blush, these considerations would seem to indicate that the estimated seroprevalence rates would be low estimates (for the cities in which they were based). However (even if the judgment of drug treatment professionals is accepted for the moment), merely because the IV drug users more likely to be infected tend not to seek treatment at the centers in the studies does not mean that the rates are underestimated, since the chippies and other occasional users, who may have very low rates of infection, might be even more underrepresented in the studies. Whether the rates are underestimated due to underrepresentation of the whole IV drug-using population or not depends on who should be included in the denominator of the rate, and that, as noted earlier, depends on the inclusiveness of the definition of IVDU. The suggestion of Newmeyer (1988) that the definition of IVDU exclude chippies has merit; it is important, however, that they be excluded both from the defined population of IVDUs and from the tested groups used to estimate seroprevalence of IVDUs.

The kinds of settings in which the seroprevalence estimates for IVDUs were made include methadone maintenance programs, drug-free treatment programs, detoxification programs, the "street" (seroprevalence testing performed in addicts not currently in treatment programs), and other kinds of drug treatment programs. The clinics and treatment programs were not selected at random, nor were members in the programs selected at random. How well do the populations "sampled" match the population definitions used for developing numbers of regular and occasional IVDUs? This is not known, but there is no reason why they should be close; it has been impossible to pin down any expert with an opinion on this question. As shown in the following section, the population definitions used for estimates of numbers of IVDUs are exceedingly vague regarding frequency of injection. How exactly were the seroprevalence estimates developed for each state and how were the differences between seroprevalence rates for regular and occasional users estimated? Again, we do not know.

HOW WERE NUMBERS OF IVDUs ESTIMATED?
Overview

The revised estimate of 1.1 million IVDUs was chosen by the CDC from among several alternative estimates available at the time. The alternative methods of estimation underlying the numbers in the "Review" are described in the next three sections. These methods are based on aggregation of state agencies' estimates, on national-level estimates for different kinds of drug abusers, and on the assumption (Lange's approach) that the proportions of the adult population in large cities who are IVDUs are constant within classes of cities. The Newmeyer ratio estimate represents an alternative method that was not considered by the CDC in choosing its estimate.

Three general kinds of techniques are used for estimating the numbers of IVDUs. Terminology varies somewhat, but the practice of the National Association of State Alcohol and Drug Abuse Directors (NASADAD) will be used (Butynski et al., 1987:41-42) to classify them as "direct," "indirect," or "informed guesstimates." Direct estimates may be based on the National Household Survey on Drug Abuse (NHS) run by the National Institute on Drug Abuse (NIDA), on dual-systems estimates, or on backward extrapolation. Indirect estimates are based on a fitted regression model that attempts to relate indicator data such as number of burglaries or heroin-related deaths to prevalence of IVDUs. Estimates of numbers of IVDUs are obtained by substituting observed (or predicted) values of the indicators into the regression model. Note that the estimation of the regression coefficients depends on the availability of direct estimates. Finally, so-called informed guesstimates are produced by one or more people looking at any available indicators or other correlates of IVDU prevalence and making a loose guess about the number of IVDUs. Both the indirect estimation and the informed guesstimation are dependent on direct estimates somewhere along the line. Thus, the accuracy of any estimates of the number of IVDUs can be no greater than the accuracy of the direct estimates.

A brief description of the three direct estimation methods is now provided. The NHS attempts to measure drug use and prevalence in the general household population aged 12 and over. The survey excludes by design persons living in transient households or in institutions (including university dormitories and prisons), persons in the military, and persons with no fixed residence. Many heroin users will, therefore, not be covered by the survey.
Also, even within the population not excluded, the survey suffers from biases due to nonresponse and underreporting of drug use. Thus, cocaine and amphetamine users may be covered by the survey but may decline to report themselves as drug abusers. The extent of bias in the survey results is unknown but potentially enormous.[2]

[2] For example, the decennial census is believed to miss as many as 15 percent of all black males aged 20-44; the percentage of IVDUs missed by the census is surely much higher. If 30 percent of the drug abusers are not covered by the census, and if only one in two drug abusers surveyed would admit being a drug abuser, then the number of drug abusers reported by the survey will tend to be only 35 percent of the actual number (0.70 x 0.50 = 0.35).

Back-extrapolation methods use data on AIDS deaths and HIV prevalence to estimate the number of IVDUs. Based on seroprevalence studies and on models of disease progression, estimates of the probability that an IVDU will have had AIDS are developed; such an estimate is denoted, for example, by $P$. Then estimates of the number of AIDS cases among IVDUs are developed, denoted $N$. The number of IVDUs is then estimated by the ratio $N/P$. Such a technique was used by Newmeyer (1988) to estimate the number of IVDUs in New York City; his method is discussed at the end of this section.[3] Difficulties with the method are attributable to errors in the estimates $P$ and $N$. The estimate $P$ is based on nonrandom samples subject to severe but unknown selection biases, as well as errors in the model of disease progression. The estimate $N$ must successfully correct for underreporting of AIDS deaths. In- or out-migration of IVDUs from the geographic area of interest is also a problem, but probably of lesser magnitude. An advantage of this technique is the consistency in the definition of IVDU, as discussed earlier.

Dual-systems estimates, also known as tag-recapture estimates, are perhaps the most widely used direct estimates. One begins with two lists of IVDUs, for example, a list of persons in heroin treatment programs and a list of persons treated in emergency rooms for adverse reactions to heroin abuse. If there are $a$ people on both lists, $b$ people on the first list but not the second, and $c$ people on the second list but not the first, then one may estimate the probability $P$ that an IVDU is on the first list by the fraction of people on the second list who are also on the first list, $P = a/(a + c)$. The number of IVDUs who are on the first list is $N = a + b$. Then the total number of IVDUs is estimated by $N/P$, or $(a + b)(a + c)/a$. A number of problems with dual-systems estimates have been recognized, both in the drug abuse literature and in the statistical literature (Wolter, 1986).
Problems include errors in classifying individuals as to membership on one list or two, possible multiple appearances of an IVDU on a list, variabilities in the selection probabilities for different types of individuals (some IVDUs are more likely to appear on the list than others), migration of IVDUs in and out of the geographic area between the times that the two lists are created, and causal effects of being on one list (enrollment in a treatment program may lessen the chance of a drug-related emergency room incident). The errors arising from variability in selection probabilities and causal effects are sometimes jointly referred to as "correlation bias." Triple-systems estimates are sometimes advocated (Woodward et al., 1984), but the same basic problems remain.

[3] Newmeyer also describes a variation on this method in his unpublished note "Four Readily Applicable Methods to Estimate Drug Abuse Prevalence" (personal communication, 1988):

    The method is to ascertain from the medical examiner's [coroner's] toxicology data the relative proportion of decedents with methadone metabolites to decedents with heroin metabolites (metabolite is simply a chemical produced by the body's metabolism of a drug). If this ratio is, say, 1 to 5, and if it is known that there are 2,000 persons receiving methadone maintenance in the city, it can be hazarded that there are about 10,000 persons attempting to maintain themselves on heroin.

Note that the validity of this method rests on an assumption of equal death rates for methadone and heroin users. For other discussions of the back-extrapolation method, see Brookmeyer and Gail (1986, 1988) and Medley et al. (1987).

Aggregation of State Agencies' Estimates

State drug abuse authorities submit plans for treating drug abuse to the Alcohol, Drug Abuse, and Mental Health Administration (ADAMHA). These plans contain estimates of numbers of IVDUs in each state. The numbers are developed by the states from a variety of sources, including surveys, treatment data, and drug indicator data. It is worth noting that the state agencies compile the numbers for political purposes, including obtaining block grant funds from ADAMHA (which would put upward pressure on the numbers) and informing state legislators (which could put upward or downward pressure on the numbers, depending on the state). NIDA staff reviewed the states' plans and interviewed the agency directors in California, Illinois, Michigan, New Jersey, New York, and Pennsylvania to confirm the numbers reported in the plans. Those reviews merely verified that the numbers were reported correctly and not that the numbers reported were accurate. In this way, the total number of IVDUs was estimated at 1.28 million. Also, according to Butynski et al.
(1987:41-42), NASADAD asked each state alcohol and drug agency to:

    provide estimates relating to IV drug abuse for Fiscal Year (FY) 1986 for the total number of client admissions to treatment and total number of IV drug abusers in the State.... Seventeen (17) states provided data on the total number of IV drug abusers in the State.[4] The highest estimates of IV drug abusers were provided by New York, California, and Texas, in that order.... States were also asked to report the basis for their estimates of the total drug abuser population. The largest number of responding States [10] reported that their estimates were based upon indirect measures or indicators. Three States reported that only "guesstimates" were used, and four States indicated that a combination of methods [i.e., a combination of a "guesstimate" and another method] was used.

[4] The following states were excluded: Alabama, Alaska, Arizona, Arkansas, Colorado, Delaware, District of Columbia, Florida, Georgia, Hawaii, Idaho, Indiana, Kansas, Kentucky, Louisiana, Maine, Montana, Nebraska, Nevada, New Mexico, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Puerto Rico, South Carolina, South Dakota, Tennessee, Utah, Vermont, Virginia, Virgin Islands, Washington, West Virginia, and Wyoming.

The total number of IVDUs reported for the 17 states was 1.067 million. If the state plan estimates for the other 33 states, Puerto Rico, the District of Columbia, and the Virgin Islands are included, the total increases to 1.447 million.

The national estimates described above can be no better than the state estimates they are based on. Individual states use quite different methods to estimate the number of IVDUs. Estimation methods for three states (Illinois, California, and New York) and three cities (Chicago, Los Angeles, and New York) are now described.

Illinois and Chicago

The estimate for Illinois is based on data from the Client Oriented Data Acquisition Process (CODAP) and the National Household Survey on Drug Abuse. CODAP collects limited demographic and drug-use information about clients in participating federally or state-funded drug abuse programs. Illinois uses the CODAP and the NHS data to estimate several components. First, the Illinois Department of Alcoholism and Substance Abuse uses the CODAP data for all participating programs to estimate the fraction $F_1$ of narcotics abusers in Illinois who inject drugs. Second, the proportion $F_2$ of the U.S. population who are narcotics abusers is estimated from data collected by the NHS. Finally, the estimate of the number of IVDUs in Illinois is derived as $P \times F_2 \times F_1$, with $P$ equal to the total population of Illinois. The method for estimating the number of IVDUs in Chicago is analogous, with Chicago figures used instead of Illinois figures (Jerome Gross, personal communication, May 20, 1988). The downward
bias in the NHS-based estimate of $F_2$ makes the estimates very untrustworthy.

California and Los Angeles

The estimates of the number of IVDUs in California are formed as the sum of the estimates of numbers of IVDUs shooting amphetamine and cocaine and those shooting opiates. Apparently, persons who inject more than one of these drugs are counted more than once. First, let us consider estimation of amphetamine- and cocaine-shooting IVDUs. NHS 1983 data from the western region of the country were used to estimate a prevalence rate for the use of amphetamine and cocaine. The estimated prevalence rate was multiplied by the size of the at-risk population to estimate the number of amphetamine and cocaine users. Data from DAWN and the California Drug Abuse Data System (CALDADS) as well as "street data" were used to estimate the fraction of drug abusers who administered the drugs intravenously; the estimates were subjectively adjusted downward slightly to account for selection bias. Then the latter figures were multiplied by the estimated number of amphetamine and cocaine users to estimate the number of amphetamine and cocaine shooters in 1985. Those figures were assumed constant at 97,000 for the years following 1985. As with Illinois, the downward bias of the NHS prevalence rate subjects the California estimate of number of amphetamine and cocaine shooters to great downward bias.

The number of persons shooting opiates is based on extrapolation from a nonlinear regression model fitted to data more than 10 years old. Specifically, a regression was fitted to predict estimated prevalence rates for fiscal years 1974-1975, 1975-1976, 1976-1977, and 1977-1978 from indicators including number of drug-induced deaths, burglaries, new admissions to treatment programs, and number of hepatitis B cases. The prevalence estimates were based on work by John Newmeyer.[5] A quadratic trend was fitted, which leads one to believe that the model was overfitted (only four data points were used to estimate at least three parameters). Rates of prevalence for future years were predicted from values of the indicator data from those years. The model was last run on fiscal year 1982-1983 data, for which the estimate was 102,500. The estimate was subjectively adjusted to 125,000 for fiscal year 1986-1987; numbers are assumed constant since then.
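The concern about overfitting can be illustrated with a simplified Python sketch. It assumes, purely for illustration, a quadratic trend in time alone, fitted by ordinary least squares to four hypothetical prevalence values; the actual California model also used indicator variables, and its data are not reproduced here. With four points and three parameters only one residual degree of freedom remains, and a small change in a single fitted value is greatly magnified when the curve is extrapolated nine years beyond the last data point.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(ts, ys):
    """Least-squares coefficients (c0, c1, c2) of y = c0 + c1*t + c2*t**2."""
    s = [sum(t ** k for t in ts) for k in range(5)]
    v = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    normal = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve(normal, v)


def predict(coef, t):
    return coef[0] + coef[1] * t + coef[2] * t * t


# Hypothetical prevalence values for four consecutive fiscal years (t = 0..3).
ts = [0, 1, 2, 3]
ys = [1.00, 1.10, 1.15, 1.12]
bump = 0.01                                  # small change in the last value
ys_bumped = ys[:3] + [ys[3] + bump]

residual_df = len(ts) - 3                    # data points minus parameters
t_future = 12                                # nine years past the last point
shift = predict(fit_quadratic(ts, ys_bumped), t_future) - predict(
    fit_quadratic(ts, ys), t_future
)
print(residual_df, shift / bump)
```

Because least-squares predictions are linear in the observed values, the magnification factor printed here depends only on the time points and the extrapolation horizon, not on the particular hypothetical values chosen.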
The extent of extrapolation, overfitting, and possible weakness in the direct estimates used to calibrate the regression model makes these estimates untrustworthy. The total estimate of IVDUs in California is thus 220,000 (97,000 plus 125,000, apparently rounded). The number is taken to refer to hard-core users (Susan Nisenbaum, personal communication, 1988).

The estimate of the number of IVDUs in Los Angeles County is based on multiplication of the number of IVDUs in treatment by a factor. The factor is about 4 or 5 and is based on examination of drug indicator data, but explicit models were not used. The estimate (about 70,000) includes regular users and occasional users; including users in remission might push the estimate up to 120,000 or more (Donald McAllister, personal communication, 1988). As the accuracy of the factor is unknown, so too is the accuracy of the estimate. However, there is no reason to assume that the accuracy is any higher than that for Chicago.

[5] The manner in which Newmeyer's estimates were prepared is not known, but it is believed that they were based on dual-systems estimation (Susan Nisenbaum, personal communication, 1988).

New York State and New York City

The estimate of the number of IVDUs in New York State is derived as the ratio of the New York City estimate to the fraction $F$ of narcotic abusers in the state who are estimated to reside in New York City; $F$ is estimated from the NHS (Schmeidler and Frank, no...). The current estimate for New York City is 240,000. The accuracy of $F$ may be reasonably good to the extent that biases in the NHS cancel out, since $F$ is the ratio of two NHS estimates of narcotic abusers.

The estimate of the number of IVDUs in New York City is based on extrapolation of past estimates of prevalence. In brief, the past estimates were based on regression estimation using drug indicators as predictors, and the regression models were estimated by using a combination of Narcotics Registry and dual-systems estimates. The method is complicated because some of the predictor variables became unavailable and had to be predicted on the basis of other, available indicators. The models are designed to estimate the number of heroin addicts, and those estimates are used without modification to estimate numbers of IVDUs in New York City.

A somewhat more detailed summary of the estimation of the number of IVDUs in New York City follows (Schmeidler et al., 1978). The discussion is longer than that for the other cities because the method is more complicated, more information was available on how New York City's estimates were prepared, and the estimates of the number of IVDUs in New York City are especially critical (see the discussion of Newmeyer's estimate, below).
• The first principal component of 12 indicators of heroin use was computed and used as a predictor variable for regression estimates of numbers of heroin addicts.
• Values of the "dependent variable" (numbers of heroin addicts) for 1970-1974 were provided by the numbers of heroin addicts reported in the New York Narcotics Register, adjusted for duplication, reporting of drugs other than heroin, incarceration, relocation, death, nonaddiction, and false inclusion. These adjustments were largely guesses (particularly the adjustments for death, relocation, and incarceration) and are problematic. The numbers were also adjusted⁶ by dual-systems estimates of the numbers of heroin addicts not reported in the Register. Absence of data on the extent of drug abuse or demographic characteristics for persons in the Register meant that no poststratification could be employed and correlation bias could be high, leading to downward pressure on the estimates. Matching errors (false nonmatches) probably occurred, leading to upward pressure (see Woodward et al., 1984). Further upward pressure is present because the population is not closed and the adjustments for out-migration from the population were imperfect. Sampling error is also present. In this way, estimates of the number of heroin addicts in 1970-1974 were prepared.
• Then regression models were fitted to predict smoothed successive differences of logs of 1970-1974 numbers of heroin addicts from the first principal component.
• Estimates for 1975 through 1979 were based directly on the regression model.
• The estimates for 1980 and onward are described as "guesstimates." For example, the estimate for 1980 was taken to be the same as for 1976 because the DAWN reports of heroin-related emergency room admissions were similar for the two years, and the DAWN indicator was believed to be a strong correlate of the number of heroin addicts. The estimate for 1981 was based on quadratic extrapolation from the estimates for 1977-1980.

Despite the commendable ingenuity shown in the methodology for developing the estimates of the number of IVDUs in New York City, the estimates must be regarded as highly suspect.
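The opposing pressures on the dual-systems adjustment described above can be illustrated with the simplest form of dual-systems estimation, the Lincoln-Petersen estimator. The following is a minimal sketch in Python, assuming two lists drawn from the same closed population. The function name and all counts are hypothetical and chosen only for illustration; this is not the adjusted procedure that was applied to the Narcotics Register.

```python
def dual_systems_estimate(n_first, n_second, n_matched):
    """Basic Lincoln-Petersen estimate of total population size.

    n_first   -- number of persons on the first list (e.g., a register)
    n_second  -- number of persons on the second list
    n_matched -- number of persons identified on both lists
    """
    if n_matched <= 0:
        raise ValueError("at least one matched person is required")
    return n_first * n_second / n_matched


# Hypothetical counts, for illustration only (not Register data).
n_first, n_second = 1000, 800

# Matches expected if the lists are independent and matching is perfect.
baseline = dual_systems_estimate(n_first, n_second, 400)

# False nonmatches: some true matches go unrecognized, so the match
# count is too small and the estimate is pushed upward.
with_false_nonmatches = dual_systems_estimate(n_first, n_second, 360)

# Correlation bias: persons likely to be on one list are also likely to
# be on the other, so matches are too numerous and the estimate is
# pushed downward.
with_correlation_bias = dual_systems_estimate(n_first, n_second, 500)

print(f"baseline:              {baseline:,.0f}")
print(f"with false nonmatches: {with_false_nonmatches:,.0f}")
print(f"with correlation bias: {with_correlation_bias:,.0f}")
```

Undetected matches shrink the denominator and positive dependence between the lists inflates it, which is why the two errors push the estimate in opposite directions and need not cancel.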
Of most concern are the "guesstimates" used to adjust the Registry data and the fact that the current estimates of the numbers of IVDUs are largely "guesstimates." Although New York City's application of the dual-systems method may be better than most (due to high coverage of the Register), the combination of errors in the dual-systems estimates used to fit the regression is also of concern, as is the change in the viability of the regression model over time. Of course, the regression estimates were not used directly for making estimates for the 1980s, but they do enter into the development of the "guesstimates." The estimates for New York State are even less accurate, due to error in the factor F described above.

⁶ The adjustments for false positives (people newly reported who had not been previously reported to the Register) were applied to the number denoted as c in the "Overview." An adjustment for "inactivation," to correct for the fact that not all previously identified persons will persist in their status (due to death, incarceration, cessation of addiction), was applied to the number denoted as b in the "Overview." Both adjustments are described as "guesstimates," especially the second.

National Estimates

NIDA also explored some arithmetic with national estimates to derive an estimate of 1.1 million IVDUs.⁷

      500,000   estimated heroin addicts in 1982
    + 250,000   heroin IV users, not addicts (NIDA estimate)
    + 475,000   cocaine heavy users
    - 150,000   overlap in cocaine/heroin use
    +  25,000   nonheroin, noncocaine IV users (NIDA estimate)
    ---------
    1,100,000   total

The author is unaware of descriptions of the methodology underlying these estimates.

⁷ The source for this is an unauthored, undated document, "Estimated Number of IV Drug Abusers in U.S." The estimated number of heroin addicts in 1982 is taken from Shreckengost (1983); estimates for the number of heroin IV users who are not addicts and the number of users of IV drugs other than heroin and cocaine are supplied by NIDA; the estimate of the number of cocaine heavy users was reported in the 1985 National Household Survey on Drug Abuse.

Lange's Approach

Lange utilized several assumptions to develop an estimate of the total number of IVDUs (and their seroprevalence, which is interesting but will not be discussed here). His estimate of the total number of IVDUs is 1.6 million. His first assumption is that the proportion of IVDUs in the general (adult?) population is fairly stable among cities of comparable sizes, and he assumes ratios of 1/25 for cities larger than one-half million and 1/30 for those between 300,000 and 500,000. (These ratios may be interpreted as number of IVDUs in cities in a given size range divided by total population in those cities.) The first assumption is based on "the opinion of many that one-half of IVDU's live in New York City ... and that between 70-75% of them reside in 24 of the largest metropolitan areas...." His second assumption is that 95 percent of the IVDUs in the United States reside in the 50 largest cities. The estimate of 1.6 million follows (W. Robert Lange, personal communication, 1988). There is probably more heterogeneity in the proportions of IVDUs in the cities than Lange's figures imply, although that flaw need not be critical for estimating the total number of IVDUs across the cities. The accuracy of Lange's estimate is unknown.

Newmeyer's Ratio Estimate

John Newmeyer recently developed a method for estimating the number of IVDUs in Standard Metropolitan Statistical Areas (SMSAs) reporting to the DAWN system (Newmeyer, 1988). Implicit in the method is an estimator for the number of IVDUs nationally. Newmeyer's method is essentially as follows. Let

    $Y_i$ = total number of IVDUs in SMSA $i$ (to be estimated),
    $U_i$ = total reported emergency room mentions of heroin/morphine for SMSA $i$,
    $V_i$ = total medical examiner mentions of opiates for SMSA $i$, except for Newark and Chicago (whose data were deemed problematic),
    $U^{*}$ = sum of $U_i$ over all SMSAs except Newark and Chicago,
    $V^{*}$ = sum of $V_i$ over all SMSAs except Newark and Chicago,
    $V_i = U_i V^{*}/U^{*}$ for Newark and Chicago SMSAs,
    $U^{**}$ = sum of $U_i$ over all SMSAs,
    $V^{**}$ = sum of $V_i$ over all SMSAs, and

    $$X_i = 0.5\left(\frac{U_i}{U^{**}} + \frac{V_i}{V^{**}}\right).$$

Newmeyer defines $X_i$ as $U_i/U^{*}$ for Newark and Chicago and as $0.5\,(U_i/U^{*} + V_i/V^{*})$ for the other SMSAs, which is theoretically slightly inferior but probably of no practical importance. Newmeyer then derives an external estimate of the number of IVDUs in New York City, $A_{NY}$, which he refers to as an anchor. The ratio $R = A_{NY}/X_{NY}$ is then computed. The estimated number of IVDUs for SMSA $i$ is simply $\hat{Y}_i = R X_i$, and the estimated number for all 17 SMSAs is simply $R$, because the $X_i$ sum to 1. The method could be used with SMSAs other than New York as the anchor.

Suppose that $Y_i = \alpha X_i + \varepsilon_i$, with $\varepsilon_i$ having zero mean. The accuracy of the method depends critically on (1) the accuracy of the anchor $A_{NY}$ and (2) the stability of the assumed relationship (i.e., the variance of $\varepsilon_i$). Both factors are important. Even if $A_{NY}$ were known accurately (Newmeyer uses 160,000 for $A_{NY}$), if $\varepsilon_{NY}$ is very large (small) then the overall estimate will be too large (small). One could attempt to improve the method by using principal components of several indicators or a multivariate ratio estimator, but the critical dependence on items (1) and (2) would remain.

Newmeyer's estimated anchor for New York is critical and may be controversial. As he notes (Newmeyer, 1988),

    There seems to be a consensus among experts in New York that there are about 200,000 IVDUs in that metropolis. I disagree. I have worked hard at modelling the AIDS epidemic among IVDUs in New York, and find that a base population of 120,000 users is as large a figure as can adequately account for the observed small size of their AIDS caseload. Even to make that base number work, I have to assume (1) that the 1982-83 estimates of their HIV seropositivity were too high, (2) that their rate of progression from infection to AIDS diagnosis is no faster than for the gay men in the San Francisco hepatitis study, and (3) that fully 35% of all New York IVDUs who die of HIV-related causes are not enumerated in the AIDS caseload. In this paper, I have used a midpoint between my 120,000 and the 200,000 figure of New York experts. Since the whole analysis depends on the New York "anchor," my estimate of the nation's infected IVDU population would be 25% higher if the New York experts are right but it would be 25% lower if my New York model is the correct one.
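The ratio method summarized above, and its dependence on the anchor, can be sketched in a few lines of Python. The sketch assumes the definition of X_i used in this paper (the average of each SMSA's share of emergency room mentions and its share of medical examiner mentions, computed over all SMSAs) and assumes that V_i for an SMSA with problematic data is imputed as U_i V*/U*. The function names and all mention counts are hypothetical and are not DAWN data; only the anchor values of 120,000, 160,000, and 200,000 come from the text.

```python
def indicator_shares(er_mentions, me_mentions, impute_for=()):
    """Compute X_i = 0.5 * (U_i / U** + V_i / V**) for every SMSA."""
    kept = [s for s in er_mentions if s not in impute_for]
    u_star = sum(er_mentions[s] for s in kept)
    v_star = sum(me_mentions[s] for s in kept)
    # Assumed imputation for SMSAs with problematic data: V_i = U_i * V* / U*.
    v = {
        s: er_mentions[s] * v_star / u_star if s in impute_for else me_mentions[s]
        for s in er_mentions
    }
    u_all = sum(er_mentions.values())
    v_all = sum(v.values())
    return {s: 0.5 * (er_mentions[s] / u_all + v[s] / v_all) for s in er_mentions}


def ratio_estimates(shares, anchor_smsa, anchor_value):
    """Scale the shares so the anchor SMSA reproduces its external estimate."""
    r = anchor_value / shares[anchor_smsa]
    return r, {s: r * x for s, x in shares.items()}


# Hypothetical mention counts for four SMSAs (illustration only, not DAWN data).
er = {"New York": 4000, "Chicago": 1500, "Los Angeles": 1000, "Detroit": 500}
me = {"New York": 800, "Chicago": None, "Los Angeles": 300, "Detroit": 100}

shares = indicator_shares(er, me, impute_for=("Chicago",))
r_mid, est_mid = ratio_estimates(shares, "New York", 160_000)
r_high, _ = ratio_estimates(shares, "New York", 200_000)
r_low, _ = ratio_estimates(shares, "New York", 120_000)

print(f"sum of shares: {sum(shares.values()):.3f}")
print(f"total for all SMSAs, anchor 160,000: {r_mid:,.0f}")
print(f"change with anchor 200,000: {r_high / r_mid - 1:+.0%}")
print(f"change with anchor 120,000: {r_low / r_mid - 1:+.0%}")
```

Because every estimate is the product of R and a fixed share, any error in the anchor passes through proportionally to each SMSA and to the total, which is the sensitivity Newmeyer describes.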
Lange's estimate of the number of IVDUs in the 16 SMSAs with the largest $X_i$ is 910,000 compared to Newmeyer's national figure of 625,000. Newmeyer estimated a total of 140,000 seropositive IVDUs in the 16 SMSAs. Using New York City's estimate of 240,000 for the number of IVDUs there would increase Newmeyer's figures by 50 percent.

CONCLUSIONS

The accuracy of the estimates of the number of IVDUs is not objectively ascertainable, but the estimates (of about 1 million) could well be off by a factor of 2; that is, the true number could conceivably be smaller than 500,000 or greater than 2 million. The closeness of several of the estimates is not persuasive because they cannot be regarded as independent estimates. The question that persists is who counts as an IVDU; what definition is being used? The judgment made here about the accuracy is based on a review of the estimation methods. Other reasonable people could read the review presented here and come to different conclusions.

REFERENCES

Brookmeyer, R., and Gail, M. H. (1986) Minimum size of the acquired immunodeficiency syndrome (AIDS) epidemic in the United States. Lancet 2:1320-1322.
Brookmeyer, R., and Gail, M. H. (1988) A method for obtaining short-term projections and lower bounds on the size of the AIDS epidemic. Journal of the American Statistical Association 83:301-308.
Butynski, W., Record, N., Bruhn, P., and Canova, D. (1987) State Resources and Services Related to Alcohol and Drug Abuse Problems, Fiscal Year 1986. Washington, D.C.: NASADAD.
Centers for Disease Control (CDC). (1987) Human immunodeficiency virus infection in the United States: A review of current knowledge. Morbidity and Mortality Weekly Report 36(Suppl. S-6):1-48.
Dougherty, J. (1987) Estimates of Numbers of IV Drug Users Infected with HIV: Preliminary Data. Unpublished manuscript, December 5.
Gerstein, D. R. (1976) The structure of heroin communities in relation to methadone maintenance. American Journal of Drug and Alcohol Abuse 3:571-587.
Medley, G. F., Anderson, R. M., Cox, D. R., and Billard, L. (1987) Incubation period of AIDS in patients infected via blood transfusion. Nature 328:719-721.
Mosteller, F. (1977) Assessing unknown numbers: Order of magnitude estimation. In W. Fairley and F. Mosteller (eds.), Statistics and Public Policy. Reading, Mass.: Addison-Wesley.
Newmeyer, J. (1988) Estimating the Total of Seropositive IVDUs from SMSA Data. Unpublished manuscript, May 1988.
Schmeidler, J., and Frank, B. (n.d.) Estimating the Number of Narcotic Abusers. New York State Division of Substance Abuse Services, undated.
Schmeidler, J., Frank, B., Johnson, B., and Lipton, D. S. (1978) Seeking truth in heroin indicators: The case of New York City. Drug and Alcohol Dependence 3:345-358.
Shreckengost, R. C. (1983) Heroin: A New View. Washington, D.C.: U.S. Central Intelligence Agency.
Woodward, J. A., Retka, R. L., and Ng, L. (1984) Construct validity of heroin abuse indicators. The International Journal of the Addictions 19:93-117.
Wolter, K. M. (1986) Some coverage error models for census data. Journal of the American Statistical Association 81:338-346.
