that can substantially protect against the introduction of systematic influences (i.e., bias) other than the intervention condition under consideration. High ratings on credibility are usually given to comparative designs that involve: (1) random assignment of participants to conditions (e.g., a needle exchange program versus "usual services"); (2) minimal attrition of participants from measurement; (3) measurement procedures that minimize the role of response biases; and (4) sufficient statistical sensitivity (i.e., statistical power).
It is unlikely that evaluations of needle exchange programs will ever be carried out with ideal controls that warrant high confidence in the conclusions that can be drawn from a single definitive study. There are at least two broad reasons for this: (1) multiple actions generally are initiated in a given community setting, making it difficult to separate the effects of a needle exchange program from those of other prevention efforts by studying time trends; and (2) the development of a comparative research design that relies on random assignment of individuals to receive needle exchange program services (or not) poses technical, ethical, and logistical difficulties. Given these limitations, it seems reasonable to explore alternative means of assessing the credibility of the evidence underlying claims about the effectiveness of needle exchange programs. Before doing so, however, it is useful to examine how previous research reviews have attempted to incorporate the traditional emphasis on design-induced control.
Two reviews were commissioned by the federal government and published in 1993: one by the U.S. General Accounting Office (1993) and one by the University of California at San Francisco (Lurie et al., 1993). Prior to 1993, a number of other studies were published (Des Jarlais et al., 1985; Stimson et al., 1988).
A close examination of the manner in which these studies were conducted strongly suggests that they relied on the quality of the evidence in individual studies, as judged by the strength of their research designs. The language of the assessments also reflects the expectation that, when studies are taken collectively, even though their designs are less than ideal, the preponderance of evidence will weigh in favor of or against a definitive conclusion about needle exchange programs.
Taken together, these studies tend to suggest that needle exchange programs have effects that are either neutral or positive and that they do not demonstrate any negative effects. However, each study's conclusions are often less than firm because of its methodological limitations.
When the designs of a group of studies are limited, little inferential clarity is gained by looking at the preponderance of evidence, even if it