A few notable studies have attempted to evaluate misreporting in broad-based, representative samples. However, lacking direct evidence on misreporting, these studies must rely on strong, unverifiable assumptions to infer validity rates. Biemer and Witt (1996), for instance, analyze misreporting in the NHSDA under the assumptions that (1) smoking tobacco is positively related to illegal drug use and (2) the rate of inaccurate reporting is the same for smokers and nonsmokers. Under these assumptions, they find false negative rates (defined here as the fraction of users who claim to have abstained) in the NHSDA that vary between 0 and 9 percent. Fendrich and Vaughn (1994) evaluate denial rates using panel data on illegal drug use from the National Longitudinal Survey of Youth (NLSY), a nationally representative sample of individuals who were ages 14–21 in the base year of 1979. Of the respondents to the 1984 survey who claimed to have ever used cocaine, nearly 20 percent denied use and 40 percent reported less frequent lifetime use in the 1988 follow-up. Likewise, of those claiming in 1984 to have ever used marijuana, 12 percent later denied use and just over 30 percent reported less lifetime use. These logical inconsistencies in the data are informative about validity if the original 1984 responses are correct.

These papers make important contributions to the literature. In particular, they illustrate the types of models and assumptions that are required to identify the extent of misreporting in the surveys.[12] Still, the conclusions are based on unsubstantiated assumptions. Arguably, smokers and nonsmokers may have different reactions to stigma and thus may respond differently to questions about illicit behavior. Arguably, the self-reports in the 1984 NLSY are not all valid.
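The panel-consistency logic behind a denial rate can be illustrated with a minimal Python sketch. It assumes each respondent contributes one "ever used" answer per wave; the function name `denial_rate` and the toy responses are hypothetical and are not drawn from the NLSY. Because lifetime use cannot be undone, a respondent who reports ever using in the earlier wave and never using in the later wave has given at least one invalid answer.

```python
from typing import Sequence


def denial_rate(earlier_ever_used: Sequence[bool],
                later_ever_used: Sequence[bool]) -> float:
    """Share of earlier-wave 'ever used' claimants who deny use in the later wave.

    Both sequences hold one answer per respondent, in the same order.
    """
    if len(earlier_ever_used) != len(later_ever_used):
        raise ValueError("both waves must cover the same respondents")

    # Keep the later-wave answers of respondents who claimed use earlier.
    later_among_claimants = [
        later
        for earlier, later in zip(earlier_ever_used, later_ever_used)
        if earlier
    ]
    if not later_among_claimants:
        raise ValueError("no respondent claimed use in the earlier wave")

    # A denial is a later-wave 'never used' answer from an earlier claimant.
    denials = sum(1 for later in later_among_claimants if not later)
    return denials / len(later_among_claimants)


if __name__ == "__main__":
    # Hypothetical toy data: four earlier claimants, one of whom later denies use.
    earlier = [True, True, True, True, False, False]
    later = [True, False, True, True, False, True]
    print(denial_rate(earlier, later))
```

The sketch shows only that the two waves disagree. It cannot tell which answer is wrong, which is why the inference about validity depends on taking the earlier responses as correct.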

To evaluate the fraction of users in the population in light of inaccurate response, I begin by imposing the following assumptions on misreporting rates:

IR-1: In any period t, no more than P percent of the self-reports are invalid. That is, $P[w_t = 0, y_t = 1] + P[w_t = 1, y_t = 0] \le P$.

IR-2: The fraction of false negative reports exceeds the fraction of false positive reports. That is, $P[w_t = 0, y_t = 1] \ge P[w_t = 1, y_t = 0]$.

These assumptions imply that the prevalence rate is bounded as follows:

$$P[w_t = 1] \le P[y_t = 1] \le \min\{P[w_t = 1] + P,\ 1\}.$$


The upper bound follows from IR-1, whereas the lower bound follows from IR-2.
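To see where the bounds come from, note that true prevalence equals the reported rate plus false negatives minus false positives: P[y_t = 1] = P[w_t = 1] + P[w_t = 0, y_t = 1] - P[w_t = 1, y_t = 0]. The minimal Python sketch below computes the resulting interval. It assumes that w_t denotes the self-report and y_t the true status, and that both the reported rate and the maximum invalid share P are expressed as fractions rather than percentages; the function name `prevalence_bounds` and the numerical inputs are hypothetical, chosen only for illustration.

```python
from typing import Tuple


def prevalence_bounds(reported_rate: float, max_invalid: float) -> Tuple[float, float]:
    """Bounds on true prevalence P[y_t = 1] under IR-1 and IR-2.

    reported_rate: self-reported prevalence P[w_t = 1], as a fraction in [0, 1].
    max_invalid:   upper limit P on the share of invalid self-reports (IR-1).
    """
    if not 0.0 <= reported_rate <= 1.0:
        raise ValueError("reported_rate must lie in [0, 1]")
    if not 0.0 <= max_invalid <= 1.0:
        raise ValueError("max_invalid must lie in [0, 1]")

    # IR-2: false negatives are at least as common as false positives,
    # so true prevalence cannot fall below the reported rate.
    lower = reported_rate

    # IR-1: false negatives can add at most max_invalid to the reported rate,
    # and a prevalence rate can never exceed 1.
    upper = min(reported_rate + max_invalid, 1.0)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical inputs: 25 percent report use, at most 12.5 percent invalid.
    print(prevalence_bounds(0.25, 0.125))
```

The width of the interval is governed entirely by the assumed limit on invalid reports: setting it to zero collapses the bounds to the reported rate, while a large value pushes the upper bound toward 1.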


[12] Biemer and Witt (1996) explicitly note this point when they state that their “objective is to investigate some capabilities and limitations of the methodology and demonstrate its use for surveys such as the NHSDA.”
