Sample-Based Health Care Data

Some datasets provide health care data derived from sampling methods. These datasets do not capture every event; rather, they are based on a selected subset of events. Depending on the sampling approach used, estimates of total prevalence or incidence can be generated from them.

NCHS oversees three national sampling surveys of health care utilization that contain data relevant to poisoning injuries. These are:

  • National Hospital Discharge Survey (a national sample of hospital discharge records);

  • National Hospital Ambulatory Medical Care Survey (a national sample of hospital-based emergency departments and ambulatory care centers); and

  • National Ambulatory Medical Care Survey (a national sample of outpatient visits).

In addition to NCHS’s National Hospital Discharge Survey, the Agency for Healthcare Research and Quality (AHRQ) oversees its own hospitalization survey, the Healthcare Cost and Utilization Project (HCUP) National Inpatient Sample. Because all of these surveys are designed as representative, weighted samples, each can yield national estimates of health care utilization. The datasets include ICD-9 condition codes and are available electronically for downloading free of charge (in the case of AHRQ, the dataset can be queried free of charge without downloading). Because the survey data are collected annually with consistent sampling methods, they can be merged across years and used to track trends over time. Several of these datasets are particularly relevant to developing estimates of direct health care costs. Despite their potential as a rich source of surveillance data, relatively few peer-reviewed publications have used these surveys for poisoning and drug overdose surveillance (Klein-Schwartz and Smith, 1997; McCaig and Burt, 1999; Powell and Tanz, 2002; Rodriguez and Sattin, 1987).
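As a rough illustration of how a weighted sample yields a national estimate, the sketch below sums the sampling weights of records whose diagnosis falls in the ICD-9 poisoning range (960-989). The record layout, the field names `dx` and `weight`, and all values are hypothetical and invented for illustration; they do not reproduce the file format of any of the surveys, and real analyses should follow each survey's own documentation.

```python
from dataclasses import dataclass


@dataclass
class Discharge:
    dx: str        # first-listed ICD-9 diagnosis code (hypothetical field name)
    weight: float  # number of national discharges this sampled record represents


def is_poisoning(dx: str) -> bool:
    """True for ICD-9 categories 960-989 (drug poisoning and toxic effects)."""
    category = dx.split(".")[0]
    # E-codes and V-codes are not purely numeric, so they are excluded here.
    return category.isdigit() and 960 <= int(category) <= 989


def national_estimate(records) -> float:
    """Sum the weights of the sampled records that meet the case definition."""
    return sum(r.weight for r in records if is_poisoning(r.dx))


# Illustrative records only; not real survey data.
sample = [
    Discharge("965.4", 120.0),
    Discharge("410.9", 95.0),
    Discharge("980.0", 210.0),
    Discharge("E850.2", 80.0),
    Discharge("969.4", 150.0),
]

estimate = national_estimate(sample)
print(estimate)
```

In this toy sample, three of the five records meet the case definition, and their weights sum to an estimated 480 national discharges. The case definition deliberately uses only the diagnosis code; whether to also count external-cause (E) codes is an analytic choice that this sketch does not address.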

The datasets face the same ICD-9 coding limitations discussed in relation to national vital statistics data from death certificates. Moreover, because they are based on samples, uncommon events may go undetected or may be represented by so few sampled observations that estimates carry a wide margin of statistical error. Combining survey years can sometimes, but not always, address this shortcoming. Because of the sampling design, estimates for discrete geographic areas (e.g., individual states) usually cannot be generated from these surveys.
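The effect of small sample counts, and of combining years, can be sketched with a deliberately crude approximation: if the number of sampled cases n is treated as a Poisson count with equal weights, the relative standard error of the estimate is about 1/sqrt(n). This is an assumption made for illustration only. It ignores the clustering, stratification, and unequal weights of the actual survey designs, which generally make the true error larger. The years and counts below are invented.

```python
import math


def approx_rse(sampled_cases: int) -> float:
    """Rough relative standard error, 1/sqrt(n), under a Poisson count assumption."""
    if sampled_cases <= 0:
        raise ValueError("no sampled cases: the event is undetected in this sample")
    return 1.0 / math.sqrt(sampled_cases)


# Illustrative counts of sampled cases for an uncommon poisoning, by survey year.
cases_by_year = {1999: 9, 2000: 7, 2001: 9}

for year, n in cases_by_year.items():
    print(year, n, round(approx_rse(n), 2))

# Pooling the three years increases the number of sampled cases behind the estimate.
pooled = sum(cases_by_year.values())
print("pooled", pooled, round(approx_rse(pooled), 2))
```

Under these assumptions, each single year has a relative standard error of roughly one-third or more, while the pooled count of 25 cases brings it down to about one-fifth. The pooled figure describes the whole multi-year period rather than any single year, and if an event is sampled only a handful of times even across several years, pooling does not rescue the estimate.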

Copyright © National Academy of Sciences. All rights reserved.