5
Strengths and Weaknesses of Health
Insurance Data Systems for
Assessing Outcomes
LESLIE L. ROOS, NORALOU P. ROOS, ELLIOTT S. FISHER, and
THOMAS A. BUBOLZ
Health care data bases of varying scope and quality exist in a number of dif-
ferent settings: research groups, hospitals, insurers, and governmental agencies.
Of particular interest are the data generated by health insurance systems in
North America, Europe, Australia, and New Zealand. Because health care data
collected for administrative purposes are ever more available and less expensive
to analyze, it is not surprising that such data bases are increasingly used in
technology assessment and health policy research (1,2,3). Moreover, their use is
explicitly advocated in the Patient Outcomes Research Team approach, estab-
lished by the Agency for Health Care Policy and Research.
What kind of information from administrative data bases is useful for clinical
analyses? Many American data bases, such as Medicare, commonly provide the
following data from hospital discharge abstracts:
*Some of the material in this paper has appeared in: Roos LL, Sharp SM, Cohen
MM, Wajda A. Risk adjustment in claims-based research: The search for efficient
approaches. Journal of Clinical Epidemiology 1989;42:1193-1206; and Roos LL.
Nonexperimental data systems in surgery. International Journal of Technology
Assessment in Health Care 1989;5:341-386; and Rutkow IM (ed). Socioeconomics of
Surgery. St. Louis: C.V. Mosby, 1989.
This paper was supported by the Institute of Medicine, by Career Scientist Awards
from Health and Welfare, Canada (to Leslie L. Roos and Noralou P. Roos), and by grants
from Health and Welfare, Canada (6607-1197-44) and from the National Center for
Health Services Research (5 R18 HS-05745).
Patient-identifying information:
· Date of Birth
· Sex
· Place of Residence
· Identifying Number (individual or family)
Other items for analysis include:
· Discharge Diagnoses (several)
· Procedures Performed in Hospital (several)
· Hospital
· Date of Admission
· Date of Discharge
· Discharge Code (death, another hospital, home, etc.)
Secondary items include:
· Admitting Physician Identifying Number
· Physician Performing Each Procedure (identifying number)
LESLIE L. ROOS ET AL.
Physician claims typically identify the patient, the service rendered, date of the
service, and the physician. The major data bases are designed to describe
patient characteristics, diagnoses, and treatments. One reason for incomplete-
ness of data is that hospitals lack motivation to record information that does not
have an immediate impact on reimbursement. An ideal data base would have
the following characteristics:
· System-wide coverage of an entire population. Government-organized
insurance systems are typically individual-based. Such coverage includes care
received at a wide variety of institutions and from the whole universe of health
care providers. Coverage of an entire population permits study of utilization
from an epidemiologic perspective, attributing use to individuals according to
place of residence, no matter where the services are provided. Subgroups or
whole populations can be compared to see how much of any given resource is
used. Such population-based data can be adjusted for age, sex, and other char-
acteristics to facilitate comparisons.
· Unique identifying number (or combination of identifiers). When each
person is identified in this manner, usage can be cumulated for each person,
wherever care is received. This data base should record all contacts with the
health care system for each individual, with the unique identifier available to
facilitate tracing. Ideally, the data base would record all hospital care, both
inpatient and outpatient, services in free-standing surgery centers, activities in
physician offices, entry to nursing or personal care home, health care received
at home, and prescription drug use. Thus, an individual having surgery in one
setting who is readmitted to a second institution will have both contacts cap-
tured by the system.
HEALTH INSURANCE DATA SYSTEMS
· Enrollment or registry file. A file specifying when and why each individual's
coverage begins and ends is very useful. Such a file is necessary to tell
whether an individual with no recorded contact with the health care system
resided in the jurisdiction and indeed had no contact; left the jurisdiction; or
died. This type of file helps to determine the percentage of individuals enjoying
intervention-free survival, that is, survival without any contact with the health
care system.
· Comprehensiveness. Data bases can be characterized by their comprehen-
siveness. Some aspects of comprehensiveness can determine the design of any
study, from relatively simple to relatively complex. At the simplest level of
administrative data bases (Level 3), only hospital discharge abstracts are needed
(4). Level 3 data can support studies of length of stay and in-hospital mortality;
when combined with coverage of a population, such information permits analy-
ses of utilization across medical market areas. At the intermediate level, Level
2 data require consistent individual identifiers on hospital discharge abstracts.
Hospital claims can be sorted by date and identifying number to generate hospi-
talization histories for each individual. Level 2 data can thus be used for short-
term outcome studies of readmissions and complications after surgery. Such
research on quality assurance and cost control can provide timely feedback to
health care institutions. The most comprehensive Level 1 data bases possess all
the features of the Level 2 and 3 files and include an enrollment file with dates
for startup, death, and leaving the insurance plan. Longitudinal studies can
follow individuals' health care utilization through time (see Table 5.1).
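The Level 2 step described above, sorting hospital claims by identifying number and date to generate hospitalization histories, can be sketched in a few lines. This is a minimal illustration and not the authors' software: the record layout (patient_id, hospital, admit, discharge), the sample records, and the 30-day readmission window are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

def hospitalization_histories(abstracts):
    """Group discharge abstracts by patient and sort each history by admission date."""
    histories = defaultdict(list)
    for rec in abstracts:
        histories[rec["patient_id"]].append(rec)
    for stays in histories.values():
        stays.sort(key=lambda r: r["admit"])
    return dict(histories)

def readmissions(histories, window_days=30):
    """Return (patient_id, prior_discharge, readmit_date) for each stay that begins
    within window_days of the previous discharge, at any hospital."""
    found = []
    for pid, stays in histories.items():
        for prev, nxt in zip(stays, stays[1:]):
            gap = (nxt["admit"] - prev["discharge"]).days
            if 0 <= gap <= window_days:
                found.append((pid, prev["discharge"], nxt["admit"]))
    return found

# Hypothetical abstracts; identifiers, hospitals, and dates are invented.
abstracts = [
    {"patient_id": "A1", "hospital": "H2", "admit": date(1985, 3, 20), "discharge": date(1985, 3, 25)},
    {"patient_id": "A1", "hospital": "H1", "admit": date(1985, 3, 1), "discharge": date(1985, 3, 8)},
    {"patient_id": "B7", "hospital": "H1", "admit": date(1985, 1, 5), "discharge": date(1985, 1, 9)},
]
histories = hospitalization_histories(abstracts)
readmits = readmissions(histories)
```

Because the histories are keyed on the person rather than the hospital, the readmission of patient A1 to a second institution is captured, which is the point the text makes about consistent identifiers.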
A Level 1 system offering complete coverage for a population can often pro-
vide large samples and impressive follow-up capabilities, whether the care be
ambulatory, community, or hospital based. The proportion of individuals enjoy-
ing intervention-free survival can also be ascertained. The ability to develop
individual longitudinal histories (before and after an event or index hospitaliza-
tion) permits identifying first-time occurrences in a population. These incident
cases present a more homogeneous group for study; a second operation or
recurrence of a condition can be distinguished from new events. Alternative
treatments and different hospitals can be compared and analyses carried out
across medical market areas on a per-person basis.
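The adjustment of population-based data for age and sex, mentioned earlier as a way to facilitate comparisons across areas, is commonly done by direct standardization. The sketch below shows that technique under stated assumptions: the stratum labels, counts, and standard population are invented, and the chapter does not say which adjustment method the authors used.

```python
def crude_rate(events, persons):
    """Events per person, ignoring the age-sex mix of the population."""
    return sum(events.values()) / sum(persons.values())

def directly_standardized_rate(events, persons, standard_pop):
    """Weight each stratum-specific rate by that stratum's share of a standard
    population. All three arguments are dicts keyed by stratum (age-sex group)."""
    total_std = sum(standard_pop.values())
    rate = 0.0
    for stratum, std_n in standard_pop.items():
        stratum_rate = events[stratum] / persons[stratum]
        rate += stratum_rate * std_n / total_std
    return rate

# Hypothetical data for two market areas with identical stratum-specific rates
# but different age structures.
standard_pop = {"M65-74": 500, "M75+": 500}
area_x = {"events": {"M65-74": 10, "M75+": 40}, "persons": {"M65-74": 1000, "M75+": 1000}}
area_y = {"events": {"M65-74": 5, "M75+": 60}, "persons": {"M65-74": 500, "M75+": 1500}}

crude_x = crude_rate(area_x["events"], area_x["persons"])
crude_y = crude_rate(area_y["events"], area_y["persons"])
adj_x = directly_standardized_rate(area_x["events"], area_x["persons"], standard_pop)
adj_y = directly_standardized_rate(area_y["events"], area_y["persons"], standard_pop)
```

In this invented example the crude rates differ only because area Y has more older residents; after standardization the two areas have the same rate.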
STRENGTHS
System-Wide, Population-Based Data
System-wide coverage allows us to monitor the effectiveness of clinical
treatments. Since administrative data bases are not limited to specific
institutions, they include poor health outcomes which occur following discharge from
an institution. This makes possible comparative studies of outcomes from insti-
tutions with very different lengths of stay. Because administrative data bases
TABLE 5.1 Data requirements and types of studies using hospital data
Simple (Level 3)
Data requirements: hospital discharge abstracts
Types of studies:
· In-hospital Mortality (volume-outcome comparisons, monitoring of individual hospitals)
· Length of Stay
· Small-Area Analyses
· Changes over Time
Intermediate (Level 2)
Data requirements: hospital discharge abstracts and consistent individual identifiers
Types of studies:
· Timely Longitudinal Research
· Short-Term Readmissions
· Volume-Outcome Comparisons
· Monitoring of Individual Hospitals
· Quality Assurance and Cost Control
Comprehensive (Level 1)
Data requirements: hospital discharge abstracts, consistent individual identifiers, and enrollment file
Types of studies:
· Highest Quality Longitudinal Research
· Shortest-Term and Long-Term Outcome Studies
· Identification of Incident Cases
· Volume-Outcome Comparisons
· Monitoring of Individual Hospitals
· Choice of Treatment
· Small-Area Analysis by Person
SOURCE: Rutkow IM (ed), Socioeconomics of Surgery. St. Louis: C. V. Mosby, 1989.
cover care received by multiple providers, complications which might not be
picked up in any individual practice can be detected. For example, almost half
(42.6 percent) of the Manitoba surgeons performing repeat resections were not the
physicians who had performed the original prostatectomies (5). Patients may
not return to a physician if they are dissatisfied or have poor outcomes on a
treatment he or she has prescribed; without system-wide follow-up, physicians
may overestimate the positive aspects of their treatment.
Efficacy versus Effectiveness
This problem can be stated simply: treatments that produce excellent out-
comes in a research setting (efficacious) may not be beneficial (effective) when
applied to a different spectrum of patients in clinical situations. Community
hospital practices and medical care outcomes may differ widely from those pub-
licized by researchers at academic centers.
Research on efficacy of procedures or the results of the so-called "best"
situation (generally a teaching hospital) are usually reported in studies of
technology assessment (6). But technology assessment is not well developed; a lack of
information on the efficacy of many procedures (7) may make physicians
uncertain about choice of treatment (8).
The rarity of a condition (9) almost always presents problems in assessing
efficacy by randomized clinical trials. Non-experimental research may show
clinical trials to be "so difficult to organize or so costly as to be impractical"
(10). Even when clinical trials have been performed, non-experimental data
bases can play a valuable role. For example, population-oriented data bases
facilitate long-term follow-up of clinical trials. Claims research can also help
specify the relevance of clinical trials. If clinical trials have too stringent crite-
ria for entry, actual physician practice may be so different that the results are
only partially applicable.
Evaluation of quality of care in both types of settings can be made easier by
population-based data, since studies of efficacy and effectiveness can use the
same non-experimental data systems. Effectiveness studies that present out-
come results from representative samples of all hospitals and all physicians are
rare. Administrative data can be particularly valuable for such research.
Large Numbers and Time Series
An added benefit of the system-wide coverage characteristic of administra-
tive data bases is the large numbers of cases and controls which can typically be
identified. Administrative data bases help expand the number and type of out-
comes traced. In other words, if the mortality rate is too low to permit a statisti-
cally strong analysis, we can study additional poor outcomes, including
complications reflected in hospital readmissions and patterns of physician visits.
If there are insufficient cases in a given year, additional years can be exam-
ined. Thus, a population can be tracked over a longer period to accumulate
enough events to permit analyses. This is especially useful with rare conditions,
such as infective endocarditis (11).
Ongoing health insurance systems add a new set of observations every year.
Potentially, analysts can go back to the beginning to find those items of infor-
mation which are routinely recorded. These long series of data allow retrospec-
tive cohort studies. For example, in 1989 a researcher can go back to surgery
cases recorded in 1979 and do a 10-year follow-up.
Long-term studies of health outcomes can give very different assessments of
the efficacy of a given procedure. A workshop convened by the National
Institutes of Health (12) suggested that after transurethral prostatectomy, "the
need for further operative treatment is uncommon"; however, the cumulative
eight-year probability of having a second operation was recently found to be
20.2 percent (13).
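A cumulative probability of reoperation over several years is typically estimated with a life-table or Kaplan-Meier method, so that people who die or leave coverage before the end of follow-up are handled as censored. The sketch below illustrates the Kaplan-Meier calculation only; it is an assumption that this resembles the approach behind the figure cited above, and the follow-up data are invented.

```python
def cumulative_probability(follow_up, horizon):
    """Kaplan-Meier estimate of the probability of an event by `horizon` years.

    follow_up: list of (years_observed, had_event) pairs. had_event False means
    the person was censored (died, left coverage, or reached the end of the study).
    """
    surviving = 1.0
    event_times = sorted({t for t, had_event in follow_up if had_event and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for years, _ in follow_up if years >= t)
        events = sum(1 for years, had_event in follow_up if had_event and years == t)
        surviving *= 1 - events / at_risk
    return 1 - surviving

# Hypothetical cohort: (years from first operation, second operation observed?)
cohort = [(1.0, True), (2.0, False), (3.0, True), (8.0, False), (8.0, False)]
p_two_year = cumulative_probability(cohort, 2)
p_eight_year = cumulative_probability(cohort, 8)
```

The censoring dates in such a calculation would come from an enrollment file of the kind described earlier, which is why Level 1 data are needed for long-term outcome studies.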
Administrative data can also be used before an event of interest to define
incident cases. A study of infective endocarditis listed all Manitoba patients
hospitalized with the condition from April 1, 1979, to March 31, 1985. Then
"incident cases were identified by eliminating those individuals with previously
diagnosed infective endocarditis in the April 1, 1976 - March 31, 1979 period" (11).
In similar fashion, data on histories can help create clean comparison groups.
For example, in a study of whether tubal ligation increases a woman's risk of
having a hysterectomy, Cohen (14) identified as her control group a random
sample of women aged 25 to 44 and eliminated all individuals who had a hys-
terectomy prior to July 1974 or tubal ligation from 1970 through 1982.
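The look-back logic used to define incident cases can be written down directly. The sketch below uses the study and look-back periods quoted above for infective endocarditis; the function name, the (patient_id, admission_date) record layout, and the sample records are hypothetical.

```python
from datetime import date

def incident_cases(hospitalizations, study_start, study_end, lookback_start):
    """Return ids of patients hospitalized with the condition during the study
    period who had no such hospitalization during the look-back period.

    hospitalizations: iterable of (patient_id, admission_date) pairs, already
    restricted to the diagnosis of interest.
    """
    records = list(hospitalizations)
    prior = {pid for pid, admit in records if lookback_start <= admit < study_start}
    in_study = {pid for pid, admit in records if study_start <= admit <= study_end}
    return in_study - prior

# Hypothetical hospitalizations with the diagnosis of interest.
records = [
    ("P1", date(1977, 6, 1)),   # look-back period
    ("P1", date(1980, 2, 1)),   # study period, but previously diagnosed
    ("P2", date(1981, 5, 5)),   # study period, no earlier record
    ("P3", date(1978, 1, 1)),   # look-back period only
]
cases = incident_cases(
    records,
    study_start=date(1979, 4, 1),
    study_end=date(1985, 3, 31),
    lookback_start=date(1976, 4, 1),
)
```

The same set difference, applied to procedures rather than diagnoses, gives the kind of clean comparison group described for the tubal ligation study.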
The time series characteristics of data bases can also be used to characterize
individuals by health care usage/morbidity patterns to develop measures of
case-mix adjustment. This application of administrative data bases is treated in
detail in the section on risk adjustment.
Events Unaffected by Recall
We know that patient reports of drug exposure, hospitalizations, physician
visits, and medical conditions are subject to recall biases. Ray and Griffin (15)
note that a primary "strength of Medicaid data for pharmacoepidemiology is the
availability of detailed pharmacy records from which drug exposure history can
be constructed." Most evidence suggests that events such as hospitalizations
and physician visits are well recorded in health insurance systems. As dis-
cussed later, diagnoses recorded in the administrative data have limitations,
which are often related to characteristics of medical practice; two physicians
seeing the same patient will sometimes diagnose different entities. Overall,
diagnoses recorded in the claims system are physician-originated and likely as
accurate as patient self-reports.
Accurate recording of past health events is critical in developing lifetime
estimates of an exposure (such as x-ray usage) or when timing of an event is
important. Thus, in assessing the effectiveness of influenza vaccine, it is impor-
tant to know whether the vaccine was delivered and if it was given during the
appropriate period.
Unobtrusive Nature
A great advantage of administrative data is that they permit relatively unob-
trusive research. Because studies using these data are done as statistical analy-
ses, patient consent is not sought. There are no biases because persons refuse to
participate or because patients, providers, or data collectors know about the
study. This is important. The biases that arise when subjects know to which
group they have been randomized, or even when participants know they are
involved in a study, have been discussed elsewhere (16). Hertzman (17), for
example, has shown that information on health status from an occupational
group explicitly under study may differ from that obtained from a population
unaware of the purpose of the study.
Multiple Comparisons
The fact that individuals are not randomly assigned to comparison groups
raises questions as to the comparability of individuals and hospitals being
studied. Such questions arise regardless of the method of risk adjustment
(administrative data, chart review, physical examination, etc.). Administrative data
generally give several ways to test the consistency of findings after risk adjustment.
Hypotheses often can be tested among a number of subgroups in the popula-
tion. In a recent study of prostatectomy, N. Roos et al. (2) found higher mortali-
ty among men having transurethral prostatectomies (the more accepted opera-
tion) than among those having open prostatectomies (the older operation). The
risk-adjusted results from one Manitoba teaching hospital held for all men hav-
ing prostatectomies and for a subgroup of the healthiest men. Testing across
populations is also helpful. Comparisons using administrative data from four
countries confirmed the findings of differential mortality after transurethral and
open prostatectomies.
Statistical models can also be compared. When several covariates are avail-
able, a number of regression models can be tested for consistency. If relative
risk of mortality or another dependent variable does not change as covariates
are entered or deleted, faith in the findings is increased (18,19).
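The idea of checking whether a relative risk holds up as covariates are added can be illustrated without a full regression. The sketch below swaps in a simpler technique, comparing a crude risk ratio with a Mantel-Haenszel risk ratio stratified on one covariate; this is a stand-in for the regression comparisons the text describes, and all counts are invented.

```python
def crude_risk_ratio(strata):
    """Risk ratio from the pooled table, ignoring the stratifying covariate.
    Each stratum is (exposed_events, exposed_total, unexposed_events, unexposed_total)."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)

def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio across strata of a covariate."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator

# Hypothetical deaths by type of operation, stratified by a risk group.
strata = [
    (20, 1000, 10, 1000),   # lower-risk patients
    (40, 500, 60, 1500),    # higher-risk patients
]
crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
```

If the crude and adjusted estimates are close, the finding is insensitive to that covariate; in this invented example they differ (about 1.43 versus 2.0), which would signal that the covariate matters and should stay in the model.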
Design Flexibility
Researchers designing cohort and case-control studies must deal with critics,
friendly and otherwise, who suggest changes in the design of their study. One
great advantage of administrative data is design flexibility, the ability to alter a
research design with little difficulty. For example, changes in definition of
exposure may involve: (a) altering the time period during which an intervention
(such as immunization for flu) is seen to be relevant, and (b) redefining control
groups to make them parallel with the group receiving a treatment. The vari-
ables used for matching purposes can be easily changed. Finally, several
designs can be used with the same administrative data base. For example, a
planned study of the efficacy of influenza vaccination will utilize both cohort
and case-control designs constructed from the Manitoba data base. Similar
design flexibility should be possible using the Medicare data.
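Design flexibility of this kind amounts to treating the exposure definition as a parameter. The sketch below shows one way to do that for an immunization window; the function name, record layout, dates, and window lengths are hypothetical and are not taken from the planned Manitoba study.

```python
from datetime import date, timedelta

def exposed_ids(vaccinations, season_start, window_days):
    """Ids of persons vaccinated within window_days before season_start.

    vaccinations: iterable of (patient_id, date_given) pairs from claims.
    """
    earliest = season_start - timedelta(days=window_days)
    return {pid for pid, given in vaccinations if earliest <= given < season_start}

# Hypothetical vaccination claims.
vaccinations = [
    ("P1", date(1988, 10, 15)),
    ("P2", date(1988, 6, 1)),
    ("P3", date(1988, 11, 20)),
]
season_start = date(1988, 12, 1)

narrow = exposed_ids(vaccinations, season_start, window_days=60)
wide = exposed_ids(vaccinations, season_start, window_days=200)
```

Responding to a critic who prefers a different relevant period then means rerunning the analysis with another value of window_days rather than collecting new data.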
Potential for Multiple Projects
Because administrative data systems are not designed for specific studies,
they can be valuable for multiple projects. For example, a set of files originally
developed to examine the short-term outcomes associated with cholecystectomy
were subsequently used in a study to develop computerized methods for moni-
toring readmissions following surgery, changes over time in quality of care over
a 10-year period, quality of care, and health care outcomes in the native and
non-native communities. More recently, the literature suggests there may be an
elevated risk of heart disease following cholecystectomy; these files will again
be reassessed.
The flip side of having a data set with the potential for multiple analyses is the
risk of being accused of being atheoretical and opportunistic in one's research. Because
health care data bases are not closely restricted in subject matter and because
there are limits to the type of data they make available, studies should be tai-
lored to their strengths. For example, coding systems do not identify laterality,
so studying outcomes of procedures which can be performed only on one part of
the body (prostatectomy) is much easier than studying conditions where a sec-
ond procedure will not necessarily represent a complication but may be a new
event (total hip replacement or cataracts).
Focus on Risks
Administrative data banks, by their focus on health care interventions, make
possible more accurate assessment of risks associated with treatment (mortality;
readmissions; specific sequelae such as prostate revisions, stricture dilations,
etc.). Given the uncertainties surrounding major areas of medical treatment such
as bypass surgery and carotid endarterectomy, it might be appropriate to
concentrate on comparing the risks associated with new and established treatments
until firm data on the benefits of medical treatment are developed.
Clinical Decision Making
Models of the clinical decision process must present choices the way clini-
cians do. These decisions may or may not require specific test results. Thus,
the adequacy of administrative data for decision models depends on the condi-
tion and the procedure studied. Successful modeling has been carried out for
medical versus surgical treatment of infective endocarditis (11) and for watchful
waiting versus surgery for prostate disease (20).
The decision tree for modeling treatment of infective endocarditis highlights
the usefulness of claims data. The data base provided estimates of the probabil-
ities of a number of events after two strategies: early surgery or attempted med-
ical cure. Variables used in the decision tree included probabilities of operative
mortality, probabilities of events (including dying and congestive heart failure)
before or soon after four weeks of antibiotic therapy, probabilities of events
occurring long after completion of antibiotic therapy, and life expectancies in
weeks under different treatment regimes.
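A decision tree of this kind is evaluated by weighting each branch's life expectancy by its probability. The sketch below shows that calculation for two strategies. Every probability and life expectancy in it is invented for illustration; none comes from the infective endocarditis study, and the tree is far simpler than the one described.

```python
def expected_value(node):
    """Expected life expectancy (in weeks) of a decision-tree node.

    A leaf is a number; a chance node is a list of (probability, child) pairs
    whose probabilities must sum to 1.
    """
    if isinstance(node, (int, float)):
        return float(node)
    total_p = sum(p for p, _ in node)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(p * expected_value(child) for p, child in node)

# Hypothetical trees (probability, outcome in weeks of life expectancy).
early_surgery = [
    (0.1, 0),      # operative death
    (0.9, 500),    # survives operation
]
medical_cure = [
    (0.2, 0),                            # dies during antibiotic therapy
    (0.8, [(0.25, 200), (0.75, 600)]),   # later event versus event-free course
]

ev_surgery = expected_value(early_surgery)
ev_medical = expected_value(medical_cure)
```

In an analysis like the one described, the probabilities at each chance node would be estimated from the claims data rather than assumed.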
WEAKNESSES
Structural Limitations
Administrative data sets generally allow the collection of a fixed amount of
information on all events for all people covered by an insurance program.
These data sets are designed to answer such questions as who receives treat-
ment, when was the treatment given, where was the treatment given, who gave
the treatment, what was the treatment, and how much did it cost.
Administrative data typically have structural limitations inherent in the
record layout, available codes, and coding regulations. Such limitations can be
overcome only through structural changes in either the record or the regulations.
Several coding issues are of interest to the researcher. A single surgical proce-
dure or hospitalization may result in a single hospital record and one or more
physician bills. Linkage of the surgeon's claim to the hospital claim has been
shown to be an excellent way to check on the reliability of the coding of hospital
procedures.
At the same time, for many procedures and diagnoses, different codes may
plausibly be used to describe the same event. One physician or insurance carrier
may prefer a given code, while others use different codes. Because several
physicians often submit bills for treating the same condition in the same patient
(surgeon, assistant, anesthetist), there is a real possibility that different codes
will be used. However, multiple bills for the same event offer another way to
confirm the occurrence of events or test the reliability of the initial claim.
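Checking a hospital record against the physician bills for the same event can be sketched as a simple linkage on patient and date. The example below is a minimal illustration under strong assumptions: field names and records are hypothetical, and procedures are given as plain labels, as if hospital ICD-9-CM codes and physician tariff codes had already been mapped to a common vocabulary.

```python
from datetime import date

def linked_agreement(hospital_records, physician_claims):
    """For each hospital record, find physician claims for the same patient dated
    within the stay and report (patient_id, claims_found, procedure_confirmed)."""
    results = []
    for rec in hospital_records:
        matches = [
            c for c in physician_claims
            if c["patient_id"] == rec["patient_id"]
            and rec["admit"] <= c["service_date"] <= rec["discharge"]
        ]
        confirmed = any(c["procedure"] == rec["procedure"] for c in matches)
        results.append((rec["patient_id"], len(matches), confirmed))
    return results

# Hypothetical records.
hospital_records = [
    {"patient_id": "A1", "admit": date(1984, 5, 1), "discharge": date(1984, 5, 6),
     "procedure": "prostatectomy"},
    {"patient_id": "B2", "admit": date(1984, 7, 10), "discharge": date(1984, 7, 15),
     "procedure": "cholecystectomy"},
]
physician_claims = [
    {"patient_id": "A1", "service_date": date(1984, 5, 2), "procedure": "prostatectomy"},
    {"patient_id": "A1", "service_date": date(1984, 5, 2), "procedure": "anesthesia"},
    {"patient_id": "B2", "service_date": date(1984, 7, 11), "procedure": "hernia repair"},
]

results = linked_agreement(hospital_records, physician_claims)
agreement_rate = sum(1 for _, _, ok in results if ok) / len(results)
```

Records with several matching bills but no confirming procedure, like patient B2 here, are the ones worth reviewing for coding differences.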
The precision of codes varies across conditions. For many conditions, the
ICD-9-CM hospital codes, used in the Medicare and Manitoba data bases, and
tariff codes, such as CPT, are highly precise in their specification of the proce-
dure and the clinical problem. Examples include transurethral prostatectomy
and carotid endarterectomy. Studying these conditions or treatments through the
claims data is relatively straightforward.
Sometimes the tariff codes are more precise than the ICD-9-CM codes; this
seems to be the case with hip repair procedures. Other procedures may be poor-
ly classified on hospital and physician claims and may be more problematic to
study; vascular surgery presents difficulties in this regard. Diagnoses generally
are less precise than most researchers would prefer; "congestive heart failure"
and "diabetes mellitus" encompass broad ranges of severity that may mask
important clinical subgroups.
The detail of the coding conventions may be inadequate for some studies.
The ICD-9-CM coding system does not distinguish procedures performed on the
left side of the body from those done on the right side. This makes it more diffi-
cult to assess the results of orthopedic surgery; a second hip or knee replacement
operation on someone who has already had one may mean either a reoperation
or an operation on the other extremity (21).
Moreover, the data captured by administrative data systems may not be those
of most interest to health outcome studies. While the data system may record
the occurrence of certain events (laboratory tests, x-rays, pap smears, etc.), the
results of these tests typically are not available in an administrative data system.
In fact, before beginning a study with an administrative data bank, the key
question is: "Can the event of interest be defined in the system and are key out-
comes captured?" The answer depends on the specific recording systems used.
It may be several years before a new procedure, such as angioplasty, is accu-
rately recorded in this system.
Finally, the timing of diagnoses during a hospitalization cannot be deter-
mined from the discharge abstract alone. Consequently, conditions that develop
in the course of treatment cannot be distinguished from comorbid conditions
present at the time of admission, an important distinction for risk-adjusted
outcome analyses (22,23). For example, Medicare patients who develop a
pulmonary embolus after surgery cannot be reliably distinguished from those who
had the condition before the operation. Other data systems (such as that in
Manitoba) may be able to make this distinction by linking physician claims.
Ongoing work is directed toward estimating the probability that such conditions
will develop during surgical hospitalization for a number of procedures.
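One way linked physician claims might help separate comorbidity from complications is to ask whether a discharge diagnosis was already recorded before admission. The sketch below illustrates that idea only; the one-year look-back, the diagnosis labels, and the function name are assumptions, and the text does not describe the Manitoba procedure in this detail.

```python
from datetime import date, timedelta

def split_diagnoses(discharge_dx, prior_claims, admit, lookback_days=365):
    """Split discharge diagnoses into those seen on physician claims during the
    look-back period before admission and those not seen.

    prior_claims: iterable of (diagnosis, service_date) pairs for the patient.
    Returns (recorded_before_admission, not_recorded_before_admission).
    """
    earliest = admit - timedelta(days=lookback_days)
    seen_before = {dx for dx, when in prior_claims if earliest <= when < admit}
    recorded_before = [dx for dx in discharge_dx if dx in seen_before]
    not_recorded = [dx for dx in discharge_dx if dx not in seen_before]
    return recorded_before, not_recorded

# Hypothetical patient.
discharge_dx = ["diabetes mellitus", "pulmonary embolus"]
prior_claims = [
    ("diabetes mellitus", date(1983, 11, 3)),
    ("pulmonary embolus", date(1980, 1, 1)),   # outside the look-back period
]
comorbid, possibly_new = split_diagnoses(discharge_dx, prior_claims, date(1984, 2, 1))
```

A diagnosis absent from earlier claims is only a candidate complication; it may also be a pre-existing condition that never generated a claim.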
Bias Due to Reporting and Coding
There are several threats to the validity of claims data (24). The data
submission process and coding of the data can lead to reporting and coding errors.
However, financial incentives for providers to assure adequate reimbursement
and for funding agencies to minimize expenditures provide some protection
against lost or inaccurate data. Another source of bias is that contacts with the
system generating the data have to be initiated by someone, often the patient.
The probability of contact with the system may be affected by hospital and
physician supply.
The accuracy of procedural and diagnostic data depends upon both the
physicians and the clerks involved. American Medicare data appear to record
procedures performed with fair accuracy, particularly if the "order of proce-
dure" is ignored. Medicare data quality may have gone up since the introduc-
tion of the Prospective Payment System, but diagnostic information may not be
as accurate as in the Manitoba files (25,26,27). Medicare data also do not
include outpatient information in the hospital file. In Manitoba, both surgical
procedures performed in hospitals and discrete billable items (even if not major
events, including tests such as pap smears) appear to be reliably captured in the
claims system.
The quality of diagnostic data also depends upon the source. Diagnoses on
hospital records are likely to be more accurate than diagnoses on claims gener-
ated by physician's visits. In Manitoba, diagnoses are noted with a reasonable
degree of accuracy and specificity in the hospital system, reflecting the profes-
sional training of medical records technicians. A comparison of diagnoses
recorded on hospital records with those reported in the claims showed 95
percent correspondence in gallbladder disease, and 89 to 92 percent
correspondence in a study of acute myocardial infarction (28,29).
Although Medicare does not include ambulatory care diagnoses with the
physician claims, other data systems may contain this diagnostic information.
Such diagnoses are useful at a more general level. One fruitful approach in
Manitoba has been to group diagnoses available from physician claims (for
example, contacts for gynecologic problems in a study of women undergoing
hysterectomy, and gallbladder disease and abdominal pain for a study of con-
tacts before and after gallbladder surgery) rather than to attempt fine diagnostic
distinctions (25,30).
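Grouping physician-claim diagnoses into broad categories can be done with a small lookup on code prefixes. The sketch below is illustrative only: the grouping shown (ICD-9 categories 574 and 575 for gallbladder disease, 789.0 for abdominal pain) is an assumption for the example and is not the grouping used in the Manitoba studies.

```python
from collections import Counter

# Illustrative groups keyed by ICD-9 code prefix (an assumption, see above).
GROUPS = {
    "gallbladder disease": ("574", "575"),
    "abdominal pain": ("789.0",),
}

def diagnosis_group(code):
    """Return the broad group whose prefix matches the diagnosis code, else 'other'."""
    for group, prefixes in GROUPS.items():
        if code.startswith(prefixes):
            return group
    return "other"

def contacts_by_group(codes):
    """Count physician contacts by broad diagnostic group."""
    return Counter(diagnosis_group(code) for code in codes)

# Hypothetical diagnosis codes from one patient's physician claims.
contacts = ["574.1", "789.0", "575.0", "401.9"]
summary = contacts_by_group(contacts)
```

Counting contacts per group before and after surgery then requires only splitting the claims at the operation date and calling contacts_by_group on each part.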
Bias Due to Differential Contact
As noted earlier, contact bias is a threat to the interpretation of claims data;
the individual rather than the researcher generally initiates contact with the sys-
tem generating the data. Thus, a person who is ill, but has no contact with the
health care system, does not produce a record on this episode of illness or
chronic condition. Such contact can be important for studying outcomes. For
example, Manitoba research has used readmission to hospital in the three
months after hysterectomy as an indication of post-surgical complications.
The probability of an individual contacting a physician or being hospitalized
varies with certain system characteristics (such as insurance coverage and sup-
ply factors), individual characteristics (care-seeking behavior), and physician
factors (propensity to hospitalize) (31). Given universal insurance, relatively
few ill individuals lack contact with the health care system when the
measurement period is several years (32). In the United States, however, co-payment is
likely to accentuate contact bias. Poorly covered individuals may be precisely
those who receive the poorest care; analyses thus may underestimate poor out-
comes.
Supply factors are important and readily studied. Assuming similar insur-
ance coverage for all members of a political unit, the supply of physicians and
hospital beds has been shown to affect system usage (33). Supply variables
have been shown to be statistically significant in predicting such outcomes as
readmissions. Data on bed and physician supply per capita generally are fairly
easy to obtain for different geographic units. By controlling for these factors on
a small-area basis, analyses of readmissions and other utilization can continue
in a statistically sound manner.
Benefits of Treatment
It is difficult to identify benefits of treatment in an administrative data sys-
tem. Estimates of quality of life are very indirect. Changes in the frequency of
diagnoses and hospitalization provide some information, and periods of inter-
vention-free survival following a key event can be calculated. These variables
may be unsatisfactory as a measure of real benefit of the procedure, although
some studies show substantively significant relationships between utilization
and morbidity (32,34).
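A period of intervention-free survival can be computed from the dates of later contacts together with the end of coverage recorded in an enrollment file. The sketch below is a minimal illustration; the function name, arguments, and dates are hypothetical.

```python
from datetime import date

def intervention_free_days(index_date, contacts, coverage_end, study_end):
    """Days from the index event to the first later contact with the health care
    system, or to the end of follow-up if there was none.

    Returns (days, had_contact). Follow-up ends at the earlier of coverage_end
    (death or leaving the plan, from the enrollment file) and study_end.
    """
    follow_up_end = min(coverage_end, study_end)
    later = sorted(d for d in contacts if index_date < d <= follow_up_end)
    if later:
        return (later[0] - index_date).days, True
    return (follow_up_end - index_date).days, False

# Hypothetical person with one contact before and one after the index event.
with_contact = intervention_free_days(
    index_date=date(1980, 1, 1),
    contacts=[date(1979, 12, 1), date(1980, 3, 1)],
    coverage_end=date(1985, 1, 1),
    study_end=date(1984, 12, 31),
)
# Hypothetical person with no contacts whose coverage ended after 30 days.
no_contact = intervention_free_days(
    index_date=date(1980, 1, 1),
    contacts=[],
    coverage_end=date(1980, 1, 31),
    study_end=date(1984, 12, 31),
)
```

Without the enrollment file the second person would look like a long intervention-free survivor; with it, the observation is correctly cut off when coverage ends.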
CONTROVERSIAL AREAS
Risk Adjustment
Risk adjustment poses a major problem in evaluating outcomes across
hospitals and physicians (35). If patients operated upon at Hospital A have higher
mortality and complication rates than patients operated upon at Hospital B, is it
because Hospital A's operating team is less skilled? Or is it because the case-
mix of patients at the two institutions is different, with Hospital A treating high-
er-risk patients?
One issue with significant implications for studies of quality assurance and
cost control is: when can claims data alone be used for these controls and when
is prospective data collection necessary? What controls are good enough for
testing hypotheses about the relationship between surgical volume and treat-
ment outcomes, for distinguishing the better of two treatments, and for identify-
ing hospitals or physicians with particularly poor or especially good outcomes?
The issue of how much additional information is provided per unit of cost is
vital when expensive primary data collection is being considered. Researchers
have assumed that the optimal approach would incorporate primary data collec-
tion, possibly combining clinical judgment with physiologic information and
diagnostic testing (18,19,23). On the other hand, the ability of researchers and
clinicians to predict the morbidity and mortality following medical and surgical
treatment is clearly limited.
Figure 5.1 illustrates our view of the utility of information. The variation
explained is presented on the Y axis, while the X axis measures effort. The pre-
dictive power provided by better algorithms applied to a given data type reaches
a "flat of the curve" situation fairly quickly. Figure 5.1 suggests the greater pre-
dictive power of the first covariates in a multivariate analysis. If primary data
are collected, they may well be among the best predictors (36). But when several
measures are available, they are largely substitutable for each other.
FIGURE 5.1 Analytical effort involved to produce results for different types of data.
[Plot of variation explained (Y axis) against amount of effort involved in predicting
outcomes (X axis). Three curves level off at separate asymptotes, from lowest to
highest: claims data alone; claims and prospective data; claims, prospective data,
and lab tests.] Asymptotes will vary according to conditions and procedures involved.
SOURCE: Roos LL, Sharp SM, Cohen MM, Wajda A. Risk adjustment in claims-based
research: The search for efficient approaches. Journal of Clinical Epidemiology
1989;42:1193-1206.
One promising taxonomy for comorbidity takes into account not only the
number but also the seriousness of comorbid diseases. The comorbidity index
of Charlson et al. (37) explained a higher proportion of the variance in one-year
survival rates than a model based solely on the number of comorbid diseases.
In a test population with a large set of clinical and demographic variables, age
and the comorbidity index were found to be the only significant predictors of
death attributable to comorbid disease. This index has been used in a number of
claims-based studies (2,18,19).
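The difference between counting comorbid diseases and weighting them by seriousness can be sketched as follows. The weights are those of the index published by Charlson et al. (37); the condition labels are shorthand chosen for this sketch, and the step of mapping diagnosis codes on a claim to these conditions is assumed to have been done already and is not shown.

```python
# Weights from the comorbidity index of Charlson et al. (37).
# Condition labels are shorthand for this sketch, not an official vocabulary.
CHARLSON_WEIGHTS = {
    "myocardial infarction": 1,
    "congestive heart failure": 1,
    "peripheral vascular disease": 1,
    "cerebrovascular disease": 1,
    "dementia": 1,
    "chronic pulmonary disease": 1,
    "connective tissue disease": 1,
    "ulcer disease": 1,
    "mild liver disease": 1,
    "diabetes": 1,
    "hemiplegia": 2,
    "moderate or severe renal disease": 2,
    "diabetes with end organ damage": 2,
    "any tumor": 2,
    "leukemia": 2,
    "lymphoma": 2,
    "moderate or severe liver disease": 3,
    "metastatic solid tumor": 6,
    "AIDS": 6,
}


def charlson_index(conditions):
    """Sum the weights of the distinct index conditions a patient has.

    Conditions that are not part of the index contribute zero, and a
    condition recorded more than once is counted only once.
    """
    return sum(CHARLSON_WEIGHTS.get(c, 0) for c in set(conditions))


def comorbidity_count(conditions):
    """Unweighted count of distinct index conditions, for comparison."""
    return sum(1 for c in set(conditions) if c in CHARLSON_WEIGHTS)


if __name__ == "__main__":
    patient_x = ["metastatic solid tumor"]
    patient_y = ["myocardial infarction", "ulcer disease"]
    for name, conds in (("X", patient_x), ("Y", patient_y)):
        print(name, comorbidity_count(conds), charlson_index(conds))
```

Patient Y has more comorbid diseases than patient X, but patient X has the higher weighted index, which is the distinction the taxonomy is meant to capture.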
Computerized hospital admission/separation abstracts can be used to gener-
ate covariates, such as the Charlson comorbidity index, for risk adjustment. In
assessments done in Manitoba, the addition of other sorts of information (claims
from physician visits, health status indices from surveys, and even some
prospectively collected clinical data) generated little additional power in predicting
hospitalization, nursing home entry, and mortality (19,38).
Manitoba Level 3 data (from the surgical event alone) using age, sex, and
limited comorbidity information have provided almost as good risk adjustment
in predicting mortality and post-surgical readmissions as Level 1 data (from the
history of hospitalizations in the preceding six months and the surgical event).
A model using only prognostic data (comorbidity information from the computerized
history preceding surgery) also resulted in fairly good risk adjustment
and similar overall results. Thus, Blumberg's (22) concerns about using infor-
mation from the index hospitalization, rather than prognostic data, do not seem
critical.
Considerable progress in adjusting for risk by chart review has also been
made. Daley et al. (23) have built upon the APACHE II system to develop a
chart-based clinical risk adjustment system, the Medicare Mortality Predictor
System, to predict hospital mortality. However, when researchers using inex-
pensive nonintrusive measures such as claims must decide whether to invest
scarce resources in more data collection, they must evaluate the likely yield of
the additional information (39). It is difficult to find the proper point or points
between "gold standard" technology assessment research that relies on exten-
sive primary data collection and somewhat less accurate but cheaper and more
timely approaches. We need research to compare the power of additional chart
review with claims-based work. Direct comparisons of predictive power and
biases would define whether widespread additional data gathering is cost effec-
tive in risk adjustment.
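One way such a direct comparison of predictive power might be set up is sketched below. It uses the c-statistic (the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor) as the yardstick; the choice of this measure is an assumption of the sketch, and the risk scores and outcomes are invented for illustration.

```python
def c_statistic(scores, outcomes):
    """Concordance between predicted risk scores and binary outcomes.

    Every (event, non-event) pair is examined: a pair counts 1 if the
    event case has the higher score, 0.5 if the scores tie, 0 otherwise.
    """
    events = [s for s, y in zip(scores, outcomes) if y]
    non_events = [s for s, y in zip(scores, outcomes) if not y]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical data: 1 = died, 0 = survived, for eight patients.
OUTCOMES = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical predicted risks from a claims-only model ...
CLAIMS_ONLY = [0.30, 0.20, 0.10, 0.25, 0.10, 0.05, 0.05, 0.02]
# ... and from a model that adds chart-review variables.
CLAIMS_PLUS_CHART = [0.40, 0.30, 0.15, 0.20, 0.10, 0.05, 0.04, 0.02]

if __name__ == "__main__":
    base = c_statistic(CLAIMS_ONLY, OUTCOMES)
    richer = c_statistic(CLAIMS_PLUS_CHART, OUTCOMES)
    print(f"claims only: {base:.3f}")
    print(f"claims plus chart review: {richer:.3f}")
    print(f"gain from chart review: {richer - base:.3f}")
```

The gain in the c-statistic can then be weighed against the cost of the added chart review, which is the cost-effectiveness judgment the text calls for.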
If cross-sectional data can accurately identify patients at different degrees of
risk, large-scale studies of in-hospital mortality following surgery become rela-
tively easy to conduct. The literature comparing outcomes across institutions is
buttressed by research supporting the validity of controls generated by cross-sectional
data (40,41). Claims-based research certainly suggests that useful general
covariates can be produced; different covariates need not necessarily be
generated for each treatment or condition studied (19,36).
Outcome Measures
Some outcome measures require labor-intensive data collection through
patient interviews or hospital records review. On the other hand, administrative
data, such as insurance claims, provide an excellent source for nonintrusive
measures such as readmissions and mortality. Because many data bases are
maintained and updated for administrative purposes, analyses can be done for a
relatively small marginal cost.
Most of our knowledge about variation in outcomes is derived from studies
using nonintrusive measures. Such measures can be particularly valuable in
screening large data bases "to flag events and caregivers with suspect profiles
of performance" (42). Death is easily documented, usually from multiple
sources such as death certificates, hospital reports, and insurance claims.
However, as mortality rates decline, the number of deaths, particularly follow-
ing single procedures or treatments, becomes very small. Thus, the study of
non-fatal events (morbidity) and effects on quality of life has become more
important in recent years. "Intervention-free survival" has been useful for
studying surgical outcomes, and claims data might also be used to measure
remission-free years for chronic diseases. Other nonintrusive measures based
on claims data are important here:
1. Short-term readmission to hospital, within a specified period after
surgery and for post-surgical complications. Building on previous work (43),
panels of specialists, meeting under the auspices of the Health Care Financing
Administration (HCFA), have developed lists of reasons for readmission,
which indicate possible complications after a number of common procedures;
2. Additional surgery after the initial operation;
3. Long-term problems leading to hospital readmission, such as myocardial
infarction and stroke; and
4. Subsequent physician visits with diagnoses indicating continuing prob-
lems.
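The first of these measures, short-term readmission for a post-surgical complication, can be sketched as a simple scan of a patient's later hospital claims. The 30-day window, the diagnosis labels, and the complication list below are placeholders assumed for illustration; an actual study would use the specified period and the reasons-for-readmission lists appropriate to the procedure.

```python
from datetime import date


def short_term_readmissions(surgery_date, admissions, complication_dx,
                            window_days=30):
    """Return later admissions that look like post-surgical complications.

    admissions is a list of (admit_date, diagnosis) pairs. An admission is
    flagged when it begins after the surgery date, falls within
    window_days of it, and carries a diagnosis on the complication list.
    """
    flagged = []
    for admit_date, diagnosis in admissions:
        days_after = (admit_date - surgery_date).days
        if 0 < days_after <= window_days and diagnosis in complication_dx:
            flagged.append((admit_date, diagnosis))
    return flagged


# Hypothetical complication list and claims history for one patient.
COMPLICATION_DX = {"WOUND_INFECTION", "POSTOPERATIVE_HEMORRHAGE"}
SURGERY_DATE = date(1989, 3, 1)
ADMISSIONS = [
    (date(1989, 3, 10), "WOUND_INFECTION"),  # within window, on the list
    (date(1989, 3, 20), "CATARACT"),         # within window, unrelated
    (date(1989, 6, 1), "WOUND_INFECTION"),   # on the list, outside window
]

if __name__ == "__main__":
    print(short_term_readmissions(SURGERY_DATE, ADMISSIONS, COMPLICATION_DX))
```

Only the first admission is flagged; widening the window or the diagnosis list changes the measure, so both choices should be reported with the results.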
Survey measures have been widely used. Their strength is the information
they provide on attitudes, feelings, and tradeoffs; their weakness has been the
cost of data collection (44). Self-perceived health, ability to perform activities
of daily living, and ability to live independently in the community also are
important for assessing health status. Finally, outcome studies focusing on
providers generally emphasize patient satisfaction and physician performance
standards.
EXPANDING DATA BASES
Record Linkage
Record linkage, the combining of separate records of the same individual,
is a powerful new research tool. Linking specialized data bases with multipurpose
claims data presents many research options, greatly increasing the
amount and quality of data on individuals. Such capabilities are important
because, no matter how much is recorded in any data base, specific items
desired for a given study may not be available. Linkage can help make clini-
cians more comfortable with using administrative data; an expanded amount of
information can provide many of the details clinicians associate with the prac-
tice of medicine. Record linkage helps deal with questions like: Does a given
data set have enough detail to support research on efficacy and effectiveness?
Are the data accurate and complete enough, and suitable for the purposes to
which they are put?
Additional information may be contained in other sources which permit
linkage to an existing data base. In particular, administrative data bases often
do not include certain tests or x-rays if they are not billable, and the results of
tests frequently are not included. Information on medical treatment (such as
drugs used) typically is not available, making it difficult to compare medical
and surgical alternatives for treatment of many conditions.
Although linkages involving Medicare claims typically use Social Security
number, record linkage may involve files where these numbers, as well as name
and address, are not available. Record linkage depends on having a sufficient
number of identifiers of adequate power. Some relevant applications of record
linkage are listed; the previously mentioned prostatectomy research used the
first four linkages to help the Manitoba data base reach its potential (2):
1. Linkage of enrollment files or registries with Vital Statistics files to verify
deaths and provide cause-of-death information. Given appropriate confidential-
ity safeguards, both Canadian and American governments cooperate with
requests for death matching. These linkages underlie several longitudinal studies
using Canadian and/or American data (45).
2. Linkage of claims with independently collected data from cancer reg-
istries to provide higher-quality information on the occurrence and date of diag-
nosis of cancer, thereby facilitating better case-mix controls, validity checks,
and the potential for important independent studies (46).
3. Linkage of hospital and Vital Statistics information with preoperative data
collected by one hospital's Anesthesia Quality Assurance Program produced a
very rich data set on preoperative status of patients and operative outcomes
(47). These data can help assess the efficacy of a number of surgical procedures
by providing covariates (particularly the widely used American Society of
Anesthesiologists' Physical Status score) to increase the credibility of claims-
based analyses.
4. Linkage of hospital claims with physician claims to verify fact and date of
surgery. These methods have supported extensive quality checks in Manitoba
and are also being used with American Medicare data.
5. Linkage of survey information and claims to provide a fuller picture of
the relationships among functional status, self-reported health status, and surgical
outcomes (38). In Manitoba, linkage of two surveys of the aged may permit
incorporation of the data into studies of procedures frequently done on the
elderly.
Although the specific linkage keys differ in each example, the expanded files
have supported a diverse set of studies. These types of linkage dramatically
increase the amount and quality of individual-level data. Such an approach
helps connect the perspective of the clinical epidemiologist and that of the
health services researcher. Specialized data bases can be combined as appropri-
ate with multipurpose claims data. Claims and detailed data from other sources
can be put side by side to better understand the strengths and weaknesses of
each.
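Putting two claims sources side by side, as in the fourth linkage listed above (hospital claims checked against physician claims to verify fact and date of surgery), might look like the following sketch. The record layout, the patient identifiers, the procedure labels, and the one-day date tolerance are all assumptions made for illustration.

```python
from datetime import date


def verify_surgery(hospital_claims, physician_claims, tolerance_days=1):
    """Classify each hospital surgical claim against physician claims.

    Both inputs are lists of (patient_id, procedure, service_date).
    A hospital claim is "confirmed" if a physician claim for the same
    patient and procedure falls within tolerance_days of it,
    "date_mismatch" if physician claims exist but none is close in time,
    and "unmatched" if no physician claim for that procedure exists.
    """
    physician_dates = {}
    for patient_id, procedure, service_date in physician_claims:
        physician_dates.setdefault((patient_id, procedure), []).append(
            service_date)

    results = {}
    for patient_id, procedure, surgery_date in hospital_claims:
        dates = physician_dates.get((patient_id, procedure))
        if not dates:
            status = "unmatched"
        elif any(abs((d - surgery_date).days) <= tolerance_days
                 for d in dates):
            status = "confirmed"
        else:
            status = "date_mismatch"
        results[(patient_id, procedure)] = status
    return results


# Hypothetical claims for three patients.
HOSPITAL_CLAIMS = [
    ("P1", "prostatectomy", date(1985, 5, 2)),
    ("P2", "prostatectomy", date(1985, 6, 10)),
    ("P3", "hip_replacement", date(1985, 7, 1)),
]
PHYSICIAN_CLAIMS = [
    ("P1", "prostatectomy", date(1985, 5, 2)),
    ("P2", "prostatectomy", date(1985, 6, 20)),
]

if __name__ == "__main__":
    for key, status in verify_surgery(HOSPITAL_CLAIMS,
                                      PHYSICIAN_CLAIMS).items():
        print(key, status)
```

Cases classified as mismatched or unmatched are natural candidates for the further checking discussed under primary data collection.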
Record linkage is a very valuable capability for researchers using non-experimental
data. The mathematical concepts may be unfamiliar initially, but introductory
texts and user-friendly software facilitate record linkage (45,48). A
considerable amount of literature examines long-term mortality due to particular
occupational health risks and provides examples of linkage studies in another
context (45,49).
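The remark that linkage depends on a sufficient number of identifiers of adequate power can be illustrated with a minimal sketch of probabilistic match weights. Each identifier contributes the base-2 logarithm of a likelihood ratio: m is the assumed probability that the identifier agrees when two records belong to the same person, and u the probability that it agrees by chance when they do not. The m and u values, field names, and example records are hypothetical, and the sketch is a generic illustration of the idea rather than the procedure of any software cited here.

```python
import math

# Hypothetical (m, u) probabilities for each identifier.
FIELDS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.98, 0.02),
    "sex": (0.99, 0.50),
    "postal_code": (0.90, 0.001),
}


def match_weight(rec_a, rec_b, fields=FIELDS):
    """Sum agreement and disagreement weights over the shared identifiers.

    Agreement adds log2(m / u); disagreement adds log2((1 - m) / (1 - u)).
    An identifier missing from either record contributes no evidence.
    """
    total = 0.0
    for name, (m, u) in fields.items():
        a, b = rec_a.get(name), rec_b.get(name)
        if a is None or b is None:
            continue
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


# Hypothetical records.
CLAIM = {"surname": "SMITH", "birth_year": 1920, "sex": "M",
         "postal_code": "X0X 0X0"}
REGISTRY_SAME = dict(CLAIM)
REGISTRY_OTHER = {"surname": "JONES", "birth_year": 1931, "sex": "M",
                  "postal_code": "Y1Y 1Y1"}
SEX_ONLY = {"sex": "M"}

if __name__ == "__main__":
    print(match_weight(CLAIM, REGISTRY_SAME))
    print(match_weight(CLAIM, REGISTRY_OTHER))
    print(match_weight(CLAIM, SEX_ONLY))
```

Agreement on sex alone adds less than one unit of weight, while agreement on a rare value such as a postal code adds nearly ten; this is the sense in which some identifiers have more discriminating power than others.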
Primary Data Collection
What role does primary data collection play in claims-based research? We
can specify cases which need further checking when individual identifiers are
available in administrative data sets. One purpose of primary data collection is
to add detail on diagnosis or procedure. The importance of this added detail
depends on the condition and procedure studied. For example, we may want to
know the number of diseased vessels for research on coronary artery disease.
We need information on laterality for studies of hip fractures, to know
whether a second operation resulted from a complication or was a new procedure.
Primary data collection, particularly chart review, can also be used to con-
firm and buttress results obtained from analysis of administrative data bases.
Such work can increase the clinical credibility of studies based on claims; for
example, Malenka et al. (18) have reviewed Manitoba prostatectomies from one
teaching hospital, generating comorbidity indices by independent chart review.
The results, comparing outcomes of transurethral versus open prostatectomies,
were similar to those produced from claims analyses (2).
Studies whose primary focus is collection of new information may still
depend on claims data to identify patients or providers and to trace outcomes.
Thus, the monitoring of hospital mortality, as done by HCFA, can help select
hospitals for primary data collection. Primary data collection within the hospi-
tal can be facilitated by claims data which identify individuals, by name or
number, whose charts should be pulled (18).
A fruitful way to combine methods is to use administrative data to identify
individuals with a surgical treatment of interest; interviews could then examine
satisfaction, subjective health status, quality of life, and so forth. Not only can
claims data be used to identify specific cases but the linked data set can also
generate information on outcomes (18). Similarly, studies of the appropriateness
of care (50,51) might find it valuable to trace outcomes using enrollment
files and claims data.
Combining administrative data and clinical data bases can compensate for
weaknesses in claims data. For example, a proposed study of angina has isolated
several problems with the claims and suggested ways to deal with these difficulties
(see table below):
Limitations of Claims               Ways to Handle

Difficulty in distinguishing        Linkage between hospital
between stable and unstable         claims and more detailed
angina using coding on              clinical data will permit
hospital claims (discharge          sensitivity testing of the
abstracts).                         importance of the stable
                                    versus unstable distinction.

In-hospital investigations will     Many tests are billable and
not generally appear on             will appear on physician
discharge abstracts.                claims. Chart review may be
                                    necessary to identify the
                                    others.

Information on some risk            This information can be
factors (smoking) and               obtained from clinical data
treatments (medical therapy)        base.
not available.
Several valuable data bases obtained by extensive chart review are available
for exploring what can and cannot be done using Medicare data. The largest
linked Medicare data set seems to be that supplemented with data on Key
Clinical Findings from eight Peer Review Organizations in seven states. As
described elsewhere (52), the data were obtained from the medical record by a
modification of the MedisGroups abstraction technique. Reviewers scan the
record of the hospitalization and encode abnormalities in admission symptoms,
history, the results of preadmission tests if documented in the medical record,
physical examinations (including vital signs), and laboratory and specialized
diagnostic tests. An extensive array of ICD-9-CM diagnostic codes (up to 30)
and procedure codes (up to 36) is also recorded, as are untoward events in the
course of the hospitalization.
DISCUSSION
Administrative data are rich in information that researchers should learn to
use effectively. Research has generated questions about specific issues, such as
the use of claims data to study medical treatments. Other issues are organiza-
tional and technical. Because outcomes research is interdisciplinary, we must
develop ways to facilitate research across centers. Because it takes consider-
able cost and effort to organize administrative data for research purposes, we
also need efficient information management.
Other questions relate to data needs: What constitutes clinically relevant
information on claims data? What auxiliary information should be collected?
Technical questions include: How good are the linkages that tie health care data
from different sources? How should individual records be organized? How
should cleaning and checking be carried out? Current collaborations among a
number of centers and researchers are posing and answering such questions.
REFERENCES
1. Jencks SF, Kay T. Do frail, disabled, poor, and very old Medicare beneficiaries
have higher hospital charges? Journal of the American Medical Association
1987;257:198-202.
2. Roos NP, Wennberg JE, Malenka D, McPherson K, Anderson T, Cohen MM,
Ramsey E. Mortality and reoperation following open and transurethral resection of
the prostate for benign prostatic hypertrophy. New England Journal of Medicine
1989;320:1120-1124.
3. Roper WL, Winkenwerder W, Hackbarth GM, Krakauer H. Effectiveness in health
care: An initiative to evaluate and improve medical practice. New England Journal
of Medicine 1988;319:1197-1202.
4. Roos LL, Roos NP. Using large data bases for research on surgery. In Rutkow IM
(ed). Socioeconomics of Surgery. St. Louis: C.V. Mosby, 1989:259-275.
5. Roos NP, Ramsey E. A population-based study of prostatectomy: Long term out-
comes associated with differing surgical approaches. Journal of Urology
1987;137:1184-1188.
6. Brook RH, Lohr KN. Efficacy, effectiveness, variations, and quality: Boundary-
crossing research. Medical Care 1985;23:710-722.
7. Patricelli RE. Employers as managers of risk, cost, and quality. Health Affairs
1987;6:75-81.
8. Wennberg JE. Improving the medical decision-making process. Health Affairs
1988;7:99-106.
9. Peto R. What treatments for rheumatoid arthritis can best be assessed by large, sim-
ple, long-term trials? British Journal of Rheumatology 1983;22:3~.
10. Wennberg JE, Mulley AG, Hanley D, Timothy RP, Fowler FJ, Roos NP, Barry MJ,
McPherson K, Greenberg ER, Soule D, Bubolz T, Fisher E, Malenka D. An assessment
of prostatectomy for benign urinary tract obstruction: Geographic variations
and the evaluation of medical care outcomes. Journal of the American Medical
Association 1988;259:3027-3030.
11. Abrams HE, Detsky AS, Roos LL, Wajda A. Is there a role for surgery in the acute
management of infective endocarditis? A decision analysis and medical database
approach. Medical Decision Making 1988;8:165-174.
12. Grayhack JT, Sadlowski RW. Results of surgical treatment of benign prostatic hyperplasia.
In Grayhack, Wilson, Scherbenske (eds). Benign Prostatic Hyperplasia,
DHEW Publication No. NIH 76-1113, 1975:125-134. A workshop sponsored by
the Kidney Disease and Urology Program of the National Institute of Arthritis,
Metabolism and Digestive Diseases, National Institutes of Health.
13. Wennberg JE, Roos NP, Sola L, Schori A, Jaffe R. Use of claims data systems to
evaluate health care outcomes: Mortality and reoperation following prostatectomy.
Journal of the American Medical Association 1987;257:933-936.
14. Cohen MM. Long-term risk of hysterectomy after tubal sterilization. American
Journal of Epidemiology 1987;125:410-419.
15. Ray WA, Griffin MR. Use of Medicaid data for pharmacoepidemiology. American
Journal of Epidemiology 1989;129:837-849.
16. Kramer MS, Shapiro SH. Scientific challenges in the application of randomized trials.
Journal of the American Medical Association 1984;252:2739-2745.
17. Hertzman C. Morbidity studies: Are population-based data a useful benchmark for
studying morbidity in special groups? Canadian Journal of Public Health
1988;79:386-387.
18. Malenka DJ, Roos NP, Fisher ES, McLerran DF, Whaley FS, Barry MJ,
Bruskewitz R, Wennberg J. Further study of the increased mortality following
transurethral prostatectomy: A chart-based analysis. Journal of Urology in press.
19. Roos LL, Sharp SM, Cohen MM, Wajda A. Risk adjustment in claims-based
research: The search for efficient approaches. Journal of Clinical Epidemiology
1989;42:1193-1206.
20. Barry MJ, Mulley AG, Fowler FJ, Wennberg JE. Watchful waiting vs. immediate
transurethral resection for symptomatic prostatism: The importance of patients'
preferences. Journal of the American Medical Association 1988;259:3010-3017.
21. Roos NP, Lyttle D. Hip arthroplasty surgery in Manitoba: 1973-1978. Clinical
Orthopaedics 1985;199:248-255.
22. Blumberg MS. Risk adjusting health care outcomes: A methodologic review.
Medical Care Review 1986;43:351-393.
23. Daley J, Jencks S, Draper D, Lenhart G, Thomas N, Walker J. Predicting hospital-associated
mortality for Medicare patients: A method for patients with stroke,
pneumonia, acute myocardial infarction, and congestive heart failure. Journal of
the American Medical Association 1988;260:3617-3624.
24. Cook TD, Campbell DT. Quasi-Experimentation. Chicago: Rand McNally, 1979.
25. Demlo LK, Campbell PM. Improving hospital discharge data: Lessons from the
National Hospital Discharge Survey. Medical Care 1981;19:1030-1040.
26. Hsia DC, Krushat WM, Fagan AB, Tebbutt JA, Kusserow RP. Accuracy of diag-
nostic coding for Medicare patients under the prospective-payment system. New
England Journal of Medicine 1988;318:352-355.
27. Roos LL, Sharp SM, Wajda A. Assessing data quality: A computerized approach.
Social Science and Medicine 1989;28:175-182.
28. Roos LL, Nicol JP, Johnson C, Roos NP. Using administrative data banks for
research and evaluation: A case study. Evaluation Quarterly 1979;3:236-255.
29. Roos LL, Roos NP, Cageorge SM, Nicol JP. How good are the data? Reliability of
one health care data bank. Medical Care 1982;20:266-276.
30. Davis H. Was surgery needed? The Baltimore Sun: April 6, 1986.
31. Roos NP, Flowerdew G, Wajda A, Tate RB. Variations in physicians' hospitalization
practices: A population-based study in Manitoba, Canada. American Journal
of Public Health 1986;76:45-51.
32. Mossey JM, Roos LL. Using insurance claims to measure health status: The illness
scale. Journal of Chronic Diseases (Suppl 1) 1987;40:41S-50S.
33. Roos NP, Wennberg JE, McPherson K. Using diagnosis-related groups for studying
variations in hospital admissions. Health Care Financing Review 1988;9:53-62.
34. Diaz C, Starfield B, Holtzman N, Mellits ED, Hankin J, Smalky K, Benson P. Ill
health and use of medical care: Community-based assessment of morbidity in chil-
dren. Medical Care 1986;24:848-856.
35. Sloan FA, Perrin JM, Valvona J. In-hospital mortality of surgical patients: Is there
an empiric basis for standard setting? Surgery 1986;99:446-453.
36. Flood AB, Scott WR. Hospital Structure and Performance. Baltimore: Johns
Hopkins University Press, 1987.
37. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying
prognostic comorbidity in longitudinal studies: Development and validation.
Journal of Chronic Diseases 1987;40:373-383.
38. Roos NP, Roos LL, Mossey JM, Havens BJ. Using administrative data to predict
important health outcomes: Entry to hospital, nursing home, and death. Medical
Care 1988;26:221-239.
39. Harrell FE, Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of med-
ical tests. Journal of the American Medical Association 1982;247:2543-2546.
40. Showstack JA, Rosenfeld KE, Garnick DW, Luft HS, Schaffarzick RW, Fowles J.
Association of volume with outcome of coronary artery bypass graft surgery:
Scheduled vs. nonscheduled operations. Journal of the American Medical
Association 1987;257:785-789.
41. U.S. Congress, Office of Technology Assessment. The Quality of Medical Care:
Information for Consumers, OTA-H-386. Washington, D.C.: Government Printing
Office, June 1988.
42. Berwick DM. Toward an applied technology for quality measurement in health
care. Medical Decision Making 1988;8:253-258.
43. Roos LL, Cageorge SM, Austen E, Lohr KN. Using computers to identify compli-
cations after surgery. American Journal of Public Health 1985;75:1288-1295.
44. Fowler FJ, Wennberg JE, Timothy RP, Barry MJ, Mulley AG, Hanley D. Symptom
status and the quality of life following prostatectomy. Journal of the American
Medical Association 1988;259:3018-3022.
45. Newcombe HB. Handbook of Record Linkage. New York: Oxford University
Press, 1988.
46. Cohen MM, Hammarstrand KM. Papanicolaou testing without a cytology registry.
American Journal of Epidemiology 1989;129:388-394.
47. Cohen MM, Duncan PG. Physical status score and trends in anesthetic complica-
tions. Journal of Clinical Epidemiology 1988;41:83-90.
48. Wajda A, Roos LL. Simplifying record linkage: Software and strategy. Computers
in Biology and Medicine 1987;17:239-248.
49. Smith ME. Record linkage: Organizing the facts together. In Benneu BM, Trute B
(eds). Mental Health Information Systems: Problems and Prospects. New York:
Edwin Mellen Press, 1984:263-281.
50. Winslow CM, Kosecoff JB, Chassin M, Kanouse DE, Brook RH. The appropriate-
ness of performing coronary artery bypass surgery. Journal of the American
Medical Association 1988;260:505-509.
51. Winslow CM, Solomon DH, Chassin MR, Kosecoff J, Merrick NJ, Brook RH. The
appropriateness of carotid endarterectomy. New England Journal of Medicine
1988;318:722-727.
52. Krakauer H. The use of data abstracted from medical records to assess the effec-
tiveness of medical interventions. Manuscript, 1988.