18
Using Patient Reports of
Outcomes to Assess
Effectiveness of Medical Care
Paul D. Cleary
In this chapter, I provide a brief overview of some of our current and
recent work on the development and evaluation of outcomes measures based
on patient reports. First, I make some general comments about the way we
think about quality and effectiveness and discuss the range of outcomes that
we think are important to consider in these types of studies. I also discuss a
study recently conducted in six hospitals in California and Boston, including
some results from the use of both generic and disease-specific measures
in that study. I illustrate several of my points using data from patients with
total hip replacements and conclude with some observations that bear on the
strengths and weaknesses of these measures for assessing effectiveness and
outcomes in health care.
LINKING PROCESS AND OUTCOME
Donabedian has described three approaches to the assessment of the quality
of medical care: observation of structure, of process, and of outcomes (1).
Most programs for evaluating quality focus primarily on process, and we
tend to use terms such as "consensus," "norms," "standards," "criteria," and
"appropriateness" when we describe the ways in which we evaluate process.
These are all part of our lexicon and are an integral part of the way we think
about quality.
Donabedian pointed out that we typically evaluate quality on the basis of
observations of process, but he also asserted that judgments of quality tend
to rest on what is known about the relationships between process and outcome.
I would emphasize that point and argue that evaluations of process and
outcomes are inextricably linked. Unless we know what the outcome of a
DEVELOPMENT AND USE OF OUTCOMES MEASURES
particular process is, we cannot determine whether it represents quality
care.
The purpose of measuring outcomes is to help establish the relationships
between process and outcomes. Little is learned by studying variations in
outcomes by themselves, and little is gained by developing better
measurement tools in isolation. These types of research activities are undertaken to
help develop a practical, valid model of the linkages between process and
outcomes so that we can improve quality of care.
EARLY STUDIES OF OUTCOME
Outcomes are so widely considered to be the ultimate indicator of quality
that it is surprising how infrequently we analyze them carefully. In the
1830s, a physician named Pierre-Charles-Alexandre Louis started a group
in Paris that discussed the use of statistics to examine patterns of medical
care. In 1838, a physician from that group named George Norris returned
to the United States and looked at 55 cases in which an amputation had
been performed. Norris found that 21 of the patients who had had an
amputation had died. This was an important finding, and it challenged
many people's assumptions about the dangers associated with amputation.
In subsequent work, Norris compared surgery outcomes at the
Pennsylvania Hospital with those of hospitals in other cities and
counties. It is interesting that although we think of the publication of
mortality statistics as a very recent phenomenon, they have been published
on a hospital-specific basis for 150 years! It is disappointing that we have
not improved our methods for assessing the relationships among case mix,
process of care, and outcomes, given the long history of work in this area
and the importance of the issues.
THE NEED FOR DISEASE-SPECIFIC MEASURES
One of the central issues in outcomes assessment concerns the range of
outcomes that should be assessed and whether it is better to use
disease-specific or general measures or both. I think that we should definitely
include both measures in outcome batteries. A number of researchers have
argued that assessing disease-specific outcomes is not necessary, that if one
measures generic outcomes, specific measures will not help explain
additional variance. I disagree strongly with that position. If we want to be
able to detect differences in outcomes that are related to the process of care,
it is usually necessary to include measures of outcomes specific to the
condition studied and, in some cases, specific to the process of care being
examined.
EFFECTIVENESS AND OUTCOMES IN HEALTH CARE
The domain of general outcomes that we think are important to study
includes general health perceptions, disability, activities of daily living, role
performance, well-being, fatigue, cognitive functioning, and satisfaction with
care. It is not always necessary to measure all of these outcomes. Depending
on the application, one may want to measure only one or two dimensions.
It is often useful to have measures of all of the areas mentioned above, but
to develop a practical system that can be used for policy-relevant
research and for quality assessment and assurance programs, it may be
necessary to be more selective with respect to the measures used.
SCOPE OF THE STUDY
The specific study I describe here is an investigation of variations in case
mix, patterns of care, and outcomes at six hospitals. The investigators
besides myself were Barbara McNeil, Sheldon Greenfield, Albert Mulley,
Steven Pauker, Steven Schroeder, and Lewis Wexler. The hospitals were
not-for-profit, university-affiliated teaching hospitals. Three of the hospitals
are in California and three in Boston. The sample was about 3,000 patients
receiving treatment for one of six medical or surgical conditions: acute
myocardial infarction (AMI); rule-out AMIs; total hip replacements;
cholecystectomies; coronary artery bypass graft surgery (CABG); and transurethral
prostatectomy (TURP).
During a one-year enrollment period, all eligible patients were sent a
letter after discharge explaining the purposes of the research and
encouraging them to take part in the study. For each patient who agreed to participate,
we obtained from the medical records data on disease severity, comorbid
conditions, and the process of care during the index hospitalization. Information
about sociodemographic characteristics, health-related quality of life before
and after hospitalization, perceived improvement in health status, health
care utilization, and satisfaction with care was collected using a
self-administered questionnaire mailed after discharge (2). The timing of the
follow-up questionnaire was determined by a panel of experts and varied,
depending on the condition, from 3 to 12 months. Patients who did not
return the original questionnaire within two weeks were sent a second
questionnaire, which was followed up with a telephone call reminding them to
return the questionnaire. Patients who still did not return a completed
questionnaire were interviewed over the telephone, when possible.
A medical record reviewer abstracted information about sociodemographic
characteristics, indicators of severity, comorbid conditions, surgical procedure,
occurrence of in-hospital complications, and use of services in the hospital
(for example, laboratory tests, days in the intensive-care unit, and so on).
To record information on comorbid conditions, we used the approach developed
by Greenfield and colleagues (3,4). Another measure used was the physical
status classification of the American Society of Anesthesiologists (5).
Using the medical record, we coded disease-specific indicators of severity and
generic, as well as disease-specific, complications. To synthesize the
information on complications, a panel of experts selected a subset of complications
that they considered to be "serious." For these complications, we created
an index representing a count of the number of these complications
experienced by the patient.
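The complication index described above is a simple count, and it can be sketched in a few lines of Python. This is only an illustration: the complication names below are hypothetical placeholders (the panel's actual list of "serious" complications is not given in this chapter), and counting each distinct complication once is an assumption.

```python
# Hypothetical placeholder codes; NOT the study's actual list of
# "serious" complications, which is not reproduced in the chapter.
SERIOUS_COMPLICATIONS = {"pulmonary_embolism", "wound_infection", "dislocation"}


def serious_complication_index(patient_complications):
    """Count the distinct serious complications recorded for one patient.

    Assumption: each serious complication is counted once, however many
    times it appears in the abstracted record.
    """
    return len(SERIOUS_COMPLICATIONS & set(patient_complications))


# Illustrative record: two of the three listed items are flagged as serious.
example_record = ["wound_infection", "nausea", "dislocation"]
index_value = serious_complication_index(example_record)
```

A patient with no recorded complications, or with only complications outside the panel's list, would receive an index of zero.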
PATIENT QUESTIONNAIRE
The outcome questionnaire asked about perceived general health, number
of days disabled, use of health services, symptoms related to the hip replacement,
current social activities, activities of daily living, well-being, satisfaction
with medical care and health, whether patients thought the operation made
them feel better, whether their health was better or worse than expected,
whether they felt "back to normal," employment, and role functioning, as
well as indicators of socioeconomic status such as education and income.
The questionnaires also had questions about condition-specific outcomes.
For example, the questionnaire sent to total hip replacement patients contained
questions about the amount of pain experienced doing a range of activities,
degree of limping, and use of walking supports. Finally, the questionnaire
also asked about daily activities, limping, use of walking supports, well-
being, employment, and role functioning in the month preceding surgery.
The measures of social activities, functioning, and well-being were adapted
from the Functional Status Questionnaire (6). The pain scale was also
derived from a measure used by Jette and colleagues. The questions about
use of walking supports and limping were developed for this study. The
questionnaires were designed to be easy to read and answer, and took
approximately 30 minutes to complete. The psychometric properties of the generic
components of this scale, when used with different groups of surgical patients,
have been described elsewhere (2).
We collected billing data primarily to look at process. We extracted
comparable computerized information on 101 resource elements or process
of care variables at each of the hospitals. I discuss primarily data from the
patient questionnaire, but I want to emphasize again that those data are part
of a system of quality assessment.
Although many studies have used patient questionnaires to assess
outcomes, this study was unusual in a couple of important ways. The first is
that these patient questionnaires were administered a substantial period after
the hospitalization. The second is that we asked about these variables
before as well as after hospitalization. What we wanted to know about was
not just outcomes or simply variations in outcomes, but how postdischarge
health status differed from preadmission status and how those changes
related to differences in the process of care.
I would like to emphasize that it is possible, in some cases, to get very
important information with a few simple questions. Although a single
question about general health status may appear to have limited validity to some
clinicians, empirical studies have shown that asking such a question can
elicit information related to clinical measures of health status.
Other types of issues are also easy to assess. For example, one question
on the health interview survey was, "During the past month, how many days
did illness or injury keep you in bed all or most of the day?" National data
on how people respond to this question are available, as are data from a
variety of studies in hospitals, clinics, and communities. A simple question
of this kind can be extremely informative.
When we talked to orthopedic surgeons and internists, they invariably
said that an important outcome for hip replacement patients is pain. They
operate for pain, they try to relieve pain, they try to get people back to
functioning without pain, and so we included a pain scale on our questionnaire.
It is important to note that the scales used in this study are in most cases
closely related to existing scales: it is not the content of the scales, but
rather the way they are applied, that is different.
Orthopedic surgeons also usually say that one of their primary goals is to
enable people to walk without support again. To assess this outcome, we
included simple questions: "What type of walking supports do you use
now? What kind of limp do you have now?" Again, this is not a long
battery, it is not complicated, it is very easy to answer, and it is very easy to
administer. Surgeons often do not know what proportion of their patients
are still limping a year after surgery or what proportion of their patients are
using walking supports, so they often find the responses to such questions
very informative.
To assess psychological well-being, we used five items from the Functional
Status Questionnaire (6) that are the same as those used in the Medical
Outcomes Study. They include questions such as "Have you been a very
nervous person?" and "Have you felt calm and peaceful?" To measure role
functioning, we ask a series of questions about how the patient is doing at
work or at home.
In the six hospitals study, we asked about patient satisfaction using
traditional questions such as "How satisfied were you with your hospital stay in
general?" I am now conducting a different study, the Picker/Commonwealth
Study of Patient-Centered Care, in collaboration with Tom Delbanco at
Beth Israel Hospital in Boston, Tom Moloney at the Commonwealth Fund,
and a number of other colleagues at Harvard and the Commonwealth Fund.
We are collecting information from a national probability sample of about
6,000 patients and 2,000 of their caregivers and asking them very
specific questions about the process of care that we think one would want to
know about when evaluating quality and effectiveness.
We do not usually think of patient satisfaction as a measure of outcome,
but I think after hospitalization we would like one of the outcomes to be an
informed, involved, cooperative patient. Thus, in the Picker/Commonwealth
Study, we ask a series of questions such as: "Were you involved in the
decisions about your care as much as you wanted?" "Were the important
side effects of the medicines that you were getting explained to you in a
way you could understand?" and so on. We have a sample of 62 hospitals
nationally, and I think we will be able to make some very interesting observations
about the differences among hospitals. That study is almost completed and
the results should be published this fall.
FINDINGS
One of the first things we wanted to know about our outcomes study was
whether it is feasible to distribute a questionnaire like ours to patients from
multiple institutions. We found that it was a very practical way of
collecting information. The questionnaire we used was 30 minutes long; we probably
could make it much shorter. Patient acceptance was high. In most surveys
there tends to be a reluctance to participate, but in this study there was a
great deal of interest in the study. Rather than feeling burdened by the
questionnaire, many patients reported that they were pleased that the hospital
was checking on how they were doing. We got a response rate of
approximately 80 percent. About 10 percent of patients said they did not want to
participate in a research study, and about 90 percent of the remaining patients
returned a usable questionnaire.
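The overall figure follows from the two proportions just quoted. A minimal arithmetic sketch, using the rounded percentages from the text rather than the study's exact counts (which are not given here):

```python
def overall_response_rate(refusal_rate, return_rate_among_consenting):
    """Share of all eligible patients who returned a usable questionnaire."""
    return (1.0 - refusal_rate) * return_rate_among_consenting


# Rounded inputs taken from the text: about 10 percent declined, and about
# 90 percent of the remaining patients returned a usable questionnaire.
rate = overall_response_rate(0.10, 0.90)
print(f"{rate:.0%}")  # 81%
```

The product, 0.90 x 0.90 = 0.81, is consistent with the reported response rate of approximately 80 percent.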
It is important that measures be reliable and valid. Our scales were very
reliable, with coefficients ranging from 0.64 to 0.92. For the more established
scales, the reliabilities were quite high. Data on the correlations among
measures and the correlations with other health measures indicate that ours
are valid measures of health status. One common concern is that these
scales may reflect, to a great extent, differences in general psychological
well-being: that is, if patients are depressed, they will say they are doing
poorly; if they are feeling good, they will say they are doing well. We did
not find that to be the case in our study, probably because we focus on
questions that are as concrete as possible. Questions about specific activities,
such as limping and the use of walking supports, are less likely to be
confounded.
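The chapter does not say which reliability coefficient was computed. As an illustration only, the sketch below assumes an internal-consistency coefficient of the Cronbach's alpha type, which is commonly reported for multi-item scales; the function name and the item responses are made up for the example.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a multi-item scale.

    item_scores: one list per respondent, each holding that respondent's
    scores on the k items of the scale (no missing values).
    """
    k = len(item_scores[0])
    # Transpose so that each tuple holds all respondents' scores on one item.
    items = list(zip(*item_scores))
    sum_of_item_variances = sum(pvariance(col) for col in items)
    variance_of_totals = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1.0 - sum_of_item_variances / variance_of_totals)


# Made-up responses from five respondents to a three-item scale.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
```

Values closer to 1 indicate that the items vary together; the same variance estimator is used for the items and for the totals so that the ratio inside the formula is consistent.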
Another important feature of our health status measures is their responsiveness
to changes in health status. For most of the conditions we studied, there is
a ceiling effect, that is, everyone is doing well and the observer cannot see
any difference. For total hip replacement patients, however, there was a
dramatic improvement in functioning. One could see that the changes are
similar to what a clinician would predict: after hospitalization, patients'
basic activities are largely back to normal.
An important question is whether it is necessary to measure different
dimensions separately or whether it is possible to use a combined measure.
The data demonstrate why I think it is better to measure components separately.
If one measures intermediate activities separately from basic activities, one
can see a very different pattern emerge. Patients who have had a total hip
replacement still show quite dramatic improvements, but there is a slightly
different picture for patients who have had CABG surgery. These patients
show a very strong and statistically significant improvement in functioning
on the intermediate level that we would not have picked up with a basic
activities scale.
The data on work performance also show a different pattern. With total
hip replacement patients there is a dramatic improvement in performance.
That contrasts with the perplexing but fairly consistent clinical finding that
CABG patients do have impaired work performance and do not return to
work as much as one would expect them to.
Among the AMI and rule-out AMI patients, postdischarge functioning is
worse than preadmission functioning. Again, a separate scale picks up an important
phenomenon that I think would have been obscured in a combined measure.
The data on psychological well-being indicate that most patients are doing
pretty well, and everyone shows slight improvement.
It is difficult to describe the relative improvement across these scales. I
have taken each scale and calculated an improvement score, which is basically
how they are doing before hospitalization minus how they are doing later,
divided by the standard deviation of the change. Using this statistic as a
gauge of responsiveness, we find that the question about limping gives us
the best sense of how people are doing. Use of walking supports is not
quite as good. As we would expect, there are big improvements in intermediate
and basic activities: among hip replacement patients the basic and intermediate
scores show a .78 correlation. This provides more evidence that in the
future we might be able to shorten our questionnaire.
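The improvement score described above (the before-minus-after difference divided by the standard deviation of the change) can be sketched as follows. This is an illustration under stated assumptions: the function name and data are hypothetical, the sample standard deviation (n - 1 denominator) is assumed, and the scale is assumed to be scored so that higher values mean more impairment, which makes a positive score an improvement.

```python
from statistics import mean, stdev


def improvement_score(before, after):
    """Mean (before - after) change divided by the SD of the individual changes.

    before, after: paired scores for the same patients, in the same order.
    Assumption: higher scores mean more impairment, so a positive result
    indicates improvement after hospitalization.
    """
    changes = [b - a for b, a in zip(before, after)]
    return mean(changes) / stdev(changes)


# Made-up limping severity scores for four patients (higher = worse limp).
limp_before = [4, 3, 4, 2]
limp_after = [1, 1, 2, 1]
score = improvement_score(limp_before, limp_after)
```

Because the change is expressed in units of its own standard deviation, scores computed this way can be compared across scales with different ranges, which is what allows items such as limping and use of walking supports to be ranked by responsiveness.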
The mental health scores did not show much change, and I frequently
hear an argument that one should not include social or psychological components
that are not directly related to the condition being studied. I would make a
plea for not discounting such measures so quickly. First of all, mental
health is a very, very important component of case mix. Another reason is
that it may be very important for interpreting good and bad outcomes. For
example, we are now engaged in analyzing older and younger patients, and
we have found that mental health status is related to perceived health status
in both groups and that it may be critical in determining differences.
CONCLUSIONS
The first conclusion from these data is that we have adequate measures
for most constructs. We have available a series of comprehensive batteries
or instruments. They are brief and can be made briefer. They are acceptable
to patients. They meet or exceed our normal standards for reliability. They
are valid, and they are responsive to changes in health status.
Outcomes assessment should be an integral part of quality assurance
activities because it is not possible to assess fully the quality of processes
of care without data on associated outcomes. There has been a substantial
amount of research on how to assess case mix, and there are many systems
for monitoring the process of care. A fair amount is now known about
variations in certain outcomes, such as mortality, and I think the main
factor that limits us at this point is a lack of understanding about the linkages
among case mix, process of care, and outcomes.
I would argue that if we understood these linkages better, cost containment
and regulation would again become an administrative inconvenience rather
than a threat to the practice of medicine as we know it today, a frequently
expressed concern. We have the tools; we have the creativity; and we have
the will to address these issues. It is up to us to seize the day.
REFERENCES
1. Donabedian, A. Explorations in Quality Assessment and Monitoring, Volume
I. The Definition of Quality and Approaches to Its Assessment. Ann Arbor, MI: Health
Administration Press, 1980.
2. Cleary, P.D., Greenfield, S., and McNeil, B.J. Assessing Quality of Life
After Surgery. Controlled Clinical Trials, in press.
3. Greenfield, S., Blanco, D.M., Elashoff, R.M., et al. Patterns of Care Related
to Age of Breast Cancer Patients. Journal of the American Medical Association
257:2766-2770, 1987.
4. Greenfield, S., Aronow, H.U., Elashoff, R.M., et al. Flaws in Mortality Data:
The Hazards of Ignoring Comorbid Disease. Journal of the American Medical
Association 260:2253-2255, 1988.
5. Owens, W.D., Felts, J.A., and Spitznagel, E.L., Jr. ASA Physical Status
Classifications: A Study of Consistency of Ratings. Anesthesiology 49:239-243, 1978.
6. Jette, A.M., Davies, A.R., Cleary, P.D., et al. The Functional Status Questionnaire:
Reliability and Validity When Used in Primary Care. Journal of General Internal Medicine
1:143-149, 1986.