18
Using Patient Reports of Outcomes to Assess Effectiveness of Medical Care
Paul D. Cleary

In this chapter, I provide a brief overview of some of our current and recent work on the development and evaluation of outcomes measures based on patient reports. First, I make some general comments about the way we think about quality and effectiveness and discuss the range of outcomes that we think are important to consider in these types of studies. I also discuss a study recently conducted in six hospitals in California and Boston, including some results from the use of both generic and disease-specific measures in that study. I illustrate several of my points using data from patients with total hip replacements and conclude with some observations that bear on the strengths and weaknesses of these measures for assessing effectiveness and outcomes in health care.

LINKING PROCESS AND OUTCOME

Donabedian has described three approaches to the assessment of the quality of medical care: observation of structure, of process, and of outcomes (1). Most programs for evaluating quality focus primarily on process, and we tend to use terms such as "consensus," "norms," "standards," "criteria," and "appropriateness" when we describe the ways in which we evaluate process. These are all part of our lexicon and are an integral part of the way we think about quality. Donabedian pointed out that we typically evaluate quality on the basis of observations of process, but he also asserted that judgments of quality tend to rest on what is known about the relationships between process and outcome. I would emphasize that point and argue that evaluations of process and outcomes are inextricably linked. Unless we know what the outcome of a
particular process is, we cannot determine whether it represents quality care. The purpose of measuring outcomes is to help establish the relationships between process and outcomes. Little is learned by studying variations in outcomes by themselves, and little is gained by developing better measurement tools in isolation. These types of research activities are undertaken to help develop a practical, valid model of the linkages between process and outcomes so that we can improve quality of care.

EARLY STUDIES OF OUTCOME

Outcomes are so widely considered to be the ultimate indicator of quality that it is surprising how infrequently we analyze them carefully. In the 1830s, a physician named Pierre-Charles-Alexandre Louis started a group in Paris that discussed the use of statistics to examine patterns of medical care. In 1838, a physician from that group named George Norris returned to the United States and looked at 55 cases in which an amputation had been performed. Norris found that 21 of the patients who had had an amputation had died. This was an important finding, and it challenged many people's assumptions about the dangers associated with amputation. In subsequent work, Norris examined how surgical outcomes at the Pennsylvania Hospital compared with those of hospitals in other cities and counties. It is interesting that although we think of the publication of mortality statistics as a very recent phenomenon, they have been published on a hospital-specific basis for 150 years! It is disappointing that we have not improved our methods for assessing the relationships among case mix, process of care, and outcomes, given the long history of work in this area and the importance of the issues.

THE NEED FOR DISEASE-SPECIFIC MEASURES

One of the central issues in outcomes assessment concerns the range of outcomes that should be assessed and whether it is better to use disease-specific or general measures or both. I think that we should definitely include both kinds of measures in outcome batteries. A number of researchers have argued that assessing disease-specific outcomes is not necessary, that if one measures generic outcomes, specific measures will not help explain additional variance. I disagree strongly with that position. If we want to be able to detect differences in outcomes that are related to the process of care, it is usually necessary to include measures of outcomes specific to the condition studied and, in some cases, specific to the process of care being examined.
The domain of general outcomes that we think are important to study includes general health perceptions, disability, activities of daily living, role performance, well-being, fatigue, cognitive functioning, and satisfaction with care. It is not always necessary to measure all of these outcomes. Depending on the application, one may want to measure only one or two dimensions. It is often useful to have measures of all of the areas mentioned above, but to develop a practical system that can be used for policy-relevant research and for quality assessment and assurance programs, it may be necessary to be more selective with respect to the measures used.

SCOPE OF THE STUDY

The specific study I describe here is an investigation of variations in case mix, patterns of care, and outcomes at six hospitals. The investigators besides myself were Barbara McNeil, Sheldon Greenfield, Albert Mulley, Steven Pauker, Steven Schroeder, and Lewis Wexler. The hospitals were not-for-profit, university-affiliated teaching hospitals. Three of the hospitals are in California and three in Boston. The sample was about 3,000 patients receiving treatment for one of six medical or surgical conditions: acute myocardial infarction (AMI); rule-out AMIs; total hip replacements; cholecystectomies; coronary artery bypass graft surgery (CABG); and transurethral prostatectomy (TURP).

During a one-year enrollment period, all eligible patients were sent a letter after discharge explaining the purposes of the research and encouraging them to take part in the study. For each patient who agreed to participate, we obtained from the medical records data on disease severity, comorbid conditions, and the process of care during the index hospitalization. Information about sociodemographic characteristics, health-related quality of life before and after hospitalization, perceived improvement in health status, health care utilization, and satisfaction with care was collected using a self-administered questionnaire mailed after discharge (2). The timing of the follow-up questionnaire was determined by a panel of experts and varied, depending on the condition, from 3 to 12 months. Patients who did not return the original questionnaire within two weeks were sent a second questionnaire, which was followed up with a telephone call reminding them to return the questionnaire. Patients who still did not return a completed questionnaire were interviewed over the telephone, when possible.

A medical record reviewer abstracted information about sociodemographic characteristics, indicators of severity, comorbid conditions, surgical procedure, occurrence of in-hospital complications, and use of services in the hospital (for example, laboratory tests, days in the intensive-care unit, and so on). To record information on comorbid conditions, we used the approach developed
by Greenfield and colleagues (3,4). Another measure used was the physical status classification of the American Society of Anesthesiologists (5). Using the medical record, we coded disease-specific indicators of severity and generic, as well as disease-specific, complications. To synthesize the information on complications, a panel of experts selected a subset of complications that they considered to be "serious." For these complications, we created an index representing a count of the number of these complications experienced by the patient.

PATIENT QUESTIONNAIRE

The outcome questionnaire asked about perceived general health, number of days disabled, use of health services, symptoms related to the hip replacement, current social activities, activities of daily living, well-being, satisfaction with medical care and health, whether patients thought the operation made them feel better, whether their health was better or worse than expected, whether they felt "back to normal," employment, and role functioning, as well as indicators of socioeconomic status such as education and income. The questionnaires also had questions about condition-specific outcomes. For example, the questionnaire sent to total hip replacement patients contained questions about the amount of pain experienced doing a range of activities, degree of limping, and use of walking supports. Finally, the questionnaire also asked about daily activities, limping, use of walking supports, well-being, employment, and role functioning in the month preceding surgery.

The measures of social activities, functioning, and well-being were adapted from the Functional Status Questionnaire (6). The pain scale was also derived from a measure used by Jette and colleagues. The questions about use of walking supports and limping were developed for this study. The questionnaires were designed to be easy to read and answer, and took approximately 30 minutes to complete. The psychometric properties of the generic components of this scale, when used with different groups of surgical patients, have been described elsewhere (2).

We collected billing data primarily to look at process. We extracted comparable computerized information on 101 resource elements or process of care variables at each of the hospitals. I discuss primarily data from the patient questionnaire, but I want to emphasize again that those data are part of a system of quality assessment.

Although many studies have used patient questionnaires to assess outcomes, this study was unusual in a couple of important ways. The first is that these patient questionnaires were administered a substantial period after the hospitalization. The second is that we asked about these variables before as well as after hospitalization. What we wanted to know about was not just outcomes or simply variations in outcomes, but how postdischarge
health status differed from preadmission status and how those changes related to differences in the process of care.

I would like to emphasize that it is possible, in some cases, to get very important information with a few simple questions. Although a single question about general health status may appear to have limited validity to some clinicians, empirical studies have shown that asking such a question can elicit information related to clinical measures of health status. Other types of issues are also easy to assess. For example, one question on the health interview survey was, "During the past month, how many days did illness or injury keep you in bed all or most of the day?" National data on how people respond to this question are available, as are data from a variety of studies in hospitals, clinics, and communities. A simple question of this kind can be extremely informative.

When we talked to orthopedic surgeons and internists, they invariably said that an important outcome for hip replacement patients is pain. They operate for pain, they try to relieve pain, they try to get people back to functioning without pain, and so we included a pain scale on our questionnaire. It is important to note that the scales used in this study are in most cases closely related to existing scales: it is not the content of the scales, but rather the way they are applied, that is different.

Orthopedic surgeons also usually say that one of their primary goals is to enable people to walk without support again. To assess this outcome, we included simple questions: "What type of walking supports do you use now? What kind of limp do you have now?" Again, this is not a long battery, it is not complicated, it is very easy to answer, and it is very easy to administer. Surgeons often do not know what proportion of their patients are still limping a year after surgery or what proportion of their patients are using walking supports, so they often find the responses to such questions very informative.

To assess psychological well-being, we used five items from the Functional Status Questionnaire (6) that are the same as those used in the Medical Outcomes Study. They include questions such as "Have you been a very nervous person?" and "Have you felt calm and peaceful?" To measure role functioning, we ask a series of questions about how the patient is doing at work or at home.

In the six hospitals study, we asked about patient satisfaction using traditional questions such as "How satisfied were you with your hospital stay in general?" I am now conducting a different study, the Picker/Commonwealth Study of Patient-Centered Care, in collaboration with Tom Delbanco at Beth Israel Hospital in Boston, Tom Moloney at the Commonwealth Fund, and a number of other colleagues at Harvard and the Commonwealth Fund. We are collecting information from a national probability sample of about 6,000 patients and 2,000 of their caregivers and asking them very
specific questions about the process of care that we think one would want to know about when evaluating quality and effectiveness. We do not usually think of patient satisfaction as a measure of outcome, but I think after hospitalization we would like one of the outcomes to be an informed, involved, cooperative patient. Thus, in the Picker/Commonwealth Study, we ask a series of questions such as: "Were you involved in the decisions about your care as much as you wanted?" "Were the important side effects of the medicines that you were getting explained to you in a way you could understand?" and so on. We have a sample of 62 hospitals nationally, and I think we will be able to make some very interesting observations about the differences among hospitals. That study is almost completed and the results should be published this fall.

FINDINGS

One of the first things we wanted to know about our outcomes study was whether it is feasible to distribute a questionnaire like ours to patients from multiple institutions. We found that it was a very practical way of collecting information. The questionnaire we used was 30 minutes long; we probably could make it much shorter. Patient acceptance was high. In most surveys there tends to be a reluctance to participate, but in this study there was a great deal of interest. Rather than feeling burdened by the questionnaire, many patients reported that they were pleased that the hospital was checking on how they were doing. We got a response rate of approximately 80 percent. About 10 percent of patients said they did not want to participate in a research study, and about 90 percent of the remaining patients returned a usable questionnaire.

It is important that measures be reliable and valid. Our scales were very reliable, with coefficients ranging from 0.64 to 0.92. For the more established scales, the reliabilities were quite high.
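The response-rate figures reported above can be checked with simple arithmetic. The sketch below is only an illustration: the function name is hypothetical, and the inputs are the rounded percentages quoted in the text, not the study's underlying counts.

```python
def overall_response_rate(refusal_rate: float, return_rate_among_rest: float) -> float:
    """Fraction of all eligible patients who returned a usable questionnaire.

    refusal_rate: share of eligible patients who declined to participate.
    return_rate_among_rest: share of the remaining patients who returned
    a usable questionnaire.
    """
    for rate in (refusal_rate, return_rate_among_rest):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be between 0 and 1")
    return (1.0 - refusal_rate) * return_rate_among_rest


# Rounded figures from the text: about 10 percent declined, and about
# 90 percent of the remaining patients returned a usable questionnaire.
rate = overall_response_rate(0.10, 0.90)
print(f"Overall response rate: {rate:.0%}")
```

With these rounded inputs the product is 0.81, which is consistent with the reported response rate of approximately 80 percent.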
Data on the correlations among measures and the correlations with other health measures indicate that ours are valid measures of health status. One common concern is that these scales may reflect, to a great extent, differences in general psychological well-being: that is, if patients are depressed, they will say they are doing poorly; if they are feeling good, they will say they are doing well. We did not find that to be the case in our study, probably because we focused on questions that are as concrete as possible. Questions about specific activities, such as limping and the use of walking supports, are less likely to be confounded.

Another important feature of our health status measures is their responsiveness to changes in health status. For most of the conditions we studied, there is a ceiling effect; that is, everyone is doing well and the observer cannot see any difference. For total hip replacement patients, however, there was a
dramatic improvement in functioning. One could see that the changes are similar to what a clinician would predict: after hospitalization, patients' basic activities are largely back to normal.

An important question is whether it is necessary to measure different dimensions separately or whether it is possible to use a combined measure. The data demonstrate why I think it is better to measure components separately. If one measures intermediate activities separately from basic activities, one can see a very different pattern emerge. Patients who have had a total hip replacement still show quite dramatic improvements, but there is a slightly different picture for patients who have had CABG surgery. These patients show a very strong and statistically significant improvement in functioning on the intermediate level that we would not have picked up with a basic activities scale.

The data on work performance also show a different pattern. With total hip replacement patients there is a dramatic improvement in performance. That contrasts with the perplexing but fairly consistent clinical finding that CABG patients do have impaired work performance and do not return to work as much as one would expect them to. Among the AMI and rule-out AMI patients, postdischarge functioning is worse than predischarge. Again, a separate scale picks up an important phenomenon that I think would have been obscured in a combined measure. The data on psychological well-being indicate that most patients are doing pretty well, and everyone shows slight improvement.

It is difficult to describe the relative improvement across these scales. I have taken each scale and calculated an improvement score, which is basically how patients are doing before hospitalization minus how they are doing later, divided by the standard deviation of the change.
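A minimal sketch of the improvement score just described follows. It assumes paired before and after scores for each patient, a scale on which higher values mean worse status (so that before minus after is positive when patients improve), and the sample standard deviation of the per-patient changes as the denominator; the text does not specify these details. The function name and the example data are hypothetical.

```python
from statistics import mean, stdev


def improvement_score(before, after):
    """Mean change (before minus after) divided by the SD of the change.

    before, after: paired scores for the same patients, in the same order.
    The sign of the result depends on the direction of the scale.
    """
    if len(before) != len(after):
        raise ValueError("before and after must be paired, same length")
    # Per-patient change, using the before-minus-after convention in the text.
    changes = [b - a for b, a in zip(before, after)]
    if len(changes) < 2:
        raise ValueError("need at least two patients to estimate the SD")
    spread = stdev(changes)  # sample standard deviation (an assumption)
    if spread == 0:
        raise ValueError("all changes are identical; score is undefined")
    return mean(changes) / spread


# Hypothetical limping scores for four patients (higher = more severe limp).
limp_before = [4, 3, 5, 4]
limp_after = [2, 2, 3, 1]
print(round(improvement_score(limp_before, limp_after), 2))
```

Because the score divides by the variability of the change rather than by the scale's own units, it allows scales with different ranges to be compared on a common footing, which is the use made of it here.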
Using this statistic as a gauge of responsiveness, we find that the question about limping gives us the best sense of how people are doing. Use of walking supports is not quite as good. As we would expect, there are big improvements in intermediate and basic activities: among hip replacement patients the basic and intermediate scores show a correlation of .78. This provides more evidence that in the future we might be able to shorten our questionnaire.

The mental health scores did not show much change, and I frequently hear an argument that one should not include social or psychological components that are not directly related to the condition being studied. I would make a plea for not discounting such measures so quickly. First of all, mental health is a very important component of case mix. Another reason is that it may be very important for interpreting good and bad outcomes. For example, we are now engaged in analyzing older and younger patients, and we have found that mental health status is related to perceived health status in both groups and that it may be critical in determining differences.
CONCLUSIONS

The first conclusion from these data is that we have adequate measures for most constructs. We have available a series of comprehensive batteries or instruments. They are brief and can be made briefer. They are acceptable to patients. They meet or exceed our normal standards for reliability. They are valid, and they are responsive to changes in health status.

Outcomes assessment should be an integral part of quality assurance activities because it is not possible to assess fully the quality of processes of care without data on associated outcomes. There has been a substantial amount of research on how to assess case mix, and there are many systems for monitoring the process of care. A fair amount is now known about variations in certain outcomes, such as mortality, and I think the main factor that limits us at this point is a lack of understanding about the linkages among case mix, process of care, and outcomes. I would argue that if we understood these linkages better, cost containment and regulation would again become an administrative inconvenience rather than a threat to the practice of medicine as we know it today, a frequently expressed concern. We have the tools; we have the creativity; and we have the will to address these issues. It is up to us to seize the day.

REFERENCES

1. Donabedian, A. Explorations in Quality Assessment and Monitoring, Volume I. The Definition of Quality and Approaches to Its Assessment. Ann Arbor, MI: Health Administration Press, 1980.
2. Cleary, P.D., Greenfield, S., and McNeil, B.J. Assessing Quality of Life After Surgery. Controlled Clinical Trials, in press.
3. Greenfield, S., Blanco, D.M., Elashoff, R.M., et al. Patterns of Care Related to Age of Breast Cancer Patients. Journal of the American Medical Association 257:2766-2770, 1987.
4. Greenfield, S., Aronow, H.U., Elashoff, R.M., et al. Flaws in Mortality Data: The Hazards of Ignoring Comorbid Disease. Journal of the American Medical Association 260:2253-2255, 1988.
5. Owens, W.D., Felts, J.A., and Spitznagel, E.L., Jr. ASA Physical Status Classifications: A Study of Consistency of Ratings. Anesthesiology 49:239-243, 1978.
6. Jette, A.M., Davies, A.R., Cleary, P.D., et al. The Functional Status Questionnaire: Reliability and Validity When Used in Primary Care. Journal of General Internal Medicine 1:143-149, 1986.