Distribution of Rewards
The preceding chapter discussed the advantages and disadvantages of several approaches to funding pay for performance within the Medicare program and offered rationales for both short- and long-term strategies. This chapter focuses on how rewards could be distributed to providers. Also discussed are general guidelines for designing an incentive-based system, such as what aspects of care to reward, what measures to reward, and how large payments must be to have the desired effect. Issues dealing with the implementation of such a system are assessed in Chapter 5.
As summarized in Chapter 2, the current fee-for-service payment system has both strengths and weaknesses. Among its weaknesses is that it provides incentives for overuse of services and fails to impose systematic penalties for misuse or underuse of medical care. Embedded in the system, moreover, are incentives to use certain procedures over other, less costly ones that may be equally or more effective. Together these weaknesses create an environment in which the provision of higher-quality care at lower cost is not the standard.
The health care delivery system has evolved over time with better understanding of diseases and the human body and the development of new technologies and procedures. Provision of medical care is significantly different from what it was 40 years ago when Medicare began; yet the fee-for-service payment system has changed little, except for the replacement of cost-based with prospective payments. Attempts to modify the payment system, such as efforts to encourage coordinated or managed care, have met with limited success. Pay for performance is a critical tool that can, if implemented carefully, begin to address the undesirable consequences of the fee-for-service system.
This chapter addresses options for distributing rewards to high-performing providers. First it addresses the question of just what should be rewarded under a pay-for-performance program. Next it examines various design elements, such as how to measure performance and what basis to use for the distribution of rewards. The chapter then looks at how to operationalize these design elements, describing various models for assigning and distributing rewards to providers. There are many nuances involved in the design of any pay-for-performance program. The discussion in this chapter is intended to illustrate the challenges designers will face.
WHAT TO REWARD
Identifying Domains of Care
Pay for performance should provide incentives for delivering higher-quality care to achieve all six aims for health care identified in the Institute of Medicine’s (IOM’s) Quality Chasm report: safety, effectiveness, patient-centeredness, timeliness, efficiency, and equity (IOM, 2001). Many physicians and health care organizations are skeptical that reliable and valid performance measures can be developed for complex clinical processes. They are equally doubtful that payment incentives can be put in place that will reward performance in ways that affect what truly matters: improving the health of patients. Another major challenge facing designers of pay-for-performance programs is guarding against the possibility that efforts to improve one domain of care may adversely affect other domains. For example, many purchasers are concerned that performance measures emphasizing enhanced clinical quality will lead to an unrestrained growth in costs and a minimal effort to reduce current waste and inefficiencies.
Any pay-for-performance program must address these concerns by clarifying the goals and objectives of new payment mechanisms. In considering how to do so, the committee drew on the vision set forth in the Quality Chasm report, in particular the six aims cited above. Current performance measurement approaches are focused heavily on clinical effectiveness. While this domain is crucial to improving the overall quality of health care, an overemphasis on clinical effectiveness risks defining good care too narrowly by failing to consider the perspectives of patients, their families, and society as a whole, as well as limitations in resource availability. Pay for performance should be based on performance measures that are aligned with long-term goals for improving all aspects of quality that foster improved patient outcomes within an environment of limited resources. In its consideration of initial measures that would ensure high quality and improve the value of health care investments, the committee found it convenient to consolidate the six aims of the Quality Chasm report into three broader domains that should serve as the foundation for new payment incentives: clinical quality, patient-centered care, and efficiency.
Recommendation 4: In designing a pay-for-performance program, the Secretary of DHHS should initially reward health care that is of high clinical quality, patient-centered, and efficient.
The committee believes efforts to improve quality should focus initially on these three broad domains, eventually disaggregating the focus to ensure that all six quality aims are adequately addressed. This consolidation into three domains should not be viewed as diminishing the value of the six aims. Rather, this initial approach is intended to streamline the complex task of implementing pay for performance while at the same time ensuring that performance is considered comprehensively.
There are numerous ways a reward pool could be divided among the three domains of clinical quality, patient-centeredness, and efficiency. The following discussion illustrates simplified versions of the options the committee considered.
The total dollars in the reward pool could be split evenly across the domains. For instance, if $3 billion were available to distribute, $1 billion could be allocated for achievements in each domain.
Even distribution would signal that policy makers regarded improved performance in all three domains to be equally important. However, this option may not be advisable if robust and sufficient measures are unavailable for all domains.
A second option would be to distribute the rewards unevenly, emphasizing some domains over others. For example, $1.5 billion could be designated to reward clinical quality, $1 billion patient-centeredness, and $0.5 billion efficiency.
This approach would be appropriate if policy makers deemed improvement in certain domains more important than that in others. This option might also be advisable if the validity and comprehensiveness of measures across domains differed or if larger incentives were found to be necessary to motivate equal improvement in all three domains.
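The two allocation options can be illustrated with a short calculation. The following is a minimal sketch, not a proposed allocation rule: the function name `split_pool` and the use of integer weights are assumptions made for illustration, while the dollar figures are those used in the examples above.

```python
def split_pool(total, weights):
    """Divide a reward pool across domains in proportion to the given weights."""
    weight_sum = sum(weights.values())
    return {domain: total * w / weight_sum for domain, w in weights.items()}

POOL = 3_000_000_000  # the $3 billion pool used in the text's examples

# Option 1: even split, signaling equal importance of the three domains.
even = split_pool(POOL, {
    "clinical quality": 1,
    "patient-centeredness": 1,
    "efficiency": 1,
})

# Option 2: uneven split; weights 3:2:1 reproduce the $1.5B / $1B / $0.5B example.
uneven = split_pool(POOL, {
    "clinical quality": 3,
    "patient-centeredness": 2,
    "efficiency": 1,
})

for domain in even:
    print(f"{domain}: even ${even[domain]:,.0f}, uneven ${uneven[domain]:,.0f}")
```

Expressing the split as weights rather than fixed dollar amounts would let the allocation be adjusted as measures mature without changing the size of the pool.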
Little objective evidence exists to inform a judgment about the appropriate distribution of rewards across the three domains. Nor is there a clear consensus on the relative priorities for improvement in each domain to guide the allocation decision. Practical considerations, namely that there are few well-developed measures available for patient-centeredness and efficiency (see the discussion below), led the committee to conclude that, initially at least, most of the reward pool should be allocated to incentives for improved clinical quality, where applicable. Improvements in each domain would ideally be made with consideration of the others, with the goal of improving all three. The Secretary of the Department of Health and Human Services (DHHS) should decide exactly how much is to be allocated to each domain. The distribution of payment across the domains should be adjusted as the program develops. As measures and public reporting initiatives supporting a pay-for-performance program evolve, policies, and therefore reimbursement levels, will also need to be adjusted.
Improvement and Excellence
Pay-for-performance programs have generally been structured to reward one or both of two possible dimensions: improvement and excellence.
Under a system that rewarded improvement, providers would be eligible for a reward if their performance improved significantly. Improvement is measured in many ways; one approach that is employed currently by the Centers for Medicare and Medicaid Services (CMS) is the percentage reduction in the failure rate, defined as the difference between perfect and actual performance.1 Under this method, even very high-performing providers would have the ability to earn rewards because the higher a provider’s initial performance was, the smaller the absolute increase in performance would have to be to achieve any particular decrease in the failure rate. For example, if the baseline performance measure for hospital A were 80 percent, its failure rate would be 20 percent. If 1 year later hospital A’s performance had improved by 4 percentage points to 84 percent, its failure rate would have fallen from 20 to 16 percent, a reduction of 20 percent. Hospital B, with an initial performance score of 40 percent (and a failure rate of 60 percent), would have to improve its performance by 12 percentage points to 52 percent to achieve the same 20 percent reduction in its failure rate. However, the reduction in failure rate is not a perfect measure. It might take more resources and be more difficult for hospital A to improve performance than for hospital B. And even if it were achievable, performance of 100 percent may not be desirable.
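The arithmetic behind the hospital A and hospital B example can be written out as follows. This is a sketch under stated assumptions: the function names are hypothetical, scores are expressed in percent, and the formula is simply the definition given above (failure rate equals 100 minus performance), not a reproduction of any CMS computation.

```python
def failure_rate_reduction(baseline, current):
    """Fractional reduction in the failure rate, with scores given in percent.

    Failure rate is defined as 100 minus the performance score.
    """
    baseline_failure = 100 - baseline
    if baseline_failure <= 0:
        raise ValueError("baseline performance is already perfect")
    return (current - baseline) / baseline_failure

def score_needed(baseline, target_reduction):
    """Performance score required to cut the failure rate by target_reduction."""
    return baseline + target_reduction * (100 - baseline)

# Hospital A: 80 -> 84 percent; Hospital B: 40 -> 52 percent.
print(failure_rate_reduction(80, 84))
print(failure_rate_reduction(40, 52))

# Percentage-point gain each hospital needs for a 20 percent reduction.
print(score_needed(80, 0.20) - 80)
print(score_needed(40, 0.20) - 40)
```

The same relative reduction requires a gain of 4 percentage points from hospital A but 12 from hospital B, which is why the method leaves rewards within reach of providers that already perform well.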
A system designed to reward excellence would reward only those providers who attained or exceeded a specified threshold of performance. In other words, only those that truly were among the best performers would be rewarded for delivering high-quality care. The threshold could be an aggregate or average of several measures. There could also be several thresholds, all or some of which would have to be met to receive any award.
Analysis and Conclusion
Providing rewards for improvement has the advantage of offering incentives to all providers to improve their performance. A criticism of an approach that rewards only improvement, however, is that some providers with truly excellent performance would receive no rewards, while others who even after significant improvement were performing at a mediocre level would benefit. This situation might persist only for a few years if the required levels of improvement were significant (e.g., an annual 15 percent reduction in the failure rate) because after several years of such sizable improvements, initially poor performers would by default become high performers. Another concern with rewarding only improvement is gaming. For example, it would be necessary to guard against a situation in which a mediocre performer earned a reward for improving significantly in year 1, then let performance slide in year 2, and again obtained a reward in year 3 for significant improvement (much of which had been rewarded in the first year). To prevent this, the reduction in the failure rate could be measured from the provider’s previous highest level of performance.
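The gaming scenario and the proposed safeguard can be compared directly. The sketch below is illustrative only: the four-year score history is hypothetical, the function names are assumptions, and negative changes are simply treated as zero improvement.

```python
def reduction(from_score, to_score):
    """Fractional reduction in failure rate between two scores (in percent)."""
    if from_score >= 100:
        return 0.0
    return (to_score - from_score) / (100 - from_score)

def year_over_year(scores):
    """Improvement measured against the immediately preceding year."""
    return [reduction(prev, cur) for prev, cur in zip(scores, scores[1:])]

def from_previous_best(scores):
    """Improvement measured against the provider's highest earlier score."""
    results = []
    best = scores[0]
    for cur in scores[1:]:
        results.append(max(0.0, reduction(best, cur)))
        best = max(best, cur)
    return results

# Hypothetical provider: improves in year 1, slides in year 2, recovers in year 3.
scores = [50, 65, 50, 65]

print(year_over_year(scores))      # year 3 shows the same gain as year 1
print(from_previous_best(scores))  # year 3 shows no new gain
```

Measured year over year, the provider appears to achieve a 30 percent reduction in its failure rate twice; measured from its previous best, only the first improvement counts.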
If the system rewarded only excellence, providers that were well below the threshold for excellence might conclude that, because they had little chance of reaching the threshold, investments to improve specific areas of quality would not be worth the effort or cost. Another concern with rewarding only excellence is that rewards, at least initially, might be concentrated geographically, which could undercut support for a pay-for-performance initiative. On the other hand, a strength of a system that rewarded excellence would be that it would send a clear signal as to what the nation expects from a high-performing health care system.
Recognizing the advantages and limitations of each approach, the committee concluded that a combined approach should be taken:
Recommendation 5: The Secretary of DHHS should design a pay-for-performance program that initially rewards both providers who improve performance significantly and those who achieve high performance.
Under such a combined approach, providers at all levels would be most likely to find at least one of the two goals within reach. The committee expects that the distribution of the aggregate reward pool between encouraging improvement and rewarding excellence will shift over time. As providers make significant improvements in their performance, the fraction of the pool devoted to rewarding excellence should grow. This shift may occur rather slowly because as the vast majority of providers reach and sustain good performance according to certain measures, new measures focusing on different aspects of quality should be introduced into the system. This shift should be monitored and distribution adjusted in evaluations of any pay-for-performance program (see Chapter 6).
WHAT MEASURES SHOULD BE USED FOR REWARD-BASED PAYMENTS
Identifying Measure Sets for Assessing Performance
How payments are distributed within each of the three domains depends in part on the measures employed. Certain disease conditions or care settings may be preferable starting points because of the availability of reliable measures and the expectation that significant improvements in performance can be achieved.
If rewards are to be based on provider performance, the system must be capable of reliably defining and transparently identifying good care. However, the IOM report Performance Measurement: Accelerating Improvement (IOM, 2006) identified gaps in currently available measures. First, measures for some of the six quality aims set forth in the Quality Chasm report (IOM, 2001) are lacking; current measures focus largely on effectiveness. Second, currently available measures do not adequately reflect health care across the life span; for example, few measures adequately characterize care at the end of life, as compared with the many measures for living with chronic disease. A third limitation is that measures—which are usually categorized into indicators of structure, process, or outcomes—are heavily process-based (Bradley et al., 2006). Structural measures evaluate the physical structures associated with care delivery, process measures assess how care is actually delivered, and outcome measures consider the health of a patient as a result of care received. To better characterize health care delivery, more measures that can capture relationships among structure, process, and outcomes need to be developed (IOM, 2006). Because only a small portion of care delivery is currently being measured, pay for performance necessarily will be limited initially to rewarding specific subsets of care.
Available measures for each of the three domains are far from being adequate to support a comprehensive pay-for-performance program. Clinical quality measures are currently further along than those for patient-centeredness, while efficiency measures are still largely under development. Existing measure sets are organized largely by care setting. Building on the starter set of measures presented in the Performance Measurement report, the committee believes the measures presented in Table 4-1 should be used in the short term for a pay-for-performance program with the exception of the Minimum Data Set, which should not be used in pay for performance to provide incentives for skilled nursing facilities (see Chapter 5). As discussed in the Performance Measurement report, the following criteria were considered in choosing the starter set:
Given that current measures were developed by care setting, rewards will initially have to be distributed by setting until more measures across sites of care and across time are developed. These measures were current as of August 2005 and should be considered the minimum for reporting. Upon implementation of a pay-for-performance program, these measures should be updated to reflect the most up-to-date research.
TABLE 4-1 Recommended Starter Set of Performance Measures
Ambulatory care Quality Alliance (26)
Ambulatory Care Survey
CAHPSb Clinician and Group Survey: getting care quickly, getting needed care, how well providers communicate, health promotion and education, shared decision making, knowledge of medical history, how well office staff communicate
Hospital Quality Alliance (22)
Structural measures (computerized provider order entry, intensive care unit intensivists, evidence-based hospital referrals)
Patient communication with physicians, patient communication with nurses, responsiveness of hospital staff, cleanliness/noise level of physical environment, pain control, communications about medicines, discharge information
Health Plans and Accountable Health Organizations
Health Plan Employer Data and Information Set (HEDIS) (61)
Integrated delivery systems (health maintenance organizations): effectiveness (26), access/availability of care (8), satisfaction with the experience of care (4), health plan stability (2), use of service (15), cost of care, informed health care choices, health plan descriptive information (6)
Preferred provider organizations within Medicare Advantage: selected administrative data and hybrid measures
Ambulatory Care Survey
CAHPS Health Plan Survey: getting care quickly, getting needed care, how well providers communicate, health plan paperwork, health plan customer service
Minimum Data Set (15)
Long-term care (12), short-stay care (3)
Outcome and Assessment Information Set (11)
Ambulation/locomotion (1), transferring (1), toileting (1), pain (1), bathing (2), management of oral medications (1), acute care hospitalization (1), emergent care (1), confusion (1)
End-Stage Renal Disease
National Healthcare Quality Report (5)
Transplant registry and results (2), dialysis effectiveness (2), mortality (1)
Longitudinal measures of outcomes and efficiency
1-year mortality, resource use, and functional status (SF-12) after acute myocardial infarction
aThe committee recommends the aggregation of individual measures to patient-level composites for these areas.
bCAHPS = Consumer Assessment of Healthcare Providers and Systems.
SOURCE: IOM, 2006.
Data are collected using a variety of methods, primary among these being administrative claims and medical chart review. Retrievable electronic data are most frequently collected as administrative claims (commonly referred to as claims data or admin data), which are electronic medical bills submitted by providers to payers. Information derived from these claims includes demographic information (e.g., patient age and gender), type of insurance coverage, and information regarding services received (e.g., cost, type, and place of service; lengths of stay; procedures performed; laboratory results; medications prescribed). In other cases, providers must abstract clinical data from individual medical charts (referred to as chart data) that currently are not retrievable electronically. Chart data include information such as results of diagnostic tests and procedures, medications, and therapeutic procedures.
There are many trade-offs involved in collecting data from the different sources, such as that between the burden of data collection and the value of the data collected. Collection of admin data requires only sorting of electronic data; thus this method is relatively quick and inexpensive. By contrast, collection of chart data requires that a nurse, physician, or some other certified person with medical knowledge go through each medical chart to abstract the data. This is not only time-consuming, but also costly. With regard to the importance of the data collected, admin data frequently do not adequately capture specific clinical information (e.g., whether cholesterol levels were controlled to less than 100 mg/dL) in the absence of electronic laboratory data. The latter can be found in some health plans and medical groups, but must otherwise be obtained through chart review. Judgments about the relative merits of admin versus chart data also need to take into account the frequency of collection and the accuracy and reliability of the data. Both modes of data collection are used; however, both are limited in the amount of information yielded, as well as the resources required to collect the data.
As pay-for-performance rewards can be based only on data that are collected, the data collection must be timely and capture the intermediate and ultimate outcomes of care while not imposing an undue burden on providers. Data systems are increasingly being designed with the capacity to collect more “meaningful” data electronically; this capability will be greatly accelerated by the adoption of health information technologies (see Chapter 5). The committee carefully weighed these considerations when assessing the types of measures on which to base pay for performance, but believes that obtaining meaningful data should not be precluded by the burden of data collection. This issue should be reevaluated as pay for performance and the measures on which it is based develop.
Rewarding by Condition
As noted in Chapter 3, the committee found that Medicare payments are focused on patients with specific common chronic conditions: 70 percent of Medicare inpatient spending is associated with the 32 percent of Medicare patients who have chronic heart failure, coronary artery disease, diabetes, or a combination of these conditions (MedPAC, 2005).2 This finding suggests that initially, with respect to physician services, concentrating pay for performance on conditions (especially these three) would be practical.
There are two further reasons for pursuing a condition-based reward system. First, the measures in the starter set are often organized by condition and specific setting. Second, a provider-based system, the primary alternative, would not be practical because research has shown that providers who are high performers on one condition do not necessarily perform well on others (Jha et al., 2005). For example, a hospital can rank among the best for cardiac care and still provide suboptimal care for pneumonia patients.
It is important to note that the reward pools for various conditions would likely differ because, for example, improvements in one condition may be more difficult to achieve than those in others, the value of improvements may be greater for some conditions than for others, and the differences among providers may be less extreme for some conditions.
Rewarding Composite Measures of Care for Conditions
The committee believes measures should be bundled into composites for a given condition for each provider. The Performance Measurement report (IOM, 2006) proposed that measures for specific conditions be combined into a single composite per patient that would reflect whether the patient had received the minimum level of critical care required to treat a condition. For example, the individual measures of HbA1c management, HbA1c control, blood pressure management, lipid measurement, LDL cholesterol level, and eye exam would be grouped into a single composite for diabetes. To receive any aggregate reward, providers would have to exceed threshold levels for each measure. The Performance Measurement report also proposed that research be carried out to determine how each measure within a composite should be weighted. Measures of performance are often collected and reported by aggregating all the patients a provider treats, not by determining the total amount of care each individual patient received, because data are not currently adequate to characterize performance at the latter level (IOM, 2006).
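The difference between a patient-level composite and provider-level aggregate rates can be sketched as follows. This is an illustration only: the patient records are hypothetical, the threshold values are placeholders rather than recommended levels, and the unweighted all-or-none composite is an assumption, since the weighting of measures within a composite is left to future research.

```python
# Diabetes measures named in the text, written as identifiers.
DIABETES_MEASURES = (
    "hba1c_management", "hba1c_control", "blood_pressure_management",
    "lipid_measurement", "ldl_cholesterol_level", "eye_exam",
)

def make_record(missed=()):
    """Hypothetical patient record: True for each measure that was met."""
    return {m: m not in missed for m in DIABETES_MEASURES}

def patient_composite(record):
    """Patient-level composite: did this patient receive every measured item?"""
    return all(record[m] for m in DIABETES_MEASURES)

def measure_rates(records):
    """Provider-level view: share of patients meeting each measure separately."""
    n = len(records)
    return {m: sum(r[m] for r in records) / n for m in DIABETES_MEASURES}

def eligible(records, thresholds):
    """A provider qualifies only if it exceeds the threshold on every measure."""
    rates = measure_rates(records)
    return all(rates[m] > thresholds[m] for m in DIABETES_MEASURES)

patients = [
    make_record(),
    make_record(),
    make_record(missed=("eye_exam",)),
    make_record(missed=("eye_exam", "hba1c_control")),
]

composite_share = sum(patient_composite(p) for p in patients) / len(patients)
rates = measure_rates(patients)
placeholder_thresholds = {m: 0.6 for m in DIABETES_MEASURES}

print(composite_share)                             # share with complete care
print(rates["eye_exam"], rates["hba1c_control"])   # per-measure rates
print(eligible(patients, placeholder_thresholds))
```

In this example the provider does well on most individual measures, yet only half of its patients received complete care, and a single weak measure (eye exams) is enough to make it ineligible.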
Rewarding by Structure
Another option is to distribute rewards on the basis of structural measures of care. This approach might involve rewarding providers that have in place such structures as clinical care teams, care coordinators, and health information technology systems that are thought to improve the overall safety, timeliness, and efficiency of care. Some structural measures related to information technology, such as computerized provider order entry, intensive care unit intensivists, and evidence-based hospital referrals, are included in the starter set identified in the Performance Measurement report. Other structural measures, such as those for care coordination and nursing staff hours, are still largely under development (IOM, 2006). When ready, structural measures may provide incentives to create the infrastructure necessary to close current gaps in care. Rewarding by structure would be a promising way to foster more comprehensive systematic change in the health care delivery system.
HOW TO DISTRIBUTE REWARDS
Absolute Versus Relative Performance
Improvement and excellence can be rewarded on the basis of meeting predetermined, absolute levels of performance or performing well relative to other providers.
Absolute thresholds would establish minimum levels of performance providers would have to meet to be eligible for a reward. The use of absolute thresholds would allow providers to invest their resources with specific aims, as opposed to striving to achieve what would otherwise be an unknown moving target. A potential drawback of this approach is that if the provider community had an extremely high level of achievement, the reward pool would be distributed among a larger group of providers, reducing the amount of the individual rewards received. However, having a large proportion of providers meet thresholds would not be a negative outcome. If desired, a system could be designed to vary the size of the reward pool with the number of providers who performed well. In fact, this would be necessary if providers were to be certain of their reward for achieving the preset performance goals.
Another option is a tournament-style or relative rewards structure, whereby rewards would be given to groups of providers who achieved a predetermined percentile ranking (e.g., 90th percentile), as opposed to a predetermined rank against measures as would be the case if absolute thresholds were used. Under this method, the best providers compared with their peers would be rewarded, and the situation in which everyone was above average—which can occur with absolute thresholds—could not arise. Because providers would be competing with each other and not attempting to achieve absolute thresholds, some lower performers might think that rewards were unattainable. On the other hand, this method could induce healthy competition among the best providers, potentially resulting in greater improvements than a system that failed to promote cutting-edge competition. Conversely, attempts to improve performance beyond a certain threshold (for example, from 98 percent to 100 percent) might produce diminishing returns and waste resources. Tournament-style rewards could also reward mediocre performance if the overall distribution of performance were low. For example, if the distribution of scores were low, providers in the 90th percentile might actually be delivering good care only 40 percent of the time, as defined by performance measures. The high level of uncertainty as to the amount of the rewards providers might receive could also be a disadvantage of this method because it might make providers hesitant to invest in quality improvement. In addition, a tournament-style reward system would limit the number of providers to whom rewards could be distributed. It could thus be viewed as disadvantageous to providers who believed improvements would be more difficult to make relative to other providers with whom they would be compared.
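The contrast between absolute and tournament-style thresholds can be made concrete with two hypothetical score distributions. In the sketch below, the provider names, scores, the 85 percent absolute threshold, and the function names are all assumptions chosen for illustration; ties in ranking are not handled.

```python
def absolute_winners(scores, threshold):
    """Providers at or above a fixed, preannounced performance level."""
    return {name for name, score in scores.items() if score >= threshold}

def tournament_winners(scores, top_percent):
    """Providers ranked in the top `top_percent` of their comparison group."""
    slots = max(1, len(scores) * top_percent // 100)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:slots])

# Hypothetical comparison groups of ten providers (scores in percent).
low_group = {"A": 40, "B": 35, "C": 30, "D": 28, "E": 25,
             "F": 22, "G": 20, "H": 18, "I": 15, "J": 10}
high_group = {"A": 99, "B": 98, "C": 97, "D": 96, "E": 95,
              "F": 94, "G": 93, "H": 92, "I": 91, "J": 90}

# Low-performing group: the tournament still rewards its best provider,
# which delivers measured care only 40 percent of the time.
print(absolute_winners(low_group, 85), tournament_winners(low_group, 10))

# High-performing group: every provider clears the absolute threshold,
# but the tournament rewards only one of them.
print(len(absolute_winners(high_group, 85)), tournament_winners(high_group, 10))
```

The number of tournament winners is fixed in advance regardless of how well the group performs, whereas the number of absolute-threshold winners, and hence the pool needed to pay them, depends on the results.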
How High to Set Thresholds
There is little evidence that can be used for determining how high or low thresholds should be set in health care pay-for-performance programs to provide the most powerful incentive to improve. Given this lack of evidence, the committee proposes that CMS determine threshold levels. Clearly, thresholds should be set on the basis of clinical evidence and consensus as to what constitutes high-level performance and is reasonable to expect from providers. These levels of performance or percentiles for eligibility for rewards must be set in a timely fashion so providers can plan their quality improvement interventions in advance. Thresholds should also be constantly reviewed and set higher as long as average performance improves and higher levels of achievement are possible.
Graduated or Fixed Rewards
Distribution of rewards could be either graduated or fixed. In a graduated system, rewards would increase based on the amount of improvement or level of performance. With fixed rewards, the same amount would be distributed to all providers performing at or exceeding predetermined thresholds; below those thresholds, no rewards would be given. Either option could be used with both absolute and tournament-style thresholds, as well as with rewards based on both improvement and excellence. Absolute thresholds are used in the discussion here for simplicity; rewards based on both improvement and excellence are discussed for each option.
A graduated system for improvement could, for example, require a minimum 30 percent reduction in failure rate for a reward. To illustrate, Bloomfield Home for the Elderly (a hypothetical site) would receive $30 for improving from its baseline performance of 50 percent to 65 percent and $50 for improving to 85 percent, with scaled rewards in between that need not be linearly related to improvement (see Figure 4-1). If rewards were provided for excellence with a threshold of at least 65 percent, Bloomfield Home would receive $30 for its performance at 65 percent even if its baseline performance were 75 percent.
It is important to note that the maximum reward need not be provided for performance at 100 percent or a reduction in the failure rate of 100 percent. Under some measures, an increase from 90 to 100 percent might have only a marginal impact on health outcomes or might require investments not worth the benefits.
If rewards were fixed, Bloomfield Home would receive the same reward whether its failure rate were reduced by 30 percent (performance rising from 50 to 65 percent) or by 90 percent (performance rising from 50 to 95 percent), as depicted in Figure 4-2. Below the 30 percent minimum reduction, no rewards would be granted.
If fixed rewards were provided for performance, Bloomfield Home would receive $50 whether its performance were 85 or 99 percent. It would receive nothing if its performance were 84.9 percent, as depicted in Figure 4-3.
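The Bloomfield Home figures can be reproduced with two simple reward schedules. This is a sketch under stated assumptions: the text gives only the endpoints of the graduated schedule ($30 at a 30 percent reduction in failure rate and $50 at performance of 85 percent, a 70 percent reduction), so the linear scaling between them and the cap at $50 are assumptions, as are the function names.

```python
def reduction_percent(baseline, current):
    """Percentage reduction in failure rate, with scores given in percent."""
    return 100 * (current - baseline) / (100 - baseline)

def graduated_improvement_reward(baseline, current):
    """Graduated reward: $30 at a 30% reduction, rising to $50 at 70% or more.

    Linear scaling between the endpoints is an assumption; the text notes
    that rewards need not be linearly related to improvement.
    """
    r = reduction_percent(baseline, current)
    if r < 30:
        return 0.0
    if r >= 70:
        return 50.0
    return 30.0 + (r - 30) / (70 - 30) * 20.0

def fixed_reward(value, threshold, amount):
    """Fixed reward: the full amount at or above the threshold, else nothing."""
    return amount if value >= threshold else 0.0

# Graduated rewards for improvement from a 50 percent baseline.
for current in (60, 65, 75, 85):
    print(current, graduated_improvement_reward(50, current))

# Fixed rewards for excellence with an 85 percent threshold and a $50 reward.
for current in (84.9, 85, 99):
    print(current, fixed_reward(current, 85, 50.0))
```

The fixed schedule produces a sharp discontinuity at the threshold (nothing at 84.9 percent, the full reward at 85), while the graduated schedule pays more for each additional gain within its range.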
A combination of fixed and graduated rewards might also be appropriate when linked to clinical conditions associated with the measure. For example, graduated rewards could be used for conditions for which there are continuous clinical benefits associated with additional gains (e.g., reduced postsurgical infection rates). On the other hand, fixed rewards could be used in cases where the additional clinical benefits diminish after a certain threshold is attained (e.g., HbA1c management). In conclusion, the committee found that graduated rewards would offer the advantage of motivating larger numbers of providers while giving the largest rewards to those whose performance was best, and that they might be used more frequently than fixed rewards, if deemed appropriate.
Penalties for Lack of Improvement
To this point, rewards have been discussed on the basis of performing well (excellence) and upgrading performance within a given time period (improvement). What has not been discussed is what happens if a provider not only continues to perform poorly, but in fact allows performance to deteriorate. This issue must be considered in designing a pay-for-performance program.
One option is to penalize providers who exhibit the worst performance or the least effort to improve. The size of the penalties could be determined in a manner similar to that for rewards. The penalty amounts could be returned to the Medicare trust funds, CMS could distribute the penalty money to high-performing providers as additional rewards, or the penalty money could be held in an escrow account that poor-quality providers could earn back upon improved performance, among other options.
It is important to note that the committee recommends funding pay for performance initially out of base payments, so that all Medicare providers would contribute to the reward pool (see Chapter 3 for a discussion of the reward pool and Chapter 5 for a discussion of participation in pay for performance). Therefore, those who consistently performed poorly would pay penalties in addition to the reduction in their base payments.
A system with penalties would create stronger incentives for good, or at least adequate, performance and continued attention to improved performance. However, such a system could generate considerable resistance among providers. Providers who were not confident of their ability to improve might refuse to participate.
An alternative would be not to impose penalties on providers with very poor or deteriorating performance. The argument in favor of this approach is that providers would already be experiencing reductions in their base payments and that this should give them incentive enough to improve. Moreover, seriously deficient providers should probably either be removed from Medicare participation or be required to engage in a quality improvement program managed by their local Quality Improvement Organization.
Definition of Comparison Groups for the Purpose of Determining Rewards for Providers
Pay-for-performance rewards would be distributed within groups of providers. These groups can be defined according to a number of characteristics that are not mutually exclusive, such as procedure or service, setting of care, specialty, and location. Certain procedures and types of care are provided in a variety of settings. For example, minor surgery might be provided in outpatient hospital departments, ambulatory surgical centers, or physician offices; post–acute care might be provided by a skilled nursing facility unit in an acute care hospital, a free-standing skilled nursing facility, or a home health agency. The comparison group for pay for performance could be all those who provide a particular service or all those who provide the service in similar settings or have the same training. While comparisons across all care settings or specialties would appear to be the most equitable approach, comparable measures may not be available for all settings and specialties. Differentiation across settings, however, could prove to be problematic because providers might seek to define the comparison group very narrowly. For example, some might argue that comparisons should be only among hospitals of a certain size (e.g., over versus under 200 beds) or type (e.g., teaching versus nonteaching).
For a program such as Medicare, which operates throughout a nation in which market conditions, practice patterns, and other circumstances vary considerably from one locality to another, the geographic scope of a pay-for-performance program becomes an issue. Central is the question of whether comparisons should be made across the nation or regionally when rewarding excellence. If providers were compared nationally, the system would ensure that high-quality care was defined in a uniform way across the country. Furthermore, national standards would preclude the inequity inherent in rewarding top performers in a region with low average performance while denying rewards to providers with better performance who happen to be located in regions with high overall performance. On the other hand, national comparisons might undercut support for a pay-for-performance program if few providers in certain regions received rewards. Regional comparisons have a weakness of their own: comparisons that focused only on a subset of the country might not adequately address disparities in performance arising from regional variations in practice patterns. If significant regional differences (such as those found in the Dartmouth Atlas project for acute myocardial infarction and chronic disease care) are found to exist in the performance being measured, CMS should consider a blend of regional and national comparison groups. Over time, however, a uniform national comparison should be used.
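One way to picture a blend of regional and national comparison groups is to rank a provider against each group and combine the two ranks with a weight. The sketch below is a hypothetical illustration only: the function names, the use of percentile ranks, and the `national_weight` parameter are assumptions, not a method described by the committee.

```python
from bisect import bisect_right


def percentile_rank(score, peer_scores):
    """Percentage of peers whose score is at or below the given score."""
    ordered = sorted(peer_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


def blended_rank(score, regional_scores, national_scores, national_weight):
    """Weighted mix of national and regional percentile ranks.

    national_weight is a hypothetical policy setting between 0 and 1;
    a value of 1.0 corresponds to a purely national comparison.
    """
    regional = percentile_rank(score, regional_scores)
    national = percentile_rank(score, national_scores)
    return national_weight * national + (1.0 - national_weight) * regional
```

Under these assumptions, raising `national_weight` toward 1.0 over successive years would move the program from a blended comparison to the uniform national comparison described above.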
Models for Distribution
As discussed previously, rewards should be focused on the three domains of performance by setting of care and by condition. The next question for consideration is how the reward pools for improvement and excellence should be allocated among these domains. There are many feasible options for distribution, three of which are discussed below.
For both improvement and excellence, each of the three domains (clinical quality, patient-centeredness, and efficiency) could be given equal weight through a simple point system that assigns the same maximum number of points to each category.
Under this example, a specific dollar amount or share of the reward pool would be attached to each point. Provider payments under this system would, therefore, be based on performing well in any of the three domains. Such a system would provide rewards to providers with inconsistent performance—for example, those delivering clinically high-quality care very inefficiently. Those who had the greatest improvement and highest level of excellence across all domains would receive the largest rewards.
An alternative system might require that a minimum threshold level of excellence be reached in one, several, or all domains for a provider to receive any points in the other domains. For example, a provider might have to score at or above the 50th percentile on clinical quality to receive points for efficiency. This option could ensure that rewards for high-quality care would not be given to those with excessive resource use. Such a system would create a considerably smaller pool of eligible providers, especially if there were thresholds for each domain.
The final option would be to allocate the points in option 1 unequally. This system would reflect practical considerations, such as the fact that measures are less available and robust in some domains than others and that in the early years, one might want to emphasize improvement more than excellence to motivate the most providers possible and counteract pushback from low-performing geographic regions. Over time, the allocation of points could be changed to reflect national health care priorities and views on the most pressing areas for improvement.
Little evidence exists to support one of the above options over another; the committee recognizes that the choice of which system to use involves many value-laden decisions. However, the committee also recognizes that not enough efficiency measures are currently available for this domain to have equal weight with the others, even though efficiency is critical when a payment system is being restructured to reward value. The committee therefore believes option 3 is currently the most viable for an initial pay-for-performance program.
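The three allocation options discussed above can be compared with a small Python sketch. Everything in it is hypothetical: the domain keys, the representation of performance as the fraction of each domain's maximum points earned, and the weights are assumptions chosen for illustration, and the gate is applied to that fraction rather than to a percentile.

```python
DOMAINS = ("clinical_quality", "patient_centeredness", "efficiency")


def total_points(earned, weights, gate_domain=None, gate_minimum=0.0):
    """Sum weighted points across domains.

    earned:  domain -> fraction (0 to 1) of that domain's maximum points
    weights: domain -> maximum points available in that domain
    gate_domain / gate_minimum: if set, points in the other domains are
    counted only when the gate domain reaches the minimum.
    """
    gate_met = gate_domain is None or earned[gate_domain] >= gate_minimum
    total = 0.0
    for domain in DOMAINS:
        if domain != gate_domain and not gate_met:
            continue  # gated out: no points for this domain
        total += weights[domain] * earned[domain]
    return total


# Hypothetical provider: weak clinical quality, strong efficiency.
earned = {"clinical_quality": 0.25, "patient_centeredness": 0.5, "efficiency": 1.0}
equal = {domain: 1.0 for domain in DOMAINS}
unequal = {"clinical_quality": 2.0, "patient_centeredness": 1.0, "efficiency": 0.5}

option_equal = total_points(earned, equal)
option_gated = total_points(earned, equal, "clinical_quality", 0.5)
option_unequal = total_points(earned, unequal)
```

In this sketch the same provider earns points for efficiency under equal weighting, loses them when a clinical quality gate is applied, and sees efficiency count for less when the weights are unequal.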
Distribution Between Parts A and B
As described previously, Medicare has different components—Parts A, B, C, and D—that cover and pay for services delivered by different types of providers. Along with how the reward pools for each part are to be developed, distribution to each part must be considered. Once measures have been developed to enable rewards on the basis of episodes or health outcomes, mechanisms will have to be devised for determining the amounts each component should contribute to the reward pool. Similarly, mechanisms will have to be developed for dividing the rewards among all providers who contributed to the high-quality performance. In certain situations, such as inpatient hospital care, that challenge will have to be addressed upon the introduction of any pay-for-performance program because the performance of the hospital will represent the efforts both of the hospital and of physicians and other professionals who are not employees of the hospital. For specific discussion of each of the above settings, see Appendixes A and E.
Payout per Provider
Given the above design issues, this section describes how a provider might expect to be rewarded. The example in Box 4-1 is just one possible method for rewarding physicians based on the principles articulated in this chapter. It assumes that rewards would be allocated based on (1) provider type (e.g., hospital, physician, home health agency), (2) condition, (3) a blended approach rewarding both improvement and excellence, and (4) use of absolute thresholds.
The example in Box 4-1 includes many of the committee’s views on how rewards should be distributed. It is important to note that this is a simplified example for ease of understanding; the actual design of payout per provider would be much more nuanced, dealing with multiple physicians treating a single patient’s medical condition. The committee believes a pay-for-performance program should not reward providers merely on a per service basis. In the Box 4-1 example, payment is provided per patient per condition. To address the issue of volume (e.g., those physicians treating only 5 diabetics being compared with those treating 20), a standard maximum reward per patient should be paid for each condition. The payment scale described in this example necessarily involves many value-laden decisions. The rate of increase in rewards based on weight does not necessarily have to be linear. The improvement, excellence, and payment scales should all be reassessed periodically and readjusted before each payout to account for changes in performance. This example also focuses only on clinical quality in the ambulatory setting. Similar models could be developed for rewarding measures of both patient-centeredness and efficiency whereby thresholds would have to be met for a provider to receive rewards for improvement and excellence. Payments in each domain of measures would be aggregated to determine the total amount of a provider’s reward. Comparable methods could be developed to reward high performance in other care settings, although each setting has unique characteristics that must be accounted for (see Appendix B). Regardless of how pay for performance is designed, it must be transparent and understood by all stakeholders, especially providers and purchasers.
Box 4-1: Example of How a Physician in an Ambulatory Setting Could Be Rewarded on Clinical Quality
Dr. Roller is an internist with approximately 1500 patients, many of whom are covered by Medicare. He treats his Medicare patients for a variety of conditions, including coronary artery disease, chronic heart failure, diabetes, and pneumonia. In this example, Dr. Roller’s reward is determined through the calculation of composite scores for selected conditions. Points are assigned for both improvement and excellence. These points are then associated with dollar amounts to create Dr. Roller’s total bonus payment.
For his patients with coronary artery disease, Dr. Roller is evaluated by Medicare on how well he performs on the following clinical quality measures: drug therapy for lowering LDL cholesterol, beta-blocker treatment after heart attack, and persistent beta-blocker treatment following myocardial infarction.
These measures ($M_1$ through $M_3$) are used to form a composite score ($C_{CAD}$). One method, among many, for calculating this score averages the percentages for the measures. Thus if Dr. Roller scores 79 percent, 89 percent, and 71 percent on these three measures, respectively, his composite will be:
$$C_{CAD} = \frac{M_1 + M_2 + M_3}{3} = \frac{79 + 89 + 71}{3} \approx 79.67 \text{ percent}$$
In the previous year, Dr. Roller’s composite score for coronary artery disease was 72 percent. His improvement over the previous year is calculated using the reduction in failure rate (RFR), that is, the share of the previous year’s shortfall from 100 percent that has been eliminated:
$$RFR = \frac{C_{CAD} - C_{CAD,\text{previous}}}{100 - C_{CAD,\text{previous}}} = \frac{79.67 - 72}{100 - 72} \approx 27.4 \text{ percent}$$
Rewarding Improvement and Excellence
Dr. Roller’s RFR and composite score are then compared with the thresholds for improvement and excellence, which CMS sets for each condition, and the resulting points can be combined for a total score. For coronary artery disease, the threshold to be eligible for rewards on improvement could be 20 percent, and the threshold for excellence could be 80 percent. Points for improvement could be awarded on a graduated scale, and a similar type of scale could be used for excellence.
Dr. Roller’s points for improvement and excellence are both 0.10. His care of patients with coronary artery disease receives a total of 0.20 points.
Similar calculations are used for Dr. Roller’s patients with diabetes, chronic heart failure, and pneumonia. These composites could be equally weighted and combined to determine the overall reward he receives.
The payment scale should reflect the following considerations:
In this system, the provider is rewarded more for each additional patient he sees with a targeted condition.
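The composite-score and improvement arithmetic in the Box 4-1 example can be checked with a short Python sketch. It assumes that the composite is a simple average of the measure percentages, as the box states, and that the reduction in failure rate is the share of the previous year's shortfall from 100 percent that has been eliminated. The function names are hypothetical, and the point and payment scales are left out because the example does not specify them in full.

```python
def composite_score(measure_rates):
    """Average of the per-measure percentages (0 to 100)."""
    return sum(measure_rates) / len(measure_rates)


def reduction_in_failure_rate(previous, current):
    """Percent of last year's failure rate (100 - previous) eliminated.

    Assumed definition; negative values indicate deteriorating performance.
    """
    if previous >= 100.0:
        raise ValueError("no failure rate left to reduce")
    return 100.0 * (current - previous) / (100.0 - previous)


# Dr. Roller's three coronary artery disease measures from Box 4-1.
cad_composite = composite_score([79.0, 89.0, 71.0])
cad_rfr = reduction_in_failure_rate(72.0, cad_composite)

print(f"composite: {cad_composite:.2f} percent")
print(f"reduction in failure rate: {cad_rfr:.1f} percent")
```

Under these assumptions the composite is about 79.67 percent and the reduction in failure rate is about 27.4 percent, which exceeds the 20 percent improvement threshold used in the example.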
HOW LARGE REWARDS MUST BE
A critical question to be addressed in a pay-for-performance program is how large rewards must be in order to influence provider behavior. There is little evidence regarding the necessary magnitude of rewards (see Chapter 3 for some examples). It is likely that the threshold reward size will vary depending on provider type (e.g., institutional versus individual providers), area of improvement (e.g., conditions, measures, types of improvement interventions), and the percentage of the provider’s revenue that is affected by the performance incentives. At the same time, the funds available for rewards are constrained because they must be found within Medicare, a program with limited resources. Rewards therefore must be large enough to influence provider behavior while remaining within the confines of a strict budget. In addition, it is worth noting that payment incentives will be accompanied by public dissemination of performance data, which may prove to be an even more powerful motivator for improving overall quality.
Pay for performance uses incentives to encourage providers to improve. Therefore, if the potential rewards are not large enough to cover the costs associated with improvement, providers may not believe the investment to be worthwhile. If providers are not able to recoup their investment, they may not support the program and may decide not to participate (see Chapter 5). By contrast, if the size of rewards is set to retain physician buy-in and change provider behaviors, it may offer enough incentive for providers to invest heavily in improving their performance.
The lack of significant evidence for how an optimal national pay-for-performance program should be designed led the committee to consider many options for distributing rewards to providers. If pay for performance is implemented in Medicare, the committee believes certain principles should apply, such as the importance of rewarding multiple domains of care and the need to reward both improvement and excellence. The design characteristics described in this chapter are general examples that can be adapted to fit the needs of the program with respect to its overarching goals, as defined by CMS.
The next chapter discusses several practical issues to be considered when developing and implementing pay for performance. The committee was able to make firm recommendations on some of these issues, whereas for others, the evidence base supports only careful presentation of options. These issues include the following:
The timing of pay for performance and its precursors: what steps need to occur before rewards can be provided on the basis of measures of performance.
The overall timing of implementation: when pay for performance can begin in each care setting.
The nature of participation: what providers will be eligible for pay for performance in Medicare and whether the program should be voluntary or mandatory.
The unit of analysis: to whom rewards will be distributed (i.e., the individual physician, medical groups, hospitals, skilled nursing facilities).
The role of health information technology: how new technologies can influence the implementation of pay for performance.
Statistical issues: sample size, problems surrounding risk adjustment, and precision.
REFERENCES
Bradley EH, Herrin J, Elbel B, McNamara RL, Magid DJ, Nallamothu BK, Wang Y, Normand S-LT, Spertus JA, Krumholz HM. 2006. Hospital quality for acute myocardial infarction: Correlation among process measures and relationship with short-term mortality. Journal of the American Medical Association 296(1):72–78.
IOM (Institute of Medicine). 2001. Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academy Press.
IOM. 2006. Performance Measurement: Accelerating Improvement. Washington, DC: The National Academies Press.
Jha AK, Li Z, Orav EJ, Epstein AM. 2005. Care in U.S. hospitals: The Hospital Quality Alliance Program. New England Journal of Medicine 353(3):265–274.
MedPAC (Medicare Payment Advisory Commission). 2005. MedPAC Data Runs. Washington, DC: MedPAC.