Appendix B
Methods for the Analysis of Associations of
Quality Measures with Payments in Chapter 3
DATA AND VARIABLES
The analyses reported in Chapter 3 used data from two sources to generate the 16 measures listed in Table 3-2 of the body of the report. Medicare Consumer Assessment of Healthcare Providers and Systems (CAHPS) data were drawn from the fee-for-service (traditional Medicare) arm of the 2010 Medicare CAHPS survey. The sample of 275,000 noninstitutionalized beneficiaries with at least 6 months of continuous enrollment in traditional Medicare was stratified by state. Sampled beneficiaries were sent a prenotice letter followed by a mail survey, and initial nonrespondents were followed up with a replacement survey and up to 20 telephone contact attempts, with a final response rate of 58 percent. The content, goals, and use of this survey have been described elsewhere (Goldstein et al., 2001).
CAHPS measures in our analyses included 10 individual items from this survey and one composite of four highly correlated items on doctor communication. These are listed in Table 3-2, ordered as access/timeliness measures, experiences with care, and clinical quality. All items with more than two response options (including every item in the composite) are top-coded, so that the mean represents the fraction selecting the most favorable response option; the exception is “seen within 15 minutes,” which combines the top two categories owing to its lower distribution of responses. Results are presented as percentages. Most of these items are among those used in public reporting of CAHPS data. The number of usable responses varies across items because screeners and skip patterns ensure that only beneficiaries who had relevant experience during the 6-month reference period of the survey answered each item: access items were screened by items asking whether a service was ever needed during that period, and quality items by screeners for having used the service in question.
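The top-coding rule described above can be illustrated with a short sketch. It assumes responses are stored as integers from 1 (least favorable) to 4 (most favorable), with None for items skipped by a screener; the function name `top_box_percent`, the 4-point scale, and the sample responses are hypothetical and are not taken from the survey files.

```python
from typing import Optional, Sequence


def top_box_percent(responses: Sequence[Optional[int]],
                    scale_max: int = 4,
                    n_top: int = 1) -> float:
    """Percentage of usable responses falling in the top n_top categories."""
    # Skipped items (None) are excluded, mirroring the screener/skip logic.
    usable = [r for r in responses if r is not None]
    if not usable:
        raise ValueError("no usable responses")
    cutoff = scale_max - n_top  # responses strictly above this count as "top"
    top = sum(1 for r in usable if r > cutoff)
    return 100.0 * top / len(usable)


# Hypothetical responses for one item; None marks a skipped item.
sample = [4, 4, 3, 2, None, 1]
print(top_box_percent(sample))           # most favorable category only
print(top_box_percent(sample, n_top=2))  # top two categories combined
```

Setting `n_top=2` corresponds to the treatment described for “seen within 15 minutes”; every other item would use the default of a single top category.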
In addition to these measures, seven CAHPS items were used as adjusters for individual characteristics that are typically associated with measures, consistent with models used in comparative public reporting of CAHPS data (Zaslavsky et al., 2001). These were self-reported education, general and mental health status, and assistance by a proxy in completing the questionnaire, as well as age, Medicaid dual eligibility, and low-income supplement eligibility from Centers for Medicare & Medicaid Services (CMS) databases. These categorical variables contributed 22 dichotomous variables to the models.
Five additional measures assessed provision of guideline-recommended clinical care, following specifications of the Healthcare Effectiveness Data and Information Set (HEDIS). These were constructed from a 20 percent sample of fee-for-service Medicare claims for 2009, and included breast cancer screening and recommended testing for cardiac patients and diabetics. These measures were adjusted only for the patient’s sex.
The key predictors of interest were defined at the county level. Rurality was represented by the 2003 version of the Rural-Urban Continuum Code (RUCC) with levels from 0 (central cities of largest metropolitan areas) to 9 (least densely populated rural areas), with odd numbers representing areas not adjacent to a more urban area.1 The Health Professional Shortage Area (HPSA) designation is used to adjust Medicare physician fees at the ZIP code level.2 For this analysis, the committee coded this information by ZIP code into a five-category county-level variable that captures the percentage of the county’s population in HPSA ZIP codes: none (0 percent); between 0 and 20 percent; from 20 to 80 percent; over 80 but not 100 percent; or 100 percent (full-county HPSA). Both the physician practice geographical adjustment factor (GAF) and the physician work geographical practice cost index (GPCI) were used in separate models. The GAF drives geographical variation in total payments to practices for a given service mix, while the work GPCI is the component of the GAF specifically addressing the cost of physician labor; thus the two measures encompass the role of physicians both as operators of businesses and as workers. In most models including GAF or GPCI, the current CMS value for 2010 (excluding frontier floors) is used, on the assumption that this is best related (among current measures) to the historical experience of geographical adjustments potentially affecting the counties. The proposed Institute of Medicine (IOM) factors are used only in calculating the differences representing the potential effects on payment of shifting to the IOM committee’s method.
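The five-category HPSA coding can be sketched as follows. This is an illustration only: the function names and the sample ZIP-level figures are hypothetical, the treatment of values exactly equal to 20 or 80 percent (both placed in the middle category) is an assumption, and the ZIP-to-county translation discussed in Appendix A is not reproduced.

```python
from typing import Iterable, Tuple


def county_hpsa_percent(zips: Iterable[Tuple[float, bool]]) -> float:
    """Percent of county population living in HPSA ZIP codes.

    Each element is (population, is_hpsa) for one ZIP code in the county.
    """
    zips = list(zips)
    total = sum(pop for pop, _ in zips)
    if total <= 0:
        raise ValueError("county population must be positive")
    in_hpsa = sum(pop for pop, is_hpsa in zips if is_hpsa)
    return 100.0 * in_hpsa / total


def hpsa_category(pct: float) -> str:
    """Map a county's HPSA population percentage to one of five categories."""
    if not 0.0 <= pct <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    if pct == 0.0:
        return "0%"
    if pct < 20.0:
        return ">0% to <20%"      # baseline category in Table B-1
    if pct <= 80.0:               # assumption: 20 and 80 fall here
        return "20% to 80%"
    if pct < 100.0:
        return ">80% to <100%"
    return "100%"


# Hypothetical county with two ZIP codes, one of them an HPSA.
pct = county_hpsa_percent([(1000, True), (3000, False)])
print(pct, hpsa_category(pct))
```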
MODEL SPECIFICATION
For each of 16 measures, 13 models were fitted. A baseline model included only the individual-level adjuster variables as regressors. Four models each added a single county-level variable (RUCC, HPSA, GAF, or GPCI) to the model, to assess their distinct associations with the measures. Four additional models combined a descriptive geographical variable (RUCC or HPSA) with a payment factor (GAF or GPCI) to assess whether one acted as a mediator for the other. Finally, four models entered the difference between proposed IOM and current CMS factors (either GAF or GPCI, with or without additional control for RUCC) to assess the relative impact of the change in method on payment in higher- and lower-performing areas.
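The count of 13 specifications per measure can be checked by enumerating them. The sketch below lists only the county-level terms added to the case-mix adjusters; the tuple representation and the `IOM_minus_CMS_` labels are hypothetical conventions chosen for illustration, not the names used in the original analysis.

```python
from itertools import product

DESCRIPTIVE = ["RUCC", "HPSA"]   # descriptive geographic variables
PAYMENT = ["GAF", "GPCI"]        # payment factors


def build_model_specs():
    """Return the county-level terms for each model specification."""
    specs = [()]                                         # baseline: case-mix only
    specs += [(v,) for v in DESCRIPTIVE + PAYMENT]       # four single-variable models
    specs += [(d, p) for d, p in product(DESCRIPTIVE, PAYMENT)]  # four combined models
    for p in PAYMENT:                                    # four difference models
        diff = f"IOM_minus_CMS_{p}"
        specs.append((diff,))
        specs.append((diff, "RUCC"))
    return specs


specs = build_model_specs()
print(len(specs))        # specifications per measure
print(len(specs) * 11)   # specifications across the 11 CAHPS measures
```

The second figure corresponds to the total number of CAHPS models reported in the text.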
______________
1See http://www.ers.usda.gov/briefing/rurality/ruralurbcon/priordescription.htm.
2Also see discussion in Appendix A regarding translation of ZIP codes to county-level data.
All models were specified as multilevel random-effects linear models, reflecting geographical clustering of quality variations. We compared models with a single level of clustering (county) and two levels (county nested within state). For most models there was significant evidence (deviance >3.84) in favor of the latter model, which accounts for clustering at larger as well as smaller scales. We therefore used this specification for all models. Significance of fixed effects was assessed with Wald tests using the robust Huber variance estimator. Across all 143 CAHPS models (13 model specifications for 11 measures), case-mix effects for the two health variables were significant in every model, and effects of age, education, and proxy response were significant in between 104 and 130 models (F test, P <.05).
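The deviance threshold of 3.84 matches the 0.05 critical value of a chi-square distribution with one degree of freedom, which equals the square of the two-sided 0.05 standard normal quantile. The sketch below assumes that this is the reference distribution behind the comparison; the function names and the deviance values in the usage lines are hypothetical.

```python
from statistics import NormalDist


def chi2_1df_critical(alpha: float = 0.05) -> float:
    """Critical value of a 1-df chi-square, computed as a squared normal quantile."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z


def prefers_nested_model(deviance_county_only: float,
                         deviance_county_in_state: float,
                         alpha: float = 0.05) -> bool:
    """True if the drop in deviance from adding the state level exceeds the cutoff."""
    drop = deviance_county_only - deviance_county_in_state
    return drop > chi2_1df_critical(alpha)


print(round(chi2_1df_critical(), 2))
# Hypothetical deviances for a single measure.
print(prefers_nested_model(1000.0, 995.0))
print(prefers_nested_model(1000.0, 998.0))
```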
Univariate Models
Adjusted associations of the GAF with all of the CAHPS measures were negative (although only two were significant), as were 8 out of 11 associations of GPCI with CAHPS measures, all significant (Table B-5, models PFC, PIC). Associations of both GAF and GPCI with cholesterol screening for both cardiac patients and diabetics were positive, although significant only for GAF. These coefficients should be interpreted in relation to the range of these adjustment factors. Multiplying the largest estimated coefficient (39.75 for wait <15 minutes) by the interquartile range (IQR) of county GPCI yields a predicted difference of 39.75 × (1.017 − 0.973) = 1.75 percentage points; similarly, multiplying the largest GAF coefficient (32.98, also for wait <15 minutes) by the IQR yields a predicted difference of 32.98 × (1.032 − 0.938) = 3.10 percentage points. Effects extrapolated over the full range of the payment factors would be about four times as large. Thus, higher payment indices were generally associated with worse scores for patient-reported measures, but with better cholesterol screening.
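The interquartile-range scaling in the preceding paragraph can be rechecked with a few lines of arithmetic. The coefficients and quartiles are those quoted in the text; the function name `iqr_effect` is a hypothetical helper, and the result is reported as a magnitude, as in the text.

```python
def iqr_effect(coefficient: float, q1: float, q3: float) -> float:
    """Magnitude of the predicted difference for a one-IQR change in the factor."""
    return abs(coefficient) * (q3 - q1)


# Wait <15 minutes: coefficients from Table B-5, quartiles as quoted in the text.
gpci_effect = iqr_effect(-39.75, q1=0.973, q3=1.017)
gaf_effect = iqr_effect(-32.98, q1=0.938, q3=1.032)

print(f"GPCI: {gpci_effect:.2f} percentage points")
print(f"GAF:  {gaf_effect:.2f} percentage points")
```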
Global F tests of single geographic variables, relative to the base model, were significant (P <.05) for five measures for HPSA status (have a personal doctor, wait <15 minutes, both immunization measures, breast cancer screening) (Table B-1). The most consistent HPSA effects were for comparison of the 80-100 percent HPSA counties to the (baseline) 0-20 percent HPSA counties, with a mean 1 percentage point lower score for the former group, while non-HPSA (0 percent) counties had slightly higher scores than the baseline group.
Patterns for RUCC were more variable and complex given the greater number of categories. Significant variation across the 10 categories was indicated by the F test for 12 out of 16 measures (Table B-2), including the five measures with significant variation by HPSA category and also specialist appointment, rating of care, doctor communication composite, and the remaining HEDIS items. We used a linear regression of coefficients for each measure, and the mean across measures, on the RUCC (for RUCC 1 to 9) as an ad hoc summary of the trend in beneficiary-reported quality from the most urbanized to the most rural areas. This summary for the mean suggested a weak trend toward lower quality in the more rural areas (Table B-4, first column). The trend in the mean, however, conceals opposite trends in ratings of care and doctor, timely routine care, and doctor communication, which are slightly higher in the more rural areas, and in the other measures, most notably getting needed care and immunizations, which trend lower in more rural counties.
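The ad hoc trend summary amounts to an ordinary least-squares slope of the category coefficients on the RUCC value. A minimal sketch follows, using the “Mean of all items” row as printed in Table B-2. The helper `ols_slope` is hypothetical, and the fit is unweighted; because the appendix does not spell out weighting or other details, this calculation should not be expected to reproduce the slopes in Table B-4 exactly.

```python
from typing import Sequence


def ols_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Unweighted least-squares slope of ys regressed on xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


rucc = list(range(1, 10))  # RUCC 1 to 9
# "Mean of all items" coefficients as printed in Table B-2.
mean_of_items = [0.34, 0.35, 0.40, -0.83, -0.12, -0.16, -1.06, -2.18, 0.35]

slope = ols_slope(rucc, mean_of_items)
print(f"{slope:.3f} percentage points per RUCC step")
```

A negative slope indicates lower scores toward the rural end of the scale, which agrees in direction with the weak trend described in the text.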
ASSESSING THE MEDIATING ROLE OF PAYMENT RATES
We assessed mediation by adding controls for GPCI or GAF to models for RUCC or HPSA effects. If quality differences among areas along the urbanicity/rurality dimension were due to the effects of differences in the corresponding payment factors, we would expect that controlling for payment factors would reduce the variations across the corresponding categories; a similar argument could be made for HPSA categories.
HPSA coefficients without payment factor controls are compared to those with either GAF or GPCI controls in Table B-3, which shows the standard deviation of coefficients (representing adjusted differences). This measure increases and decreases with GAF or GPCI control for approximately equal numbers of measures, and on average it is almost the same after control for either of the adjustment factor variables; this is contrary to the reduction that might be expected under the hypothesis that adjustment factors mediated HPSA effects.
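The variation summary can be sketched as the standard deviation of the category coefficients for one measure. Two choices here are assumptions rather than documented details: the baseline category is included with a coefficient of zero, and the sample (n − 1) formula is used. With those choices, the “Have personal doctor” row of Table B-1 gives a value of about 1.05, close to the 1.047 shown for model PH in Table B-3. The helper name `coefficient_sd` is hypothetical.

```python
from statistics import stdev
from typing import Sequence


def coefficient_sd(coefficients: Sequence[float],
                   include_baseline: bool = True) -> float:
    """Sample standard deviation of category coefficients.

    The baseline category enters as 0.0 when include_baseline is True
    (an assumption about how the published summary was computed).
    """
    values = ([0.0] if include_baseline else []) + list(coefficients)
    return stdev(values)


# "Have personal doctor" HPSA coefficients from Table B-1 (model PH).
personal_doctor = [0.09, -1.87, -1.84, -1.88]
sd = coefficient_sd(personal_doctor)
print(f"{sd:.3f}")
```

Under a mediation hypothesis, repeating this calculation on coefficients from a model that also controls for GAF or GPCI would be expected to give a smaller value.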
RUCC coefficients without payment factor controls are compared to those with either GAF or GPCI controls in Table B-4. The left-hand columns compare the slopes across RUCC categories in the three models, to examine ordered trends. These do not show a reduction in trends relating measures to rurality; on the contrary, the results are quite mixed, but the magnitude of the changes is generally small. The right-hand columns quantify the variation among RUCC categories, without regard to order, by the standard deviation of the 10 RUCC coefficients. The magnitude of variation increases for the majority of measures when the GAF or GPCI variable is added to the model. As with the HPSA comparisons, these are opposite the effects that might be predicted under a mediation hypothesis.
Looking at mediation from the opposite perspective, we considered whether RUCC or HPSA is a mediator of the paradoxical GAF and GPCI effects. As shown in Table B-5, controlling for RUCC or HPSA status does not remove the negative associations of GAF or GPCI with quality reports. Indeed, more often than not it strengthened the associations, especially for GPCI. After control for RUCC, GPCI was significantly and negatively associated with 14 of our 16 measures. Effect sizes for a one-IQR difference in GPCI, relative to total county variation, exceeded one-half for three measures (ratings of doctor and care, and pneumonia immunization). Like our previous results for HPSA and RUCC coefficients, this was opposite to predictions that might be expected under a mediation hypothesis.
ASSOCIATIONS OF QUALITY MEASURES WITH
PROPOSED CHANGES IN GAF OR GPCI
We supplemented our impact analysis by examining associations of the proposed changes in payment factors with beneficiary-reported quality assessments. As shown in Table B-6, these coefficients are mixed in sign and mostly not significant. Indeed, for the clinical measures (immunizations and HEDIS), the coefficients are mostly opposite in sign for GAF (all positive) and GPCI (mostly negative). After controlling for RUCC, only four coefficients for GAF and none for GPCI remain significant. Furthermore, the IQR of the GAF or GPCI changes is only about one-third that of current levels, further attenuating estimates of the differences in current quality associated with projected changes in the cost index.
Thus, this analysis provides little evidence to suggest that the committee’s proposed revisions would systematically favor areas now experiencing either superior or inferior performance.
REFERENCES
Goldstein, E., P. D. Cleary, K. M. Langwell, A. M. Zaslavsky, and A. Heller. 2001. Medicare managed care CAHPS: A tool for performance improvement. Health Care Financing Review 22(3):101-107.
Zaslavsky, A. M., L. B. Zaborski, L. Ding, J. A. Shaul, M. J. Cioffi, and P. D. Cleary. 2001. Adjusting performance measures to ensure equitable plan comparisons. Health Care Financing Review 22(3):109-126.
TABLE B-1 Coefficients of HPSA Category Dummies
County Percent HPSA | |||||
Measure | 0% | 20% to 80% | 80% to <100% | 100% | F-test P-value |
Have personal doctor | 0.09 | -1.87- | -1.84+ | -1.88 | 0.1235+ |
Timely routine care | 1.11* | -0.02 | -0.25 | -1.86 | 0.1165 |
Timely care in illness | 1.40 | 4.06- | 1.93 | 0.45 | 0.0245 |
Wait <15 minutes | 0.78 | 0.17 | -0.91 | -5.76+ | 0.8929- |
Easy specialist appointment | 0.22 | -1.37 | -0.07 | -0.30 | 0.1097 |
Rating of care overall | 0.00 | 1.39 | -2.15- | 1.91 | 0.6379 |
Rating of doctor | 0.32 | 1.25 | -1.10 | 2.58 | 0.5376 |
Doctor communication | 0.78 | 0.44 | 1.03 | 3.02 | 0.3147 |
Get needed care | -1.07 | 1.04 | -1.00 | 1.08 | 0.0004 |
Influenza immunization | 0.15 | -2.51- | -3.56+ | -1.60 | 0.0110# |
Pneumovax immunization | -0.07 | -4.03+ | -2.12 | -1.84 | 0.0000* |
Breast cancer screening | -0.26 | -0.85 | -3.060 | -2.72* | 0.1864S |
Cholesterol screening (cardiac) | 0.50- | -0.15 | -0.20 | -0.72 | 0.8783 |
Cholesterol screening (diabetic) | 0.21 | 0.23 | -0.20 | -0.85 | 0.4139 |
Hemoglobin A1c test (diabetic) | -0.16 | -0.33 | -0.56 | -1.57 | 0.0936 |
Retinal exam (diabetic) | -0.12 | -1.48+ | -0.84 | -0.19 | 0.2715 |
Mean of all items | 0.24 | -0.25 | -0.93 | -0.64 | 0.1235 |
NOTES: Coefficients are in percentage point units and are relative to baseline category, HPSA coverage >0 percent but <20 percent. Model PH includes only case-mix adjusters and HPSA dummies. Significance indications: * = p <.05; + = p <.01; # = p <.001. HPSA = Health Professional Shortage Area.
TABLE B-2 Coefficients of RUCC Category Dummies
Rural-Urban Continuum Code | ||||||||||
Measure | RUCC1 | RUCC2 | RUCC3 | RUCC4 | RUCC5 | RUCC6 | RUCC7 | RUCC8 | RUCC9 | F-test P-value |
Have personal doctor | 0.76 | 0.41 | 0.03 | -0.69 | 0.09 | -1.21* | -1.89* | -3.07* | -4.00+ | 0.0003# |
Timely routine care | -1.66 | -0.79 | -1.49 | -3.12+ | -1.42 | -0.69 | -2.66* | -3.15 | 0.05 | 0.1527 |
Timely care in illness | 1.83 | 1.51 | 1.28 | -1.42 | 1.39 | 2.97 | -0.80 | -1.58 | 4.62 | 0.6027 |
Wait <15 minutes | 1.03 | 3.96+ | 2.96 | -0.33 | 3.92 | 2.35 | 0.29 | 0.50 | 1.01 | 0.0006* |
Easy specialist appointment | 2.01 | -1.89* | -0.56 | -0.14 | -0.06 | 0.67 | -1.48 | -3.96 | -0.74 | 0.0222* |
Rating of care overall | -2.09 | 0.75 | 0.74 | -0.61 | 0.66 | 1.40 | -1.49 | -4.04 | 4.34* | 0.0002* |
Rating of doctor | -1.64 | -1.52 | -0.53 | -2.52* | -2.47 | -0.28 | -0.72 | -2.15 | 1.28 | 0.2304 |
Doctor communication | 0.95 | -0.70 | -0.23 | -0.69 | 1.55 | 2.26* | 1.86 | -2.64 | 6.63+ | 0.0042+ |
Get needed care | 3.24+ | 0.98 | 2.03 | 1.55 | 0.73 | 2.79+ | 1.29 | -2.13 | -0.67 | 0.1577 |
Influenza immunization | -0.30 | 0.62 | 0.40 | -2.05 | -1.11 | -3.36# | -2.61* | -5.19+ | -1.54 | 0.0006# |
Pneumovax immunization | 2.08 | 2.24* | 1.87* | -0.24 | 1.18 | -2.32 | -1.12 | -3.17 | -1.76 | 0.0001# |
Breast cancer screening | -0.82 | 0.54 | 1.45* | -0.02 | 0.04 | -1.95+ | -1.36 | -2.24* | 0.03 | 0.0000# |
Cholesterol screening (cardiac) | 0.26 | -0.86 | -1.17+ | -0.58 | -2.53# | -1.41+ | -2.39# | -1.56* | -2.48* | 0.0006# |
Cholesterol screening (diabetic) | 0.69 | -0.45 | -.18+ | -1.52+ | -2.71+ | -1.73# | -2.39# | -1.38* | -2.15+ | 0.0000# |
Hemoglobin A1c test (diabetic) | 0.23 | -0.12 | 0.14 | -0.75 | -0.48 | -0.70 | -0.69 | -0.35 | 0.02 | 0.0057+ |
Retinal exam (diabetic) | -1.17* | 0.96 | 0.63 | -0.08 | -0.70 | -1.37+ | -0.80 | 1.23 | 1.04 | 0.0000# |
Mean of all items | 0.34 | 0.35 | 0.40 | -0.83 | -0.12 | -0.16 | -1.06 | -2.18 | 0.35 | 0.0736 |
NOTES: Coefficients are in percentage point units and are relative to the baseline category, RUCC 0. Model PR includes only case-mix adjusters and RUCC dummies. Significance indications: * = p <.05; + = p <.01; # = p <.001. RUCC = Rural-Urban Continuum Code.
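TABLE B-3 Standard Deviation of HPSA Category Coefficients, Without Controls and With Controls for GPCI or GAF
Standard Deviation of Coefficients | |||
Measure | PH (HPSA only) | PIH (& GPCI) | PFH (& GAF) |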
Have personal doctor | 1.047 | 1.072 | 1.101 |
Timely routine care | 1.067 | 1.039 | 1.075 |
Timely care in illness | 1.587 | 1.565 | 1.569 |
Wait <15 minutes | 2.649 | 2.853 | 2.725 |
Easy specialist appointment | 0.626 | 0.688 | 0.659 |
Rating of care overall | 1.576 | 1.603 | 1.612 |
Rating of doctor | 1.382 | 1.369 | 1.382 |
Doctor communication | 1.164 | 1.101 | 1.143 |
Get needed care | 1.049 | 0.967 | 0.999 |
Influenza immunization | 1.600 | 1.602 | 1.747 |
Pneumovax immunization | 1.669 | 1.808 | 1.856 |
Breast cancer screening | 1.420 | 1.403 | 1.431 |
Cholesterol screening (cardiac) | 0.439 | 0.254 | 0.385 |
Cholesterol screening (diabetic) | 0.442 | 0.341 | 0.434 |
Hemoglobin A1c test (diabetic) | 0.620 | 0.594 | 0.626 |
Retinal exam (diabetic) | 0.624 | 0.631 | 0.618 |
Mean of all items | 1.185 | 1.181 | 1.210 |
NOTES: Entries summarize the variation among HPSA categories by the standard deviation of category coefficients, in percentage point units. GAF = geographic adjustment factor; GPCI = geographic practice cost index; HPSA = Health Professional Shortage Area.
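TABLE B-4 Slope and Standard Deviation of RUCC Category Coefficients, Without Controls and With Controls for GPCI or GAF
Slope from RUCC=1 to RUCC=9 | Standard Deviation of Coefficients | |||||
Measure | PR (RUCC only) | PIR (& GPCI) | PFR (& GAF) | PR (RUCC only) | PIR (& GPCI) | PFR (& GAF) |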
Have personal doctor | -0.352 | -0.383 | -0.338 | 1.581 | 1.885 | 1.638 |
Timely routine care | 0.089 | 0.068 | 0.097 | 1.181 | 1.318 | 1.252 |
Timely care in illness | -0.034 | -0.036 | -0.035 | 1.972 | 1.970 | 1.977 |
Wait <15 minutes | -0.254 | -0.298 | -0.211 | 1.608 | 1.659 | 1.525 |
Easy specialist appointment | -0.184 | -0.232 | -0.180 | 1.600 | 1.960 | 1.685 |
Rating of care overall | 0.110 | 0.080 | 0.112 | 2.251 | 2.202 | 2.225 |
Rating of doctor | 0.205 | 0.157 | 0.205 | 1.224 | 1.342 | 1.228 |
Doctor communication | 0.236 | 0.232 | 0.238 | 2.488 | 2.470 | 2.464 |
Get needed care | -0.365 | -0.409 | -0.352 | 1.610 | 1.864 | 1.648 |
Influenza immunization | -0.269 | -0.305 | -0.246 | 1.845 | 2.198 | 2.019 |
Pneumovax immunization | -0.473 | -0.543 | -0.448 | 1.940 | 2.611 | 2.196 |
Breast cancer screening | -0.126 | -0.134 | -0.118 | 1.150 | 1.192 | 1.161 |
Cholesterol screening (cardiac) | -0.121 | -0.092 | -0.145 | 1.003 | 0.635 | 0.912 |
Cholesterol screening (diabetic) | -0.126 | -0.112 | -0.136 | 1.082 | 0.867 | 1.047 |
Hemoglobin A1c test (diabetic) | -0.022 | -0.027 | -0.010 | 0.370 | 0.440 | 0.406 |
Retinal exam (diabetic) | 0.068 | 0.054 | 0.064 | 0.960 | 0.962 | 0.961 |
Mean of all items | -0.101 | -0.124 | -0.094 | 1.491 | 1.598 | 1.522 |
NOTES: Entries summarize the variation among RUCC categories by the slope of the trend line across categories 1 to 9 (left three columns) and the standard deviation of category coefficients (right three columns), on a scale of percentage points. GAF = geographic adjustment factor; GPCI = geographic practice cost index; RUCC = Rural-Urban Continuum Code.
TABLE B-5 Coefficients of GAF and GPCI, Without Controls and With Controls for HPSA or RUCC
GAF Coefficients | GPCI Coefficients | |||||
Measure | PFC (only case-mix controls) | PFH (also controlled for HPSA) | PFR (also controlled for RUCC) | PIC (only case-mix controls) | PIH (also controlled for HPSA) | PIR (also controlled for RUCC) |
Have personal doctor | -4.32 | -6.49* | -8.80* | 2.99* | -3.580 | -29.30# |
Timely routine care | 1.18 | -0.90 | -7.62 | 9.56 | 5.19 | -18.91# |
Timely care in illness | -4.82 | -5.98 | 0.86 | -11.34# | -12.67# | -1.33# |
Wait <15 minutes | -32.98 | -37.49 | -27.02 | -39.75# | -52.32# | -45.04# |
Easy specialist appointment | -3.11 | -4.00 | -8.87 | -9.89# | -12.45# | -36.05# |
Rating of care overall | -7.45 | -8.40* | -7.69 | -18.25# | -20.83# | -21.64# |
Rating of doctor | -0.89 | -1.57 | -7.27 | -7.17# | -8.59# | -32.13# |
Doctor communication | -4.86 | -5.85 | -2.58 | -11.77# | -12.59# | -3.14# |
Get needed care | -16.80# | -15.87+ | -13.67- | -33.60# | -31.748 | -35.72# |
Influenza immunization | -8.36 | -11.98 | -18.43+ | 9.04 | -0.23# | -32.44# |
Pneumovax immunization | -20.58* | -23.82+ | -27.65# | -18.16# | -25.76# | -58.53# |
Breast cancer screening | -1.26 | -4.07 | -4.90 | 12.33 | 4.89- | -9.84# |
Cholesterol screening (cardiac) | 18.230 | 17.23# | 14.01+ | 42.36 | 40.83 | 37.16 |
Cholesterol screening (diabetic) | 13.33+ | 12.87+ | 5.84 | 45.09 | 44.84 | 17.13 |
Hemoglobin A1c test (diabetic) | -3.18 | -3.68 | -7.08* | 9.51 | 7.83 | -6.78# |
Retinal exam (diabetic) | 2.94 | 2.19 | 2.38 | -0.72# | -2.97# | -17.67# |
Mean of all items | -4.56 | -6.11 | -7.41 | -1.24 | -5.01 | -18.39 |
NOTES: On scale of percentage points. Significance indications: * = p <.05; + = p <.01; # = p <.001. GAF = geographic adjustment factor; GPCI = geographic practice cost index; HPSA = Health Professional Shortage Area; RUCC = Rural-Urban Continuum Code.
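TABLE B-6 Coefficients of Proposed Changes in GAF and GPCI, Without Controls and With Controls for RUCC
Coefficient of Change in GAF | Coefficient of Change in GPCI | |||
Measure | PFD (only case-mix controls) | PFDR (also controlled for RUCC) | PID (only case-mix controls) | PIDR (also controlled for RUCC) |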
Have personal doctor | 24.85# | 18.26* | -19.70 | 20.53 |
Timely routine care | 13.76 | 8.32 | -12.16 | 19.47 |
Timely care in illness | -6.05 | -6.87 | 24.99 | 36.93 |
Wait <15 minutes | -0.16 | 0.70 | -8.45 | -21.49 |
Easy specialist appointment | 6.23 | 2.21 | 23.81 | 56.14 |
Rating of care overall | -4.00 | -0.15 | 17.50 | 7.25 |
Rating of doctor | -8.48 | -16.02 | 16.20 | 40.62 |
Doctor communication | -15.25 | -10.67 | 29.52 | 15.33 |
Get needed care | 9.40 | 18.96 | 6.98 | -1.46 |
Influenza immunization | 20.31# | 5.77 | -63.76+ | -23.80 |
Pneumovax immunization | 18.76* | 7.86 | -60.82+ | -31.52 |
Breast cancer screening | 12.34 | 5.62 | -20.19 | 4.80 |
Cholesterol screening (cardiac) | 22.64# | 17.03+ | -40.83# | -19.21 |
Cholesterol screening (diabetic) | 25.51# | 10.66* | -51.59# | -10.19 |
Hemoglobin A1c test (diabetic) | 13.69+ | 14.07+ | -18.96' | -6.55 |
Retinal exam (diabetic) | 4.61 | 0.98 | 2.99 | 24.47 |
Mean of all items | 8.64 | 4.80 | -10.90 | 6.96 |
NOTE: On scale of percentage points. Significance indications: * = p <.05; + = p <.01; # = p <.001. CMS = Centers for Medicare & Medicaid Services; GAF = geographic adjustment factor; GPCI = geographic practice cost index; IOM = Institute of Medicine; RUCC = Rural-Urban Continuum Code.