Kidney Failure and the Federal Government
D
Survival Analysis Methods for the End-Stage Renal Disease (ESRD) Program of Medicare
ROBERT A. WOLFE, Ph.D.*
* Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
This document addresses two specific areas related to survival analysis of Medicare ESRD data. The first focus is a critical review of methods of survival analysis for Medicare ESRD data, including methods used in the past by HCFA and the USRDS, methods that have been proposed by other members of the renal community, and methods that are potentially useful for future analyses. The second focus is a review of the results of international comparisons of mortality rates with the objective of determining what conclusions can be drawn from such comparisons. While this paper gives a critical review of methods of analysis of ESRD mortality data, it does not report the results of any new analyses of empirical data. Instead, this paper is intended to help in the interpretation of the results of previous data analyses and to give directions for future analyses and data collection that address some of the limitations of the research carried out to date.
Although a review of statistical methods must unavoidably involve some degree of abstraction, I have tried to tie abstract concepts to specific issues whenever possible.
The analysis of mortality among ESRD patients is complicated by the nature of the data available for analysis and by the heterogeneity of the patients receiving therapy. These issues are addressed in some detail in this paper in order to show how to avoid potential limitations of analyses and errors in their interpretation. Some of the specific issues are listed below.
The data for ESRD patients are complicated because they are collected over time and involve a sequence of events for each subject being studied. The sequence of relevant medical information for some of the subjects in the data set may be incompletely documented. For example, patient follow-up of younger ESRD patients is not typically tracked in the Medicare data system until 90 days after first therapy for ESRD. Analyses of such incomplete data are susceptible to subtle forms of bias.
Patients with treated ESRD exhibit a wide variety of characteristics that influence mortality patterns. An evaluation of the importance of any one of these characteristics must also account for the potential impact on the results of the other characteristics.
The Medicare data collection system was designed primarily for reimbursement purposes and, consequently, has some limitations for research purposes. Some patient characteristics related to mortality, such as previous medical history, are not regularly recorded in it. Other characteristics, such as patient treatment history, are derived from billing records rather than from dedicated data collection instruments and, consequently, are subject to error.
Interpretation of the results of mortality analyses is complicated by the variety of analytical methods and types of numerical summaries that can be reported. Analytical methods include adjusted and unadjusted results, cross-tabulations and multiple regression models, parametric and nonparametric methods, Cox models and logistic regression models, and other methods discussed below. The results of statistical analyses can be summarized as death rates, death proportions, mortality ratios, and expected lifetimes.
An overview of several crucial issues central to the analysis of mortality data is presented in the first section as a series of questions. Each question is followed by some of the issues that should be addressed when answering the question. These issues recur in more specific forms in subsequent sections of the document. In the second section, general strategies for adjusting statistical analyses for patient characteristics are discussed.
Patient characteristics that are currently measured or that would be useful to measure are examined, and two approaches toward adjusting statistical analyses for patient characteristics are provided. In the third section, several methods of survival analysis that are relevant to ESRD data are reviewed with the intent of showing how to compare, interpret, and synthesize the results of survival analyses. Most of the methods reviewed in this section have been used, or proposed for use, by other members of the renal research community. Each method has qualities that make it appropriate for specific purposes. Some proposals are also made in this section for analysis methods that have not yet been widely used in renal research. In addition, several different numerical parameters that are used to summarize the results of survival analyses are discussed. The fourth section reviews several problems associated with the interpretation of international comparisons of mortality rates. Although this section focuses on the analysis recently reported by Held et al. (1990), the issues are largely relevant to any international comparisons.
GENERAL ISSUES IN SURVIVAL ANALYSIS
Overview
Survival analysis of ESRD patient data can help to identify factors, such as etiology, that are related to differences in patient mortality. For example, a comparison of the one-year survival proportions for diabetic and nondiabetic patients shows that mortality rates differ by etiology. With the identification of such factors, survival analysis also yields estimates of the magnitude of the differences in survival associated with those factors.
Although the comparison of mortality figures for two groups of patients gives a direct evaluation of the importance of the factor distinguishing the two groups, it seldom leads to a complete understanding of the mechanisms causing the difference. Since any two groups of patients may differ with respect to several factors simultaneously, it is of some interest to determine how much effect any one factor would have on mortality, if all other factors were held constant. For example, the average age among diabetic and nondiabetic patients is different, so we want to know by how much the mortality rates would differ for diabetic and nondiabetic patients, if the ages were similar in the two groups. Survival analysis can help to answer hypothetical questions such as "By how much would the mortality patterns in two groups of patients differ, if they were to differ from each other with respect to only one characteristic at a time?" Although much of this appendix describes examples related to the study of mortality differences for several treatment groups, the concepts and statistical methods that are presented apply equally well to the comparison of outcomes for groups of patients defined by other characteristics.
Survival analysis can only yield results concerning factors that are measured and recorded in the available data. Unfortunately, some of the most important determinants of survival may not be recorded in the Medicare data base. Survival analysis cannot account for the potential effect of unmeasured or unmeasurable factors on patient mortality. Many of the important questions related to policy and scientific research involve factors that have not or cannot be measured. In such cases, expert opinion blended with indirect or imperfect evidence must be relied upon. Although decisions that are based on inconclusive evidence can be wrong, the available data still should be evaluated and weighed carefully when making policy decisions.
Sir Ronald Fisher, one of the great statisticians and scientists of this century, argued throughout much of his life that there was no definitive evidence showing that tobacco smoking causes lung cancer. He argued that there could be an underlying factor that caused certain individuals both to smoke and to be more susceptible to lung cancer. Although his reasoning was correct (the evidence is not conclusive unless a randomized controlled clinical trial is performed), the resulting policy decision based on his reasoning would undoubtedly have been a poor one.
Statistical analysis is just one tool used in the weighing of empirical evidence. The process involves the formulation of a question, the assembling of relevant data to address the question, and careful interpretation of the results of analysis of the data. An evaluation of the strengths and weaknesses of the conclusions derived from the analysis sometimes leads to a reformulation of the question, collection of new data, or reanalysis of the data.
The choice of appropriate method of statistical analysis cannot be discussed in isolation. Just as the specific question that we want to answer determines the way in which we try to answer it, the choice of appropriate statistical methodology depends strongly upon the specific purposes that motivate the analysis.
Examples
Several major types of research objectives and questions can be addressed through the collection and analysis of national ESRD data. The examples listed below were selected to highlight specific issues in the appropriate interpretation of survival analysis results. Some of the examples are based on hypothetical or simplistic situations but illustrate fundamental issues that arise in more realistic situations, as well.
The examples show that statistical comparisons are central to the interpretation of statistical analyses. Reduction and avoidance of bias, through the selection of appropriate comparison groups and the control of confounding factors, are fundamentally important to most analytic research.
Minimizing and evaluating the impact of random variability on the results of research is also important. The selection of methods that yield interpretable numerical summaries is an essential aspect of the dissemination of results.
Identification of the Study Population
What is the death rate among ESRD patients?
The death rate among untreated ESRD patients is very high; with no kidney function, they will surely die within days. Data are available in the Medicare system for treated ESRD patients after they become eligible for Medicare payments. However, victims with undiagnosed kidney disease are not counted in the Medicare data base. Further, because of eligibility requirements, the first 90 days of ESRD therapy for many younger patients in the United States are not captured by the Medicare data system. Among the elderly, the fraction of deaths attributable to withdrawal from therapy can be substantial (10 percent or more), but these deaths are included in most reports of ESRD mortality. Even among treated ESRD patients, the length of survival of patients with some residual kidney function is likely to depend upon the amount of impairment and the rate of progression of the disease.
If the deaths of never-treated ESRD victims were included in reported mortality rates, then the death rates likely would be elevated above those currently reported. In contrast, if the deaths among those withdrawing from treatment were excluded, then the death rates would likely be substantially lower than those currently reported. Different definitions distinguishing between reduced kidney function and ESRD could also have a substantial effect on reported death rates among the ESRD population. Thus, any evaluation of mortality rates among ESRD patients must identify which patient population is being considered. Without such identification, it is difficult to compare different mortality results.
Evaluation of the death rate in a group can depend strongly upon who is included in the group and which period of patient follow-up is included in the evaluation.
The Importance of a Comparison Group
Are mortality rates among treated ESRD patients very high?
Mortality rates among dialyzed ESRD patients in the United States are typically 24 percent per year (USRDS, 1990, p. E.31). Although of some value in isolation, this fact is of most interest when compared to other death rates. For example, the 24 percent annual death rate among dialyzed ESRD patients is exceptionally high in comparison to that of a non-ESRD population with the same age distribution (approximately 2 percent per year).
However, the discrepancy would be somewhat smaller if the comparison were made to a non-ESRD population with the same history of diabetes and hypertension as is found among incident ESRD patients. This comparison would be especially useful for the evaluation of programs designed to prevent the progression of hypertension and diabetes to ESRD.
Although death rates among treated ESRD patients are high relative to those in a healthy population, they are low compared to those among untreated ESRD victims. Furthermore, the comparison of death rates among ESRD patients receiving different forms or amounts of therapy would be useful for comparing the relative efficacy of the various therapies.
When evaluating ESRD death rates, it is useful to make comparisons to other death rates.
Biased Comparisons
Spurious Differences
Is the death rate for dialyzed ESRD patients higher with continuous ambulatory peritoneal dialysis (CAPD) or with center hemodialysis (CH)?
CAPD was preferentially given to insulin-dependent diabetic ESRD patients in some regions of the United States in the early 1980s because it offers a convenient method for administration of insulin to the patient. Because of this, diabetes is more prevalent among CAPD patients than among CH patients in the United States. Thus, even if CAPD and CH were inherently equally efficacious therapies for ESRD (analyses to date have been inconclusive), the high death rate among diabetic ESRD patients and the higher prevalence of diabetes among CAPD patients would cause unadjusted death rates to be higher among CAPD patients than among CH patients. Comparison of crude (unadjusted for patient characteristics) death rates for CAPD and CH patients would not account for this fact and would erroneously lead to the conclusion that death rates were higher among CAPD patients than among CH patients.
Observed differences in outcome may be due entirely to differences in patient characteristics.
Biased Comparison and Age-Specific Comparisons and Inappropriate Comparison Group
How much lower are death rates among transplant patients than among dialysis patients?
The death rate is typically 6 percent per year among ESRD patients with transplants (USRDS, 1990, p. E.40) and is typically 24 percent per year among dialysis patients (p. E.32). The death rate among transplant recipients is much lower than among dialyzed ESRD patients. However, transplant recipients are also younger, on average, than dialyzed patients. Comparison of the age-specific 5-year survival probabilities for dialyzed patients and transplant recipients indicates a substantial variation in death rates with the age of the patient (p. E.32, E.40).
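The mechanism behind both comparisons above, a crude rate that mixes patient groups with different underlying risks, can be sketched with invented numbers. In the minimal Python sketch below, every count and rate is hypothetical (none is a USRDS figure), and the names `STRATUM_RATE`, `PATIENTS`, and `crude_rate` are illustrative choices. Diabetic and nondiabetic patients are assumed to have identical death rates under CAPD and CH, yet the crude CAPD rate comes out higher because a larger share of the CAPD patients are diabetic.

```python
# Hypothetical illustration of confounding by diabetes status.
# All numbers are invented for the sketch; none come from USRDS reports.

# Annual death rates by diabetes status, assumed identical for both therapies.
STRATUM_RATE = {"diabetic": 0.35, "nondiabetic": 0.20}

# Hypothetical patient counts per therapy and diabetes status.
PATIENTS = {
    "CAPD": {"diabetic": 600, "nondiabetic": 400},
    "CH": {"diabetic": 250, "nondiabetic": 750},
}


def crude_rate(counts, stratum_rate):
    """Overall death rate: expected deaths divided by total patients."""
    deaths = sum(n * stratum_rate[group] for group, n in counts.items())
    return deaths / sum(counts.values())


crude = {
    therapy: crude_rate(counts, STRATUM_RATE)
    for therapy, counts in PATIENTS.items()
}

for therapy, rate in crude.items():
    print(f"{therapy}: crude annual death rate = {rate:.3f}")
```

Comparing the two therapies within each diabetes stratum recovers the equality that the crude rates hide; the same logic applies when the stratifying characteristic is age rather than diabetes.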
A large part of the difference in overall (crude) death rates between transplant recipients and dialyzed patients is therefore due to the difference in ages between the two groups.
Part of an observed difference in outcomes may be due to differences in patient characteristics. Comparison of specific mortality rates for homogeneous subgroups of patients yields a less biased evaluation.
In addition to the effect of age, which is known and recorded for each patient, other differences between dialyzed patients and transplant recipients may account for part of the difference in death rates for these two treatment modalities. Dialyzed ESRD patients who are on the transplant waiting list are distinguished, in a variety of ways, from dialyzed ESRD patients who are not on the waiting list. Some of these distinctions may be difficult to identify or measure. Therefore, in order to compare the death rates of dialyzed patients and transplant recipients, it would be more appropriate to use the survival experience of dialyzed patients who are on the waiting list rather than that of all dialyzed patients.
Comparison groups should be similar in all aspects other than the characteristic being studied, especially in those other aspects that cannot be measured.
Bias Due to Unobserved Factors
How can a new therapy for ESRD patients be evaluated without bias?
The ultimate evaluation of a treatment protocol should be based upon how well it works and how much it costs in comparison to other protocols. Specific outcomes (mortality is an overriding outcome) can be selected as the basis for comparison of two protocols, and the outcomes can be evaluated for two series of patients treated according to the two protocols. Quantitative comparison of the outcomes for the two protocols leads to conclusions about the size of the difference in patient outcomes, which then can be evaluated in comparison to the relative costs of the protocols.
An observed difference in outcomes cannot be ascribed to the difference in treatments unless the two patient groups are equivalent in other major regards. The interpretation of the differences should account for any preexisting differences between the two groups of patients.
Ideally, evaluation of a new treatment should be based on the comparison of patient outcomes under the new and old treatment regimens, all else equal. This ideal can never be realized because it would require two identical series of patients for study. In practice, during patient enrollment, the two treatment groups can be deliberately balanced for important factors that are likely to have substantial effects on patient outcome.
However, in order to ensure balance with respect to unknown factors, the only practical solution is randomization (Campbell and Stanley, 1963).
Randomization can be used to control for unforeseen differences between patient groups. The controlled randomized clinical trial offers the only study design that can be guaranteed (with high probability) to be free of bias. All other study designs are subject to potential bias because the groups being compared may differ with respect to several factors that affect patient outcome.
Interpreting Standard Errors for Population Data
When there is no sampling error because a statistic is reported for all ESRD patients, how should the standard error be interpreted?
The standard error does not reflect uncertainty about the specific population being described, since the statistic precisely summarizes the experience of the whole population. However, results from apparently similar populations do vary from each other, and this variability is reflected in the standard error. For example, even in an apparently stable population, the number of deaths varies from day to day. There is a corresponding variability in the number of events that occur each year as well. The amount of variability in the number of certain types of events, such as ESRD incidence and death, is closely approximated by the Poisson distribution. The standard error of the statistic reflects the amount of variability that typically occurs in the value of the statistic upon repeated observations of similar populations.
The value of the standard error of a statistic measures the typical amount of variability due to random causes that occurs in the value of the statistic.
Accounting for Random Variation
Among ESRD patients over age 65, the proportion of patients surviving for 4.75 years decreased from 20.6 percent to 16.3 percent between 1982 and 1983 (USRDS, 1989, p. D.21). Does this signify a trend for this age group?
Even in otherwise similar populations, incidence counts and death counts vary from year to year apparently because of random variation. A calculation using the standard errors (USRDS, 1989, p. D.22) indicates that, even in two stable populations with identical death rates, a difference in survival proportions bigger than 4.3 percent would be likely to occur frequently just by chance. The expression P > .20 indicates that a difference as large as that observed could occur between successive years more than 20 percent of the time, just by chance, for two groups of patients with identical death rates. When a difference could plausibly occur by random chance, the difference is called insignificant. A difference that would not likely occur by chance is called a significant difference. The probability that a difference as big as or bigger than the one observed could occur by chance is called the P-value of the difference.
(P-values less than 5 percent are labeled as significant whereas those greater than 5 percent are labeled as insignificant.) Using this criterion, the difference reported above is insignificant. The reported difference could represent a trend, or it could represent random variability. In this case, the change was noted for a single age group between two consecutive years. On the basis of those limited data, there is no way to know whether it is a trend or a chance occurrence. However, if the trend persisted in subsequent years, or was also seen in other age groups, then the P-value could be recomputed taking into account all the evidence. If the resulting P-value was small, then it would be unlikely that the difference had occurred by chance, so it would be more plausible that the difference represented a true trend. If the evidence from other age groups and years was inconsistent with the trend noted previously, then the recomputed P-value would tend to remain large and the difference could plausibly be attributed to chance.
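The P-value calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two yearly estimates are treated as independent, a normal approximation is used, and the standard errors of 2.5 percentage points are hypothetical placeholders (the values printed in the USRDS tables are not reproduced in this text). The function names are illustrative as well.

```python
import math


def poisson_standard_error(count):
    """Approximate standard error of an event count under a Poisson model."""
    return math.sqrt(count)


def two_sided_p_value(estimate_a, estimate_b, se_a, se_b):
    """Normal-approximation P-value for the difference of two independent estimates."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = abs(estimate_a - estimate_b) / se_diff
    # erfc(z / sqrt(2)) equals the two-sided tail area of a standard normal.
    return math.erfc(z / math.sqrt(2))


# Survival percentages quoted in the text; the standard errors are HYPOTHETICAL.
p_value = two_sided_p_value(20.6, 16.3, se_a=2.5, se_b=2.5)
print(f"two-sided P = {p_value:.2f}")
```

With standard errors of this assumed size, a 4.3 percentage point difference yields a P-value above .20, which is the situation the text describes; smaller standard errors would make the same difference significant.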
When a large change is noted that can be plausibly ascribed to chance, it is prudent to look for further evidence in order to determine whether the change represents a true trend. If a trend over time has been found to be significant, then the next step would be to try to determine what was causing the trend.
Random occurrences can lead to apparently substantial differences in outcomes, especially with a small, narrowly defined study group. Statistical and probabilistic evaluation of the chances of such differences can help to distinguish unimportant random fluctuations from more persistent patterns.
Important Versus Significant
Is it important that the difference between 5-year survival probabilities of 0.388 for females and 0.411 for males (averaged for 1977–83; see USRDS, 1989, p. D.21) is not likely to have occurred by chance (approximate P < .10)?
The difference (2.1 percent) is small relative to the survival probabilities (average 40 percent) and is therefore uninteresting. Even a small difference will be statistically significant if there are enough data documenting the consistency of the difference. In this case, the difference is statistically significant because the sample size is so large (all ESRD data for patients incident between 1977 and 1983). Although the difference reported here is not significant at the 5 percent level (P < .05), it is significant at the 10 percent level (P < .10). Such a difference is often called marginally significant. When a difference is significant, it is appropriate to interpret the difference as a real one rather than one that was likely to have occurred by chance. Once found to be significant, the importance of the difference must be evaluated. The decision process about the importance of a significant difference is largely subjective.
A 2 percent difference is small in comparison to other differences that have been found between patient subgroups, but it still represents a substantial number of individuals and consequently might be judged to be important. For example, a treatment change that lengthens life by 3 months for 4,000 patients yields a numerical benefit of 1,000 person-years of extra life. In contrast, special therapy that extends the life of a single person for 10 years is of special significance to the individual involved, but the numerical benefit is 10 person-years of life.
A statistically significant result is not always an important one if it is small. However, a small difference can be important if it affects a large number of individuals.
Analysis of Provider Versus Patient
Suppose that the annual death rate is 15 percent at institution A and 25 percent at institution B, that the difference is significant (P < .001), and that the patient characteristics are similar at the two institutions. Institution A reuses dialyzers while institution B does not. Does this prove that all institutions should reuse dialyzers?
The statistical significance reported from the analysis was based on the sample size of patients at each institution. Thus, the difference in mortality rates for the two groups of patients is unlikely to have occurred by chance. That is, there is a true difference in mortality between the two institutions. The appropriate generalization is to patients at the two institutions. However, we cannot generalize reliably to other institutions because the sample size of institutions is only two. When the P-value is calculated on the basis of an analysis of the institutional data, with a sample size of 2, the P-value is 0.50, which is not significant. Differences between institutions A and B other than in dialysis reuse may be responsible for the reported difference in death rates (Donner and Donald, 1987).
The statistical significance of an analysis based on patients should not be used to make conclusions about the population of providers.
Choice of Parameter for Mortality Summaries
If one study reports a 50 percent increase in death rates for one group relative to another group, whereas another study reports that the fractions surviving at 12 months differ by only 8.9 percent, which is right?
The results might be entirely consistent with each other. Monthly death rates of 2 percent and 3 percent will lead to surviving fractions of 78.7 percent and 69.8 percent, respectively, at 12 months. A death rate of 3 percent is accurately described as being 50 percent higher than a death rate of 2 percent. The difference in surviving fractions is accurately summarized as 8.9 percent. Note that the fractions dead in this example would be 21.3 and 30.2 percent, respectively, corresponding to a nearly 50 percent increase in the fraction dead.
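The figures in this example can be reproduced with a short calculation. The sketch below assumes a constant death rate over the 12 months, so that the surviving fraction is exp(-rate × time); the function and variable names are illustrative and do not come from the original text.

```python
import math


def surviving_fraction(rate_per_month, months):
    """Fraction alive after `months`, assuming a constant death rate."""
    return math.exp(-rate_per_month * months)


low = surviving_fraction(0.02, 12)   # monthly death rate of 2 percent
high = surviving_fraction(0.03, 12)  # monthly death rate of 3 percent

rate_ratio = 0.03 / 0.02             # comparison by ratio of death rates
survival_difference = low - high     # comparison by subtraction
dead_ratio = (1 - high) / (1 - low)  # ratio of the fractions dead

print(f"surviving at 12 months: {low:.1%} and {high:.1%}")
print(f"death rate ratio: {rate_ratio:.2f}")
print(f"difference in surviving fractions: {survival_difference:.1%}")
print(f"ratio of fractions dead: {dead_ratio:.2f}")
```

The same pair of death rates therefore supports both reports: a ratio of 1.5 between the rates and a difference of roughly 9 percentage points between the surviving fractions.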
Part of the apparent discrepancy between the original reports from the studies was due to the fact that one comparison was made using a ratio whereas the other comparison was made using subtraction. Another, less important, cause of the apparent discrepancy can be illustrated by the analogies of death rates to compound interest rates and of death proportions to simple interest rates. The proportion of individuals, P, surviving through an interval of time is related to the death rate, R, per unit interval by the equation P = exp(-R). For example, if the death rate per year is 20 percent, then the fraction that survives through the year is 81.9 percent [= exp(-0.2)]. The fraction that dies during the year is 18.1 percent, slightly less than 20 percent.
The death rate, fraction dead, and fraction surviving are all different ways to summarize mortality experience. Each is useful for particular purposes. Proper interpretation of results requires an understanding of the meaning of each.
Type I and Type II Error Issues
A clinical trial based on 120 patients found no significant difference between two treatment therapies (P > 0.05). Does this prove that the therapies are equally effective?
The only conclusion that should be reached from the insignificant P-value is that the difference between treatment groups seen in the clinical study could have arisen by chance. There may be a substantial difference between the therapies, but there were not enough data in the trial to document the difference. The results of a comparative study should include, in addition to P-values, confidence intervals for the values of important outcome parameters. The confidence interval helps in the evaluation of the potential importance of the difference between two groups whereas the P-value tells whether the difference could have arisen by chance.
A nonsignificant difference should not be interpreted as a definitive result by itself. Confidence intervals for the size of the difference are more useful for interpretation. If the confidence interval includes large differences, then the true difference might also be large. If the confidence interval includes only small differences, then the true difference is likely to be small.
Projections and Extrapolations
The one-year surviving fraction among cadaveric transplant recipients has been increasing every year since 1977 (USRDS, 1989, p. E.19). Can this trend be extrapolated to give an estimate for 1990?
Extrapolations and projections of trends are very susceptible to bias because circumstances can change over time. Projections work well if the nature of the process that is being predicted does not change over time.
The usual standard errors reported in a statistical analysis reflect only the uncertainty due to random fluctuations and measurement error, not the uncertainty due to bias or to change in the nature of the problem. Thus, the trend can be extrapolated but there is no way to evaluate the accuracy of the projection.
Extrapolations and projections of trends are very susceptible to error.
Accuracy of Counts
The counts of incident ESRD patients reported by the USRDS for 1987 (USRDS, 1989, pp. A.1, A.11) do not agree. How can any data analysis from the Medicare data system be trusted?
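Returning to the 120-patient trial discussed under Type I and Type II Error Issues above, the role of the confidence interval can be sketched as follows. The split of 60 patients per arm and the death counts of 30 and 24 are hypothetical, chosen only to illustrate the point, and the interval uses a simple normal approximation for a difference in proportions; the function name is an illustrative choice.

```python
import math


def difference_confidence_interval(deaths_a, n_a, deaths_b, n_b, z=1.96):
    """Normal-approximation 95 percent CI for a difference in death proportions."""
    p_a = deaths_a / n_a
    p_b = deaths_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_a - p_b
    return diff - z * se, diff + z * se


# HYPOTHETICAL trial: 120 patients split evenly, 30 versus 24 deaths.
lower, upper = difference_confidence_interval(30, 60, 24, 60)
print(f"difference = {30 / 60 - 24 / 60:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Under these assumed counts the interval contains zero, so the difference is not significant, but it also contains differences larger than 25 percentage points, so the trial cannot rule out a substantial effect.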
OCR for page 390
Fully Parametric Models
Certain specific parametric models have proved useful for answering specific research questions about survival patterns. The major limitation of such models is that they start with the assumption that a particular type of equation is correct for the population being studied. If the assumption is (approximately) correct, then the conclusions based on the model are (approximately) accurate. However, if the assumed equation is incorrectly specified, then the conclusions based on the equation can be inaccurate. Parametric models can sometimes lead to useful qualitative conclusions, even when the quantitative results are inaccurate because the model is incorrectly specified.
Parametric statistical models are based on the choice of a particular type of formula or equation that might plausibly approximate the survival pattern in a population. Such equations are often specified by the numerical values of a few parameters, or coefficients. Once the values of the parameters are known, the equation can be used to compute the value of any other characteristic that can be defined in terms of the equation.
Fully parametric models are based on certain assumptions about mortality patterns which may or may not be true for the ESRD population; thus, the resulting analyses may or may not be appropriate. Nonparametric or semiparametric models are based on fewer assumptions than are fully parametric models, and the results of nonparametric analyses are correspondingly less likely to be biased or incorrect. However, if an appropriate model is used, the results of parametric analyses tend to be more precise than the results of nonparametric analyses.
Exponential Model
The exponential model is based on the assumption that the death rate for a group of ESRD patients does not change with time.
If this assumption is correct, then the survival curve for the patients is an exponential function of time and the curve is specified by one parameter: the death rate per unit of time. On the basis of the value of this death rate, other values can be computed, including the median lifetime, the expected lifetime, the probability of surviving for 5 years, and so on. However, the exponential model is known to be a poor approximation of the long-term survival pattern for people because death rates rise with age. Further, the death rate among ESRD patients tends to decrease with the number of years since first ESRD therapy. Thus, average lifetimes that are calculated on the basis of an assumed exponential distribution are likely to be inaccurate if the calculated lifetime spans a large age or time range.

Weibull Model

This model yields a better approximation of mortality patterns among ESRD patients than does the exponential model. The Weibull model has two parameters that must be estimated from the data, in addition to regression coefficients that relate mortality rates to patient characteristics. These two parameters allow the Weibull model to fit a variety of patterns of mortality. The model yields useful and interpretable summaries of death rates, survival functions, relative death rates, and so on. Estimates of the model can be derived easily using the SAS statistical package. It is not as easy to use the Weibull model with time-varying patient characteristics as it is to use the Cox or Poisson regression models, because the standard statistical packages have not been extended to allow time-dependent covariates with the Weibull model.

One danger in the use of this model, or any other parametric model, is that the model can be estimated on the basis of a short period of patient follow-up and then extrapolated to yield estimates of long-term survival. There is no way to check the assumptions that the model makes for long-term survival on the basis of short-term data, and consequently there is no way to be assured that the long-term extrapolations are correct. The more nonparametric Poisson and Cox regression models naturally limit their predictions to the intervals of time for which data are available and thus are less subject to the abuse of extrapolation.

Bailey-Makeham Model

The Bailey-Makeham parametric model has proved to be very useful for qualitatively distinguishing between predictors of long- and short-term survival. If the model can be shown to yield a good approximation to the survival distribution for ESRD patients, then it may prove to be a particularly important analytical tool. The Bailey-Makeham model is less widely implemented on computers than are the Weibull, Poisson, and Cox models. Although the Bailey-Makeham model can be used to answer the same variety of analytic questions as can other models, it is especially attractive for its ability to quantify the differential effect of patient characteristics on long- and short-term patient survival.
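As an illustration of how the exponential and Weibull models described above turn a few parameters into survival summaries, the following minimal sketch may help. The death rate of 0.20 per patient-year, the function names, and the scale/shape parameterization of the Weibull model are assumptions made for the example; they are not estimates from ESRD data.

```python
import math

def exponential_summary(rate, horizon=5.0):
    """Summaries implied by a constant death rate (deaths per patient-year)."""
    return {
        "median_lifetime": math.log(2) / rate,        # time at which S(t) = 0.5
        "expected_lifetime": 1.0 / rate,              # mean of the exponential distribution
        "survival_at_horizon": math.exp(-rate * horizon),
    }

def weibull_survival(t, scale, shape):
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))

def weibull_hazard(t, scale, shape):
    """Weibull death rate at time t; it falls over time when shape < 1."""
    return (shape / scale) * (t / scale) ** (shape - 1)

# Hypothetical death rate of 0.20 per patient-year (illustrative only).
summary = exponential_summary(rate=0.20)
```

With a shape parameter of 1 the Weibull model reduces to the exponential model, while a shape below 1 gives a death rate that declines with time since first therapy, the pattern the text describes for ESRD patients.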
Prevalent Versus Incident Cohort Analyses

In addition to the selection of an appropriate statistical methodology, it is crucial to select the appropriate group of patients to be included in the analysis. The selection depends strongly upon the objectives of the analysis. Of particular relevance is the distinction between prevalent and incident cohorts of patients. A prevalent cohort includes all patients treated during a specific year, including those whose therapy started prior to that year. An incident cohort includes only patients whose therapy started during a specific year. Many of the analyses performed to date have been limited to one or the other type of study group. Analysis of successive prevalent cohorts is most relevant to detecting trends in mortality over calendar time. Analysis of incident cohorts is more appropriate if the objective is to characterize how mortality changes with the time since first ESRD therapy.
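The distinction between the two kinds of cohort can be made concrete with a short sketch. The record layout (`Patient`, `start_year`, `end_year`) and the three example patients are hypothetical and are not drawn from the Medicare data system.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Patient:
    patient_id: str
    start_year: int            # year of first ESRD therapy
    end_year: Optional[int]    # year of death or end of follow-up; None if still treated

def incident_cohort(patients: List[Patient], year: int) -> List[Patient]:
    """Patients whose therapy started during the given year."""
    return [p for p in patients if p.start_year == year]

def prevalent_cohort(patients: List[Patient], year: int) -> List[Patient]:
    """Patients treated at any time during the given year, new or continuing."""
    return [
        p for p in patients
        if p.start_year <= year and (p.end_year is None or p.end_year >= year)
    ]

# Hypothetical records for illustration only.
patients = [
    Patient("a", 1985, 1987),
    Patient("b", 1987, None),
    Patient("c", 1986, 1986),
]
incident_1987 = [p.patient_id for p in incident_cohort(patients, 1987)]
prevalent_1987 = [p.patient_id for p in prevalent_cohort(patients, 1987)]
```

In this example patient "a" belongs to the prevalent cohorts of 1985, 1986, and 1987 but only to the incident cohort of 1985.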
For the purposes of summarizing patient survival after ESRD therapy starts, it is appropriate to classify patients by the year in which their ESRD therapy started. Such classification accounts for the changes in acceptance patterns that might occur over time. By definition, the baseline characteristics of a patient accepted into the Medicare program for ESRD do not change subsequent to acceptance. In such a classification, each patient is in just one cohort. The USRDS Annual Data Report (1990) has reported such summaries.

If the objective of analysis is to evaluate changes in therapy that occur with calendar year, possibly in association with changing technology or with program administration, then it is more useful to classify patient follow-up according to the year in which the therapy occurs. A patient contributes information on death rates in the prevalent cohort during each of the years that the patient is treated. Further, a patient is potentially in each of several prevalent cohorts in such a classification. Analysis can be performed either with Poisson regression models or with a Cox model using time-dependent covariates or strata. The analysis is conceptually similar to a series of annual analyses of 1-year survival for all prevalent (new and continuing) ESRD patients in each year. Eggers (various years) has reported such series in some of the HCFA reports.

In order to evaluate the simultaneous effects of both year of incidence and year of treatment, the statistical model used must incorporate both time measures. Tabulation of death rates according to patient characteristic and according to year of incidence and year of treatment would not be useful because the number of cells to be examined would be too large and the data would be too sparse.
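Classifying follow-up by the calendar year in which therapy occurs amounts to splitting each patient's follow-up into yearly pieces and dividing deaths by patient-years within each year. The sketch below shows one way this might be done. The tuple layout, the use of decimal calendar years, and the three records are assumptions for the example, and the resulting crude rates are only the raw input that a Poisson regression would model, not the regression itself.

```python
from collections import defaultdict

def yearly_death_rates(records):
    """Crude death rate per patient-year for each calendar year.

    records: iterable of (start, end, died) where start and end are decimal
    calendar years (hypothetical layout) and died marks death at `end`.
    """
    person_years = defaultdict(float)
    deaths = defaultdict(int)
    for start, end, died in records:
        year = int(start)
        while year < end:
            # Portion of this patient's follow-up that falls in `year`.
            person_years[year] += min(end, year + 1) - max(start, year)
            year += 1
        if died and end > start:
            # A death exactly on a year boundary is assigned to the prior year.
            death_year = int(end) if end != int(end) else int(end) - 1
            deaths[death_year] += 1
    return {y: deaths[y] / person_years[y] for y in sorted(person_years)}

# Hypothetical follow-up records for illustration only.
records = [
    (1986.5, 1988.25, True),
    (1987.0, 1989.0, False),
    (1987.5, 1987.75, True),
]
rates = yearly_death_rates(records)
```

Here the first patient contributes follow-up to 1986, 1987, and 1988, which is the sense in which one patient can appear in several prevalent cohorts.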
Frailty

All of the statistical methods discussed above can account for measured patient characteristics and can summarize the relationship between patient characteristics and mortality. However, none of the models can directly account for patient characteristics that are not measured. There are several unmeasured patient characteristics that are related to mortality, and patients who are at higher risk because of these unmeasured characteristics will tend to die sooner than patients who are at lower risk. The ensemble of unmeasured patient characteristics has been given the name frailty in the statistical literature (Vaupel et al., 1985), and frailty is known to affect the estimation of death rates. This selection process tends to lead to apparently lower death rates as time goes on because the less frail individuals are those who survive. In counterbalance to the unmeasurable effect of decreasing frailty are the measurable effects of increasing age as time goes on. Several statistical methods are currently being developed to account for frailty.
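The selection effect described above can be illustrated with a deliberately simple two-group mixture. The sketch assumes a population in which half the patients are "frail" with a constant death rate of 0.40 per year and half are "robust" with a constant rate of 0.10 per year; these values and the function name are hypothetical. Although neither group's death rate changes, the death rate observed for the whole population declines as the frail patients die off.

```python
import math

def population_hazard(t, frac_frail, hazard_frail, hazard_robust):
    """Death rate at time t among survivors of a two-group mixture."""
    # Share of the original population still alive in each group at time t.
    alive_frail = frac_frail * math.exp(-hazard_frail * t)
    alive_robust = (1.0 - frac_frail) * math.exp(-hazard_robust * t)
    # Average of the two constant rates, weighted by who is still alive.
    return (hazard_frail * alive_frail + hazard_robust * alive_robust) / (
        alive_frail + alive_robust
    )

# Observed death rate at years 0 through 10 (illustrative values only).
hazards = [population_hazard(t, 0.5, 0.40, 0.10) for t in range(11)]
```

The observed rate starts at the simple average of the two group rates and drifts toward the rate of the robust group, without any change in the risk faced by an individual patient.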
Treatment Modality

One important objective of survival analysis in the ESRD program is the evaluation of treatment modalities for ESRD. Such analysis is complicated because clinical trials are not commonly used for the evaluation of therapies. Instead, therapies are selected through a highly subjective set of decisions that involve both the patient and the provider of care. Because of this process, it is likely that different therapies have different profiles of patients assigned to them. Thus, any differences noted in patient outcomes could be due either to the different therapies or to differences in patient characteristics.

In order to reach more definitive conclusions, information is needed about the condition of each patient at the time of each therapy change. Using patient condition data collected at the start of each therapy, statistical methods could be used to yield adjusted measures of patient outcomes for a specific therapy. Further, the patient condition at the start of a therapy change could act as a measure of patient outcome for the previous therapy.

It is impractical and unnecessary to collect such detailed data for a census of the ESRD patient population. Instead, statistically valid samples could be drawn from the population of ESRD patients. Such samples could be drawn either prospectively, from newly incident cases, or retrospectively, from patients who have already received ESRD therapy. Prospective samples would be necessary if data collection were to include measures that are not readily available in the medical records. Retrospective samples could be drawn if data collection were to be limited to information that was readily available in existing records.

Publication of Standard Death Rates

The USRDS has started to publish mortality rates that can be used for small data base research.
Death rates among prevalent ESRD patients in 1988 have been calculated for each major age-race-disease group classification (USRDS, 1989, Table D.31). These national rates can be used to compute the expected number of deaths for any study group. The ratio of the observed to the expected number of deaths can then be used to evaluate the mortality rates for the study group. Methods for such calculations are reviewed in detail by Breslow and Day (1982, 1987).

The rates published in the USRDS Annual Data Report are currently limited to the prevalent cohort of ESRD patients at the start of 1988. If continued for successive years, these rates will be useful for comparing death rates in small study groups to the expected rates based on national data. In addition to the patient characteristics of age, race, and disease group given in the published rates, it would be of value to extend the list of patient characteristics so that more precise comparisons could be made. Other patient characteristics could include gender, comorbidity, treatment, year of current prescription, and year of first prescription.

Institutional Characteristics

Analyses of patient-specific characteristics should be distinguished from analyses of facility-specific characteristics. The effective sample size for patient-specific analyses is related to the number of patients whereas the sample size for facility-specific analyses is the number of facilities studied. These two different types of analyses typically are addressed with different methods. Analysis of facility-specific outcomes should be based on a single observation per facility, as discussed by Cornfield (1978).

The type of institution or treatment protocol at an institution may affect mortality rates. In order to study such relationships, the institution rather than the patient is the unit of analysis (assuming that all patients at an institution receive the same treatment). Factors that could be or could have been studied on the basis of institutional analyses are profit-nonprofit status, dialyzer reuse, length of dialysis, and transplant technique.

Internal and External Standardization

The statistical analyses described here are based on the concept of comparison. For example, the death rates for two groups of patients can be compared. The mortality rate for one group of patients can be compared to that from another group in the same study (internal comparison) or to published mortality rates for another population (external comparison). Death rates published by the USRDS could serve as an external standard of comparison for a series of patients from a small study. In a larger study, there may be sufficient numbers of patients in several patient subgroups that their mortality rates can be usefully compared to each other.
Generally, internal comparisons are more valid than external comparisons, because bias is less likely to be a problem. However, external comparisons can provide indications of trends that may be useful for qualitative comparisons. Analyses that involve an external comparison or standard may prove useful in understanding the effect of ESRD on mortality rates. ESRD patients with an etiology of hypertension could be compared to the general population with hypertension. Similar comparisons could be made for diabetes. For example, ESRD patients with an etiology of AIDS may have much higher mortality rates than do patients with other etiologies. However, therapy for ESRD may prove to be just as useful in extending the lifetime of ESRD AIDS patients relative to expected lifetimes among non-ESRD AIDS patients, as it is for diabetic ESRD patients relative to non-ESRD diabetic patients.
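An external comparison against published national death rates, of the kind described above, reduces to an observed-to-expected ratio. The following sketch shows the arithmetic. The group labels, reference rates, and patient-year totals are invented for the example and are not values from the USRDS tables.

```python
def expected_deaths(person_years_by_group, reference_rates):
    """Deaths expected if the study group died at the reference rates."""
    return sum(
        person_years * reference_rates[group]
        for group, person_years in person_years_by_group.items()
    )

def observed_to_expected_ratio(observed, person_years_by_group, reference_rates):
    """Ratio of observed deaths to the number expected from reference rates."""
    return observed / expected_deaths(person_years_by_group, reference_rates)

# Hypothetical reference death rates per patient-year, by (age, race, disease).
reference_rates = {
    ("45-64", "white", "diabetes"): 0.25,
    ("65+", "white", "diabetes"): 0.40,
}
# Hypothetical follow-up in a small study group, in patient-years.
study_person_years = {
    ("45-64", "white", "diabetes"): 40.0,
    ("65+", "white", "diabetes"): 20.0,
}
expected = expected_deaths(study_person_years, reference_rates)
ratio = observed_to_expected_ratio(27, study_person_years, reference_rates)
```

A ratio above 1 indicates more deaths in the study group than the reference rates would predict for patients with the same measured characteristics. It says nothing about unmeasured differences between the study group and the reference population.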
Some of the specific models described previously allow internal as well as external comparisons of mortality rates to be made simultaneously. International comparisons of mortality among ESRD patients are a form of external comparison in which data from very disparate sources are evaluated. Some of the uses and limitations of such comparisons are discussed in the next section.

INTERNATIONAL COMPARISONS

International comparisons of mortality rates have two major objectives. The first is to document the existence of differences in mortality rates, if they exist. The second is to identify the reasons for the differences, if they exist. Using the currently available data, it is difficult to arrive at a definitive answer to the first objective and it is impossible to arrive at an answer to the second. The current data can give indications, but not proof, of differences in mortality rates for otherwise similar patients from different nations. Expert opinion can then be sought regarding hypotheses about causes of any differences that are thought to exist. With the current system of separate registries, international comparisons can serve, at best, to point out somewhat crude differences in mortality patterns. Since only rough adjustments are possible across registries, there is no feasible way to isolate the reasons for any observed differences among nations.

The most recent and comprehensive international comparisons of mortality rates were reported by Held et al. (1990). Many of the comments below are specifically motivated by the Held report but are also relevant to the interpretation of any international comparisons. Most of the limitations of international comparisons described below were recognized and acknowledged in the Held report but are reviewed here in more detail.
The results in the Held report are intriguing and give some indication that ESRD mortality rates are substantially lower in other nations than they are in the United States. Specific hypotheses generated by the Held comparison should be evaluated in more detail in order to determine whether a cause-and-effect explanation can be found for the differences that were found in that study. In addition, mortality rates in the United States should be closely monitored for trends over time.

Limitations

Many of the issues relevant to the use of survival analysis techniques for the analysis of U.S. data are also relevant to the international comparison of mortality rates. However, the problems are compounded in international comparisons because the data bases often have not been analyzed in a consistent way, the data often have not been collected in a consistent way, and there are almost certainly differences in the characteristics of patients from different nations that are not measured in the data bases. Adjustment for confounding factors is more problematic with international comparisons than it is with national analyses because the patient-specific data are not available in a unified structure. In order to use statistical analysis of international data to determine the reasons for differences in mortality rates, it will be necessary to measure the potential causes of any differences at the national level and to correlate those measures with adjusted mortality rates across the nations.

The level of mortality observed in a national registry is strongly influenced by the criteria for acceptance into the registry. Different acceptance criteria, whether part of stated policy or influenced by the individuals who implement the policies, can have dramatic effects on mortality rates. If only healthy patients are accepted into a treatment program, then mortality rates will tend to be lower than if patients with high levels of comorbidity are accepted into the program. Different rates of diagnosis and treatment of ESRD among various nations give some indication that acceptance criteria differ among nations, although the direction of the bias, if any, that such differences would cause is unknown.

After patient age, the most important patient characteristic for predicting patient mortality may well be comorbidity, which is not recorded at the national level in the United States except for primary diagnosis. Since comorbidity is not currently recorded in the registries, it cannot currently be determined whether differences in patient morbidity are a likely cause of differences in patient mortality.

There are known differences between the types of patient accepted into the U.S. and other ESRD treatment programs. The importance of such differences is documented by the experience in the U.S. alone.
The acceptance rate into the ESRD program in the U.S. has increased dramatically over recent years, with the result that the treated ESRD population is substantially older and has many more diabetic patients than it did previously. This has led to an increase in the crude mortality rate in the United States since 1977 (USRDS, 1990, Tables E.10, E.12, and E.14). However, death rates adjusted for age, race, sex, and primary diagnosis have been relatively stable during the same period (USRDS, 1990, Tables E.53, E.55, and E.57). Post hoc adjustments to international comparisons of mortality rates can also be made for known patient characteristics, such as age and etiology, but such adjustments are likely to be less accurate than would be a unified analysis of the combined data from several nations.

Etiology

The adjustments made by Held et al. (1990) to international comparisons have partially accounted for national differences in age and frequency of diabetes in the ESRD population. A more complete adjustment for diabetes would involve information about both the type of diabetes and the respective mortality rates among the non-ESRD diabetic patients in the nations being compared. The frequency of each type of diabetes is not accounted for in the current adjustments and may differ across national boundaries. Differences in the management of diabetes may lead to different mortality rates among diabetics from different nations, even if they do not have ESRD. Other aspects of etiology may also be important. For example, in the United States, death rates are elevated relative to glomerulonephritis if the etiology is hypertension. Since cardiovascular disease is less prevalent in Japan than in the United States, it may also be important to adjust for hypertension.

It is instructive to consider the difference between mortality rates of black patients and white patients as an example of the amount of variability that has been seen in the United States. The 5-year survival probabilities, adjusted for age, gender, and primary diagnosis, for black ESRD dialysis patients and white ESRD dialysis patients incident in 1984 are 36.5 and 30.5 percent, respectively (USRDS, 1990, p. E.73). This unexplained difference of about 6 percentage points in survival probabilities is smaller than some of those reported by Held et al. (1990) for international comparisons, but it indicates that substantial differences can exist between identifiable groups, even within the same data collection system and nation. Other recent analyses (Wolfe et al., 1990) have shown that the difference between the mortality rates of blacks and whites is most substantial for diabetic patients and hypertensive patients, indicating that the impact of these two etiologies can vary substantially across different groups of patients.
The existence of substantial differences in mortality between two groups of patients in the United States makes it clear that large differences in mortality rates among nations can be expected.

Age

The adjustment made by Held et al. (1990) for age is in 10-year (Europe) or 15-year (Japan) age groups. These age intervals are wide, given that the 5-year survival probability decreases dramatically with age after age 20 (USRDS, 1990, p. E.14). Differences of just a few years in the average age of patients in corresponding age groups would cause a substantial difference in the mortality rates for the groups. If patients in one nation are older overall, then they will tend to be older in each age category as well. For example, even if the age-specific death rates were identical in two nations, but the average ages in corresponding age groups were 5 years higher in one nation than in the other, then the two nations would have age-adjusted survival proportions that differed by approximately 5 percentage points. (The 5-year survival fraction decreases by approximately 1 percentage point per year of age; see USRDS, 1990, p. E.14.) Age differences between patients in the different nations are thus plausibly responsible for at least a portion of the difference in mortality noted by Held et al. (1990).

In addition to factors that are measured across national registries, there are likely to be substantial differences between patients with respect to other characteristics, including past medical history, comorbidity, distance from a treatment center, and level of kidney function at first treatment. These factors cannot be adjusted for with the current data; they would require special studies, but it would be difficult to evaluate the potential impact of such unmeasured factors. However, evaluation of geographic differences in mortality in the United States would give an indication of the amount of variability present nationally that could be compared to the differences seen internationally.

Withdrawal Rates

The rate of withdrawal from therapy among dialysis patients in the United States is not negligible. Port and colleagues (1989) have reported that up to 10 percent of deaths among elderly patients in Michigan follow soon after withdrawal from therapy. Furthermore, at least 8.6 percent of all ESRD deaths in the United States in 1987 can be attributed to withdrawal from therapy (P. Eggers, HCFA, personal communication, 1990). There are large differences in withdrawal rates between groups in the United States, and it is plausible that large cross-national differences in withdrawal rates might also exist. Withdrawal may be a particularly relevant issue in international comparisons because the largest international differences in mortality were reported by Held et al. (1990) in the nonpediatric age groups, the same age range in which withdrawal is common in the United States.

Patient Follow-up

Ascertainment of mortality status by the ESRD data system is largely complete because of the computer links to the Social Security System.
Although patients with long-lived transplants may be temporarily lost to the Medicare data collection system, their deaths are recorded when they occur so that overall mortality rates can be accurately estimated. It would be useful to have information from other nations concerning the fraction of the ESRD population that is followed to eventual mortality.

Directions for Further Research

Although the international comparisons in death rates that are reported by Held et al. (1990) indicate that mortality may be higher in the United States than in some other peer nations, such differences are plausibly attributable to different data collection methods, differences in patient comorbidity and health practices, and differences in patient compliance. However, although international comparisons are not definitive, they still indicate that differences exist, for some unknown reasons. There are several areas of research that could be profitably explored to better our understanding of international comparisons:

The impact of differential death rates in the general population has been partially addressed by Held et al. (1990). Further study of differential mortality rates in the populations of diabetics and hypertensives from various nations may also be useful.

The fact that the differences in death rates among nations are largest in the nonpediatric age groups helps give some focus to the search for the reasons for such differences. Further identification of subgroups with differential death rates may help clarify the reasons for differences. Comparison of multivariable models from different nations would be an efficient method for such studies, and cause of death could be a useful measure.

More international communication on the methods of managing data registries could prove useful for all nations that attempt to maintain ESRD data registries. For example, methods for validation of data registries could be standardized.

Careful evaluation of different patterns of treatment methods among nations would be useful. Currently, much of the data on treatment patterns are derived from expert opinion rather than through data collection.

Differences in patient compliance should be studied in order to determine the effect of withdrawal rates.

REFERENCES

Becker RA, et al. 1988. The New S Language, Wadsworth, Pacific Grove, CA.
Box GEP, Jenkins GM. 1976. Time Series Analysis, Holden-Day, Oakland, CA.
Breslow NE, Day NE. 1982 and 1987. Statistical Methods in Cancer Research, Vol I and II, Oxford University Press, Oxford.
Campbell DT, Stanley JC. 1963. Experimental and Quasi-Experimental Designs for Research, Rand McNally College Publishing Co., Chicago.
Cornfield J. 1978. Randomization by group: A formal analysis. Am J Epidemiology 108:100–102.
Cox DR. 1972. Regression models and life tables, JRSSB 34:187–220.
Cox DR, Oakes D. 1984. Analysis of Survival Data, Chapman and Hall, London.
Dixon WJ, et al. 1985. BMDP Statistical Software, University of California Press, Berkeley.
Donner A, Donald A. 1987. Analysis of data arising from a stratified design with the cluster as unit of randomization, Statist Med 6:43–52.
Eggers P. (HCFA). 1990. Personal communication concerning withdrawal rates.
Health Care Financing Research Report (P. Eggers) End Stage Renal Disease HCFA, various years.
Held PJ, et al. 1990. Five-year survival for end stage renal disease patients in the U.S., Europe, and Japan, Am J Kid Dis 15:451–457.
Kalbfleisch JD, Prentice RL. 1980. The Statistical Analysis of Failure Time Data, New York, Wiley.
Lawless JF. 1982. Statistical Models and Methods for Lifetime Data, New York, Wiley.
Payne CD. 1987. The GLIM System Release 3.77, Royal Statistical Society, NAG, Downers Grove, IL.
Port FK, Wolfe RA, Hawthorne VM, Ferguson CW. 1989. Discontinuation of dialysis therapy as a cause of death, Am J Nephrol 9:145–149.
SAS Institute. 1988. SAS/STAT User's Guide, Release 6.03, SAS Institute Inc, Cary, NC.
USRDS (U.S. Renal Data System). 1989. Annual Data Report. National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD.
USRDS. 1990. Annual Data Report. National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD.
Vaupel JW, et al. 1985. Heterogeneity's ruses: Some surprising effects of selection on population dynamics, Am Statist 39:176–185.
Wolfe RA, Port FK, Hawthorne VM, Guire KE. 1990. A comparison of survival among dialytic therapies of choice: In-center hemodialysis versus continuous ambulatory peritoneal dialysis at home. Am J Kidney Dis 15:433–440.