A
Summary of Case Studies
PURPOSE AND SCOPE
As part of the charge from its sponsors, the Institute of Medicine (IOM) Committee to Evaluate Measures of Health Benefits for Environmental, Health, and Safety Regulation was asked to conduct case studies that applied data from completed economic analyses to assess the impacts using different measures of effectiveness. The Committee chose to conduct three case studies that reflect the data and analytic approaches applied by different regulatory agencies as well as the diverse health impacts addressed. This appendix summarizes the case studies, which are described in more detail in three separate reports (Robinson et al., 2005a,b,c). The implications of these case studies for our deliberations are discussed in the main text of this report; some of the key conclusions are also summarized at the end of this appendix.
The case studies were a learning exercise for the Committee. They allowed us to examine in detail the data and methods currently applied by federal agencies when estimating the value of health and safety benefits. These case studies also permitted us to apply alternative quality-adjusted life-year (QALY) methods in the context of regulatory analysis and to examine the outcomes. Because the case studies were completed with limited resources and largely in advance of the Committee’s deliberations, the case studies do not reflect in every respect the best practices ultimately recommended by the Committee, nor were they designed to replicate the complexity of a full regulatory analysis.1 They do, however, provide a starting point for researchers interested in conducting more sophisticated versions of these types of analyses.
The Committee identified candidates for these case studies as part of a review of all major federal health and safety regulations finalized in recent years (Robinson, 2004). This review focused on those economically significant regulations that were supported by quantitative assessment of both costs and health or safety-related impacts, that is, the types of rules for which new Office of Management and Budget (OMB) guidance (2003a) now requires cost-effectiveness analysis (CEA) in addition to benefit–cost analysis (BCA). Based on this review and discussions with agency staff, we determined that the three rules listed below appeared to best illustrate the range of types of regulations, current practices, and health and safety impacts most likely to be significantly affected by the Committee’s recommendations.
- The Food and Drug Administration’s (FDA’s) January 2001 juice processing rule: This food safety regulation provides an example of FDA’s use of monetized QALYs to value the impacts of acute and chronic illness in BCA. The health outcomes considered include acute gastrointestinal effects associated with exposure to four foodborne pathogens as well as chronic conditions stemming from these infections. Few cases of mortality were associated with these pathogens.
- The National Highway Traffic Safety Administration’s (NHTSA’s) March 1999 child restraint rule: Because more recent rules were undergoing revision, we chose a somewhat older rule for the NHTSA case study. However, the data sources and analytic approach are similar to those currently used by NHTSA. NHTSA’s approach to CEA involves converting nonfatal injuries to “equivalent lives saved” (ELS) based on the ratio of their costs to the value of a fatality; these costs include both expenditures and monetized QALY impacts. (See Chapter 2 and Box 2-4 for further detail on the ELS approach.) The health effects addressed by this rule include a variety of fatal and nonfatal crash-related injuries to children.
- The U.S. Environmental Protection Agency’s (EPA’s) June 2004 nonroad diesel rule: Air pollution regulations account for a substantial proportion of all major health and safety regulations finalized in recent years; this was the most recent of these rules. In its BCA, EPA used estimates of willingness to pay (WTP) to value benefits, supplemented by cost-of-illness estimates when suitable WTP values were not available. This case study provided an example of a rule that had several health-related impacts that could not be quantified, as well as both quantified and nonquantified nonhealth effects (e.g., on visibility, crop yields, and other ecosystem functions). The key health effects of concern include preventable mortality and a number of acute and chronic cardiovascular and respiratory conditions.

1. One of the most important differences between these case studies and the Committee’s recommendations is the limited information they provide on the range of possible values and associated uncertainties. We rely largely on mean or median estimates to assess QALY impacts, and also do not report uncertainties in each agency’s characterization of the health effects averted by the regulations nor in their estimation of regulatory costs. The case studies also do not include detailed information on the distribution and equity of the impacts. In Chapter 4 of this report, however, we use the case studies to illustrate distributive and other concerns.
The following sections provide an overview of the general analytic approach for these case studies. We then discuss the details of the approaches applied in each case and report our results and conclusions. The final section summarizes the major lessons learned from these analyses.
GENERAL APPROACH
To estimate the QALY impacts of the regulations addressed by the case studies, we followed a three-part process.2
- First, we described each type of injury or illness averted by the rule, based (to the extent possible) on the materials the agency used to support its regulatory analysis.
- Second, we used several different approaches to estimate the impact of each condition on health-related quality of life (HRQL) over the affected individuals’ lifespans. The methods used varied; each case study involved the application of three or four different approaches.
- Third, we determined the QALY losses averted by the regulation. This step involved estimating the change in HRQL attributable to the injury or illness under two scenarios: a base case analysis that assumed that affected individuals would be in average health (adjusted for age) over their remaining life expectancy in the absence of the condition of concern, and a sensitivity analysis that assumed that they would be in perfect or optimal health. For nonfatal effects, we then multiplied the resulting decrement by the expected duration of each illness. For preventable mortality, we estimated the change in life expectancy based on the average age of the affected individuals.
This process is illustrated in Figure A-1.
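The third step can be sketched in Python as follows. This is an illustrative sketch only, not the spreadsheet model used in the case studies: the function names and input values are hypothetical, and weighting the life years lost to a fatality by “without condition” HRQL is an assumption about how the mortality calculation would be expressed in QALYs.

```python
def nonfatal_qaly_loss(hrql_without, hrql_with, duration_years):
    """QALYs lost per nonfatal case: HRQL decrement times duration (undiscounted)."""
    decrement = hrql_without - hrql_with
    return decrement * duration_years


def fatal_qaly_loss(hrql_without_by_year):
    """QALYs lost per death: sum of without-condition HRQL over each life year lost.

    Assumption: lost years are weighted by without-condition HRQL; pass 1.0 for
    every year to count unweighted life years instead.
    """
    return sum(hrql_without_by_year)


# Hypothetical inputs, not values from the case studies.
base_case = nonfatal_qaly_loss(hrql_without=0.9, hrql_with=0.6, duration_years=2.0)
sensitivity = nonfatal_qaly_loss(hrql_without=1.0, hrql_with=0.6, duration_years=2.0)
print(f"base case: {base_case:.2f} QALYs, sensitivity: {sensitivity:.2f} QALYs")
```

The base case call compares to average age-adjusted health and the sensitivity call compares to perfect health (1.0); the same function serves both scenarios.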
In these case studies, we focused on annual impacts for simplicity and comparability, assessing the change in disease or injury incidence attributable to a single year of the regulatory intervention. If the health effect is chronic or long-lived, however, the new cases of injury or illness prevented each year will have longer term impacts. We take these future year impacts into account and assess the lifetime effects of such cases, calculating the results both discounted and undiscounted. (We follow the discounting guidance in OMB, 2003a, as discussed in the main text of this report.) Agencies’ regulatory analyses generally take a longer view and assess the impacts of the rulemakings over a multiyear period. We believe that this multiyear focus is appropriate; although the presentation of annualized impacts can provide useful information, it should be provided only as a supplement to an analysis that considers the implementation of the rule over a longer time horizon.
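As a rough illustration of how the lifetime effects of a chronic case might be reported both undiscounted and discounted, consider the sketch below. The timing convention (no discounting in the first year) and the stream of losses are assumptions made for the example; the 3 and 7 percent rates are the ones the case studies report using under OMB (2003a).

```python
def present_value(annual_losses, rate):
    """Discount a stream of annual QALY losses back to the year of incidence.

    Assumption: the loss in year t (t = 0 for the first year) is divided by
    (1 + rate) ** t, so the first year is not discounted.
    """
    return sum(loss / (1.0 + rate) ** t for t, loss in enumerate(annual_losses))


# Hypothetical chronic case: a decrement of 0.1 in each of 30 remaining years.
stream = [0.1] * 30
for rate in (0.0, 0.03, 0.07):
    print(f"rate {rate:.0%}: {present_value(stream, rate):.2f} QALYs")
```

A rate of zero reproduces the undiscounted total, so one function covers all three presentations.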
Below, we provide an overview of the methods we applied across all three case studies, focusing on the process used to describe the health endpoints and to compare HRQL with and without the condition of concern. In the health care field, “without condition” health (i.e., the health status of an individual in the absence of a particular illness or injury of concern) is often referred to as “baseline” health. We avoid this term because baseline means something different in regulatory analysis; it refers to the situation in the absence of the rule, which is equivalent to “with condition” health status.
Describing Health Endpoints
The first step in the case study analysis involved describing the health endpoints so that they could be valued under alternative HRQL approaches. To increase our understanding of the information typically available to regulatory agencies and for consistency with the agency analyses, we based these descriptions on the same information used by the agency in its risk assessment to the maximum extent possible. Because the original FDA analysis used an HRQL index in its BCA, it supplied most of the information needed for the case study. In contrast, the approach used in the NHTSA rule relied on broad standardized injury classifications that were not adequate for estimating HRQL impacts. Thus we used a different data set to develop descriptions of the injuries averted. For the EPA rule, we relied on a combination of the information provided in the agency’s regulatory analysis and in a separate EPA analysis of the QALY impacts of air pollution-related health effects.
In each case study, we used at least one approach that involved expert assignment of the HRQL attributes for the illnesses or injuries of concern. Developing descriptions for these expert assignments involved several challenges. First, we needed to determine the appropriate level of detail. Our goal was to provide enough information so that medical experts could understand and distinguish between different health endpoints, without overwhelming them with unnecessary information. Our schedule precluded formal pretesting; instead, we consulted informally with individuals with relevant expertise to develop these descriptions.
Second, we wanted to avoid using language in the descriptions that could prejudice the assignment of the attribute levels included in each index (e.g., “little” or “no” difficulty in self-care; “moderate” pain). It was difficult to avoid this language completely, however; in some cases such terminology was part of the description used by the agencies to distinguish between different endpoints. For example, FDA distinguished between different types of long-term reactive arthritis based in part on the degree of pain experienced.
Finally, the agency regulatory assessments of the health endpoints were for predicted risks (or statistical cases) rather than for individual, identifiable patients, and cover time periods over which HRQL impacts may vary. In theory we could have developed longitudinal models that identified distinct phases of each condition, the duration of each phase, and its probability of occurrence. Such models are difficult to develop, however, and require substantially more time and resources than were available. Instead, we encouraged the experts to consider the average or typical patient with each illness or injury and to assess the expected average HRQL impact over the course of the condition. In some cases, we divided the health conditions into different phases. For the child restraints analysis, for example, we asked the experts to estimate the duration of the acute, rehabilitation, and long-term phases and to assign attribute levels separately to each phase. In the air quality case study, we split the cardiovascular disease endpoints into subcategories (based on age at incidence, severity, and disease progression), to better distinguish different health states.
Estimating “With Condition” HRQL
To estimate the HRQL impacts of each health condition averted by these regulations, we relied on several commonly used generic indexes: the EuroQol (EQ)-5D, the Health Utilities Index (HUI) Mark 2 and Mark 3, the SF-6D, and the Quality of Well-Being Scale (QWB).3 In addition, for the NHTSA case study, we applied the Functional Capacity Index (FCI), an instrument under development that is designed specifically to assess the longer term impacts of traumatic injury. Chapter 3 and Appendix B of this report provide detailed information on each of these indexes.
Applying these indexes entails two steps. First, the characteristics of each health condition are matched to (or assigned) attribute levels under each domain of each index. For example, for the EQ-5D, this process involves determining whether the disease or injury leads to “severe,” “moderate,” or “no” impairments within five domains—mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Second, the resulting attribute responses are weighted to reflect the value placed on different levels of impairment. Each generic index relies on a particular scoring algorithm to develop relative values for particular health states; this algorithm is based on statistical analyses of the results of a valuation survey developed especially for the classification system of that index. These valuation surveys are described in Chapter 3; see especially Table 3-4.
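The two steps (assigning attribute levels, then weighting them) can be illustrated with a toy additive index. The weights below are hypothetical and do not correspond to any published scoring algorithm; actual algorithms for the EQ-5D and the other indexes are estimated from valuation surveys and are generally not this simple. Only the five domain names and three levels are taken from the text.

```python
DOMAINS = (
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
)

# HYPOTHETICAL decrement per attribute level (not a published tariff).
HYPOTHETICAL_WEIGHTS = {"no": 0.0, "moderate": 0.05, "severe": 0.15}


def toy_index_value(assignments):
    """Step 2: convert one set of attribute assignments into a summary value."""
    missing = set(DOMAINS) - set(assignments)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    return 1.0 - sum(HYPOTHETICAL_WEIGHTS[assignments[d]] for d in DOMAINS)


# Step 1: a hypothetical assignment for a short-lived gastrointestinal illness.
assignment = {
    "mobility": "no",
    "self_care": "no",
    "usual_activities": "moderate",
    "pain_discomfort": "moderate",
    "anxiety_depression": "no",
}
print(round(toy_index_value(assignment), 2))
```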
In each case study, at least one of the HRQL approaches involved expert assignment of the attributes defined under a particular generic index. Although it is generally preferable to ask patients to complete this step, expert judgment is often used to provide a faster and less costly assessment. For expediency, we followed a simple expert judgment process that was not fully consistent with the best practices described in Chapter 3. For example, we recruited volunteer experts through our informal professional networks based largely on their availability. Consequently, the resulting groups may not represent the full range of subspecialties or types of patients relevant to the assessment.4 A more sophisticated approach could use specific selection criteria to ensure a broad range of relevant expertise and experience as well as geographic stratification, and could involve asking specialty societies for nominations. We also did not work with the experts to ensure that they had a thorough or common understanding of the materials describing the health endpoints, the domain attributes, and the task itself. Nor did we attempt to resolve any inconsistencies either within the responses of an individual expert or across the responses from different experts. We used simple decision rules to fill in any missing data.

3. As discussed in Chapter 3 and shown in Appendix B, the HUI-2 and -3 include some differences in domains, in part because the HUI-2 was originally developed to assess health states among children.
In a few cases, we relied on patient data from the available research literature rather than expert judgment. For the NHTSA study, we used QWB values from a study of trauma patients (Holbrook et al., 1999). For the EPA case study, we used preliminary condition-specific EQ-5D values estimated from the Medical Expenditure Panel Survey (MEPS) (Sullivan et al., 2005). In the EPA case study, we also transferred values from two patient studies selected from the Harvard School of Public Health’s CEA Registry (http://www.hsph.harvard.edu/cearegistry/), based on a review by Brauer and Neumann (2005). The case study approaches are summarized in Table A-1.
Comparing to “Without Condition” HRQL
To represent likely HRQL in the absence of the conditions of concern (i.e., once the regulation has been implemented), we used estimates of average population health broken out by age from major national population health surveys that included the relevant generic index questionnaire. This approach is equivalent to assuming that, in the absence of the hazard addressed by the regulation, affected individuals on average would have the same health status as the average member of the U.S. population in the same age group. In sensitivity analysis, we also compared the “with condition” HRQL estimates to a value of 1.0. This latter comparison is equivalent to assuming that, in the absence of the illness or injury, the affected individuals would be in perfect or optimal health.5
TABLE A-1 Approaches for Determining “With Condition” HRQL

| Rule | Approach | Indexes | Data Source |
|---|---|---|---|
| FDA Juice Processing | Expert assignment | EQ-5D, HUI-3, QWB, SF-6D | Analysis of data provided by medical experts contacted by case study team |
| NHTSA Child Restraints | Expert assignment | EQ-5D, HUI-2 | Analysis of data provided by medical experts contacted by case study team |
| | Trauma patient survey | QWB | Analysis of patient data provided by Troy Holbrook, University of California, San Diego |
| | Expert judgment | FCI | Expert data and weighting formula provided by Ellen MacKenzie, Johns Hopkins University |
| EPA Nonroad Diesel Emissions | Expert assignment | EQ-5D | Analysis of data provided by medical experts contacted by case study team |
| | Population survey (MEPS) | EQ-5D | Preliminary analysis of self-reported HRQL provided by Patrick Sullivan, University of Colorado |
| | Transfer from Harvard Registry studies | EQ-5D, HUI-3 | Analysis of patient data from Oostenbrink et al. (2001) and Torrance et al. (1999) |

These age-adjusted estimates of average population health use the same underlying community-based valuation survey for each index (as discussed in Chapter 3) and were based on unpublished analyses prepared for the Committee’s use in the case studies.6 The estimates were provided by age and gender, and generally broken into 10-year age groups.
These population averages were missing estimates for very young and very old individuals. We assumed that, for ages 0 through 9 years, average health would equal perfect health (a value of 1.0); for ages 10 through 19, average health would be the midpoint between perfect health and the values estimated for ages 20 through 29; and for those older than the reported age ranges, average health would remain constant at the value reported for the eldest age group. This approach means that the HRQL impacts for young children will be the same regardless of whether the comparison is to perfect or average health, since a value of 1.0 is used for “without condition” HRQL in both cases.7

TABLE A-2 Without-Condition HRQL
Table A-2 presents the estimates of average population health used in this analysis for selected ages, for males and females combined. These estimates are provided for illustrative purposes; the case study calculations used the full range of estimates available for each age group.
As is evident from the table, the estimates of average population health vary. This variation reflects several factors, including the differences in (1) the population surveyed to determine their health-related attributes; (2) the underlying valuation survey; and (3) the construction of indexes themselves. In combination, these factors generally lead to the highest average HRQL estimates under the EQ-5D and the lowest under the QWB. As expected, average HRQL declines with age under each index.
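The rules described above for filling in “without condition” HRQL at ages outside the reported ranges can be sketched as follows. The age-group values are hypothetical placeholders, not the estimates used in the case studies, and the function name and the choice of 70 as the start of the eldest reported group are assumptions for illustration.

```python
def without_condition_hrql(age, by_group_start):
    """Average HRQL at a given age, with the gap-filling rules applied.

    by_group_start maps the first age of each reported 10-year group
    (20, 30, ...) to that group's average HRQL.
    """
    if age < 10:
        return 1.0  # ages 0 through 9: perfect health
    if age < 20:
        return (1.0 + by_group_start[20]) / 2.0  # midpoint with ages 20-29
    oldest = max(by_group_start)
    # Beyond the eldest reported group, hold the value constant.
    return by_group_start[min(age // 10 * 10, oldest)]


# HYPOTHETICAL averages by age group, for illustration only.
HYPOTHETICAL_AVERAGES = {20: 0.92, 30: 0.90, 40: 0.88, 50: 0.85, 60: 0.82, 70: 0.78}

for age in (4, 15, 36, 95):
    print(age, round(without_condition_hrql(age, HYPOTHETICAL_AVERAGES), 2))
```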
The comparison of HRQL with and without the conditions of concern is complicated by the assumptions that underlie the approach used to assign and value attributes under each index. In these comparisons, we adjusted the values depending on the source of the “with condition” estimates. We summarize these adjustments below; examples of the effects of these different adjustments are provided later in the summary of the EPA case study. Once we used these adjusted values to calculate the decrement in HRQL associated with each condition, we multiplied the decrement by the duration of the condition (taking longevity into account) to estimate QALY losses.
Comparison to “With Condition” Values Based on Expert Assignment
Many researchers hypothesize that experts responding to the sorts of questionnaires used in the case studies implicitly compare the condition to perfect health, rather than to average health for an individual of a given age. Our interviews of the experts involved in the case studies generally reinforced this impression; they reported that they considered the impacts of the illness or injury on someone who is otherwise in good health; i.e., does not have other conditions that affect their HRQL. To reflect this assumption, we adjusted the condition-specific HRQL results proportionately when comparing them to average health, which declines with age. For the comparison to perfect health in our sensitivity analysis, we use the unadjusted values based on the experts’ attribute assignments.
For example, if the expert assessment results in a “with condition” value of 0.8, this value represents 80 percent of perfect health (i.e., of 1.0). If “without condition” health is 0.9 (based on the population average for an individual of the same age), then 80 percent of this value is 0.72. We would then use 0.72 as our estimate of “with condition” health when comparing to average health. This is equivalent to assuming that each expert was comparing the condition to perfect health and, if they had instead compared to age-adjusted average health, the HRQL with the condition would reflect the same proportionate reduction. While more sophisticated approaches could be developed for addressing this issue, we found that this approach was the most expedient option for the case studies.
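The proportional adjustment in this example can be written out as a short calculation. The function name is hypothetical; the numbers are those used in the example above.

```python
def adjust_expert_value(expert_value, average_hrql):
    """Scale an expert-based value (implicitly relative to perfect health)
    to the age-adjusted population average."""
    return expert_value * average_hrql


expert_value = 0.8   # "with condition" value from expert assignment
average_hrql = 0.9   # "without condition" population average at that age

with_condition = adjust_expert_value(expert_value, average_hrql)
decrement_vs_average = average_hrql - with_condition  # base case
decrement_vs_perfect = 1.0 - expert_value             # sensitivity analysis

print(round(with_condition, 2), round(decrement_vs_average, 2), round(decrement_vs_perfect, 2))
```

Under these inputs the base case decrement (0.18) is smaller than the decrement against perfect health (0.20), because the same proportionate reduction is applied to a lower starting value.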
Comparison to “With Condition” Values Based on Patient Self-Assessments
The NHTSA and EPA case studies also use patient data from previously completed studies. Three of these studies, the Holbrook et al. (1999) QWB estimates for injuries, the Torrance et al. (1999) HUI-3 estimates for chronic bronchitis, and the Oostenbrink et al. (2001) EQ-5D estimates for vascular disease, reflect all aspects of a patient’s health, not only the effects of the illness or injury of concern. This raises two issues. First, because HRQL generally decreases with age, these estimates may reflect comorbidities that would not be present in younger populations but would increase in older populations. Second, the decrement in HRQL calculated from these estimates may overstate the effect of the condition, because the estimates may reflect health impairments that are not attributable to the condition of concern.
In these cases, we followed a two-step process. First, we compared the researcher’s results to the estimate of average health for an individual of the same age as the average person in the researcher’s sample, and determined the “with condition” HRQL as a percentage of the average (“without condition”) HRQL for that age. Second, we applied this percentage reduction to the HRQL estimates for all ages as relevant. This approach is equivalent to assuming that the proportionate reduction in HRQL is the same for every age, and differs from the approach used in the expert assignments.
We do not adjust the researchers’ values when comparing to perfect health; the decrement is the same in each year when compared to a constant value of 1.0. Thus, in this latter comparison, we are overstating the impacts of the health condition both because the values reflect HRQL decrements other than those related to the condition itself and because the affected individuals are not likely to be in perfect health throughout their lifetimes.
For example, the average age of the Holbrook et al. (1999) QWB sample of trauma patients was 36 years. If the Holbrook results for an injury were 0.7 and the estimate of average health for a 36-year-old was 0.8, then we assumed that HRQL with the injury was 87.5 percent of average health (0.7/0.8 = 0.875) regardless of the age of incidence. In the comparison to perfect health, we used the reported value of 0.7 without adjustment.
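A minimal sketch of this two-step adjustment follows, using the illustrative values from the example above (0.7 and 0.8). The function names and the average of 0.9 at another age are hypothetical.

```python
def ratio_to_average(study_value, average_at_sample_age):
    """Step 1: express the study's HRQL as a share of average health
    at the mean age of the study sample."""
    return study_value / average_at_sample_age


def with_condition_at_age(ratio, average_at_age):
    """Step 2: apply the same proportionate reduction at any other age."""
    return ratio * average_at_age


ratio = ratio_to_average(0.7, 0.8)
print(round(ratio, 3))

# Hypothetical average health of 0.9 at a younger age of incidence.
younger = with_condition_at_age(ratio, 0.9)
print(round(younger, 4), round(0.9 - younger, 4))
```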
The preliminary EQ-5D estimates from MEPS used in the EPA case study are also based on data from persons reporting the condition; however, here we follow a different approach.8 In this case, the researchers separated out the effects of comorbidities from the effects of the condition of concern in their statistical analysis. We used the condition-specific decrements directly when comparing to average health without the condition. In the perfect health comparison, we added the difference between average health and perfect health at each age to the decrement provided by the researchers. (This process means that the “with condition” values are the same in both scenarios because we make the adjustment to the decrements.) This approach leads to decrements that increase with age because the difference between perfect health and average population health increases over time, as illustrated earlier in Table A-2.
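The treatment of condition-specific decrements can be sketched as below. The decrement and average values are hypothetical, as are the function names; the sketch only illustrates why the “with condition” value comes out the same in both scenarios.

```python
def decrement_vs_average(condition_decrement):
    """Base case: use the condition-specific decrement directly."""
    return condition_decrement


def decrement_vs_perfect(condition_decrement, average_at_age):
    """Sensitivity analysis: add the gap between perfect and average health."""
    return condition_decrement + (1.0 - average_at_age)


# Hypothetical inputs for one age group.
average = 0.85
decrement = 0.05

with_condition_base = average - decrement_vs_average(decrement)
with_condition_perfect = 1.0 - decrement_vs_perfect(decrement, average)
print(round(with_condition_base, 2), round(with_condition_perfect, 2))
```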
FDA JUICE PROCESSING REGULATION
In this case study, we estimated the cost-effectiveness of FDA’s 2001 juice processing rule. We selected this regulation as one of the Committee’s case studies because it allowed us to explore the effects of applying different HRQL measures to both short-lived and lifelong illnesses. It also provides an example of a regulation where the issuing agency used a monetized QALY measure in its BCA. In this case study, we applied four indexes: the EQ-5D, the HUI-3, the SF-6D, and the QWB, asking clinical experts to determine the attributes that best match the expected impacts of each illness.
FDA Analysis
The starting point for our analysis was the research conducted by FDA to support its rulemaking efforts (FDA, 1998, 2001). In its BCA, FDA quantified the health impacts that were most significant in terms of severity and probability of occurrence, focusing on four microbial pathogens: Bacillus cereus, Cryptosporidium parvum, Escherichia coli O157:H7, and Salmonella (non typhi). The effects of these pathogens include infections that result in gastrointestinal illness and may lead to reactive arthritis. Most effects are short-lived, lasting for a few days or weeks on average, although in a small number of cases the infection may lead to lifelong illness or death. FDA categorized these effects in terms of duration (i.e., average days of illness) and severity (i.e., mild, moderate, severe), determining severity based on whether the typical patient would be likely to seek medical attention and/or be hospitalized.
In its BCA, FDA valued these health impacts using a combination of approaches. Fatal cases were valued using a best estimate of $5 million per statistical life saved. (Statistical cases represent the aggregation of small risks across a large number of people; e.g., a fatality risk of 1 in 10,000 aggregated across 10,000 people would equal a statistical life.) Nonfatal cases were valued as follows (see also Box 2-3). First, FDA used a generic index, the QWB, to determine HRQL impacts. Analysts assigned the QWB attributes (which reflect functional status and include symptom/problem codes) that best corresponded to the HRQL impacts for each of the health endpoints considered. They then calculated the value of these impacts based on the standard QWB valuation formula. The index values were multiplied by the expected average duration of each health impact to estimate the quality-adjusted life-day (QALD) losses associated with each endpoint. These QALD losses were assigned a dollar value by converting the agency’s value of statistical life estimate to a daily value of $630. Finally, FDA added the costs of medical treatment to these monetized QALD estimates. The resulting per-case values (monetized QALDs plus medical costs) were multiplied by the number of cases averted to determine the dollar value of the benefits of the rule. These results are summarized in Table A-3 below.9
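The valuation of nonfatal cases described above can be approximated by the following sketch. Only the $630 daily value comes from the text; the HRQL loss, duration, medical cost, and case count are hypothetical, and treating the index result as a daily HRQL loss is an assumption about how the calculation is structured.

```python
DOLLARS_PER_QALD = 630  # daily value reported for FDA's analysis


def per_case_value(daily_hrql_loss, duration_days, medical_cost):
    """Monetized QALD loss plus medical treatment costs for one case."""
    qald_loss = daily_hrql_loss * duration_days
    return qald_loss * DOLLARS_PER_QALD + medical_cost


def total_benefit(per_case, cases_averted):
    """Dollar value of benefits for one endpoint."""
    return per_case * cases_averted


# Hypothetical endpoint: HRQL loss of 0.5 for 4 days, $100 in medical costs.
value = per_case_value(daily_hrql_loss=0.5, duration_days=4, medical_cost=100)
print(value, total_benefit(value, cases_averted=1000))
```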
As indicated by the table, FDA estimated that the present value of the annual benefits would total $151 million, applying a discount rate of 7 percent to the future year effects of those illnesses with long-term impacts.10 In comparison, FDA estimated that the annualized costs of the rule (including initial implementation and ongoing operations) would total $28 million. The rule thus results in monetized net benefits (benefits minus costs) totaling approximately $123 million per year. FDA noted that some, less significant, health effects were not quantified, such as those related to exposure to other pathogens and contaminants such as pesticides.
Case Study Analytic Approach
As discussed above, the Committee’s approach to estimating QALY impacts for these case studies involved three steps: (1) developing descriptions of each health outcome assessed; (2) applying different approaches to estimate the HRQL impacts of each outcome; and (3) calculating the difference between “with condition” and “without condition” HRQL and multiplying the resulting decrement by the duration of the impact.
For this case study, the background materials for FDA’s rulemaking (FDA, 1998, 2001) provided most of the information we needed to develop brief (one or two sentence) descriptions of each of the health endpoints listed in Table A-3. Our descriptions included information on the types of symptoms (e.g., diarrhea, abdominal pain, and nausea), indicated the pathogen causing the illness (B. cereus, C. parvum, E. coli O157:H7, or Salmonella—non typhi), noted the approximate duration of the symptoms (e.g., “expected to last less than one week,” “typically lasting throughout the individual’s remaining life span”), and indicated whether patients were likely to require medical attention or hospitalization.
We separated certain of the severe and chronic effects into subcategories to better reflect the varying health states that result, using data provided in FDA’s analysis. This led to descriptions of 17 separate nonfatal endpoints, including 13 related to infections and 4 related to reactive arthritis, as listed in Table A-4 in the next section. Five of these endpoints involve chronic lifelong conditions; the remainder are of short-term duration, lasting only a few days or weeks. The descriptions provided to the experts did not provide information on the average age of the affected individuals.

TABLE A-3 FDA Estimates of Annual Quantified Benefits
We then sent these descriptions to the medical experts, along with a list of the domain and attribute definitions for each generic index and instructions for characterizing each endpoint in terms of the attribute levels. We asked the experts to consider a “typical” patient with each type of illness in completing this exercise. The experts included eight infectious disease specialists and five rheumatologists, who were asked to characterize or assign the endpoints related to their area of expertise using each of the generic indexes.
Once we received the assignments, we entered the results into an Excel spreadsheet model to calculate the summary index values, to compare the “with condition” results to average age-adjusted HRQL and to perfect health, and to multiply the resulting decrements by the duration of the health effect, following the approaches summarized above. We used FDA’s assumptions for average age at incidence and for the duration of the effects. Most of the effects are expected to occur on average in adulthood, except for severe E. coli infections, for which the average age at incidence was four years. For fatal cases and lifelong effects, we assumed that the average life expectancy of the affected individuals would extend with certainty to age 77, again consistent with FDA’s approach.11
The FDA analysis (along with more recent studies) suggests that pathogen-related infections may be more common or more severe in individuals with suppressed immune systems. We were not, however, able to quantify the extent to which such individuals would be disproportionately affected, nor were we able to estimate the HRQL of these individuals with or without the illnesses of concern. Our assumption that, in the absence of the pathogen exposure, individuals with suppressed immune systems would have the same health status as the average member of the general population is likely to overstate their “without condition” HRQL. The impact of pathogen-related illness on HRQL may also differ for these individuals.
Estimates of QALY Gains
The expert assessment process resulted in identification of domain attributes under each of the four indexes for each of the 17 health endpoints assessed. In general, we found that endpoints of increasing severity were often assigned similar attributes, meaning that the descriptions and/or attribute levels offered by the several indexes did not distinguish sufficiently among severity levels. When attribute assignments varied, they usually followed the expected pattern in that the assignments for severe cases indicated greater problems than the assignments for mild cases. The range between the minimum and maximum values for each attribute suggested that the experts sometimes varied significantly in their judgments about the degree of problems imposed; however, we did not formally assess the extent or sources of this variation.
In Table A-4, we present the weighted values from the expert assignments, reporting median rather than average values as the estimate of central tendency because of the small number of experts involved. The results indicate the estimated HRQL with each condition (not the decrement from normal health), on a scale where one corresponds to perfect health and zero corresponds to death. This table excludes fatalities, which have a “with condition” value of zero.
TABLE A-4 Juice Processing Case Study: HRQL with Pathogen-Related Illness
The domain attributes assigned by the experts result in median index values that decrease with the increasing severity of the illness as expected, in some cases dropping below zero (indicating that the weighted attributes taken together result in a value considered worse than death). For mild cases, the EQ-5D generally results in the values closest to optimal or perfect health, while the QWB results in the lowest values, but this pattern is not constant across the different pathogen-related endpoints.
The next step involved estimating the QALY losses averted by FDA’s juice processing rule. This included: (1) determining the decrement from “without condition” health for each condition; (2) multiplying the decrement by the duration of each condition to estimate the QALYs lost; and (3) multiplying the per-case values by the number of cases averted by the rule.12 Table A-5 presents the resulting estimates of total QALYs lost for our base case scenario, where we assume that normal health (in the absence of the condition) would equal average age-adjusted health. As previously discussed, we assume that the expert assignment implicitly involved comparison to perfect health and that the decrement from average health would represent the same proportional reduction. This table presents the results discounted at both 3 and 7 percent, reflecting current guidance for discounting in regulatory analysis (OMB, 2003a). Undiscounted, the total estimated losses range from 2,500 to 3,700 QALYs, depending on the index used.
This table indicates that the health effects that lead to the largest HRQL decrements (i.e., particularly severe E. coli cases, see Table A-4) are not necessarily the health effects that account for the largest proportion of the benefits of the rule.13 When adjusted for duration and number of cases averted, prevention of long-term reactive arthritis accounts for the largest share of the overall benefits across all of the indexes, although the exact proportion varies. (FDA’s original analysis was also dominated by the results for this endpoint, but these results are not included in the table because FDA did not compare their “with condition” results to an average health scenario.) In total, the HUI-3 leads to the largest estimate of QALY losses when compared to average health.
12 We express these losses as QALYs (rather than as QALDs as in the FDA analyses) for consistency with how these losses are usually reported in the research literature.
13 Because the experts assigned the lowest attribute level in more than one domain for severe E. coli infections under the EQ-5D and HUI-3, the resulting HRQL values are less than zero (see Table A-4). Hence the estimates of QALY losses are greater than the duration of the illness. For example, at the age of incidence (4 years), we assume that average HRQL without the illness is 1.0, and find that the HRQL with the illness is negative 0.11 under the EQ-5D, for a decrement of 1.11 from average health. If we multiply this decrement by 365 days to reflect the impacts of the first year of the illness (1.11*365), the QALDs lost total 405, exceeding the number of days in the year.
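The arithmetic in footnote 13 can be checked with a short sketch. The function name and structure are illustrative assumptions; the inputs (average HRQL of 1.0 at age 4, HRQL of negative 0.11 with the illness under the EQ-5D, and 365 days) are those stated in the footnote.

```python
def quality_adjusted_days_lost(hrql_without, hrql_with, days):
    """Return quality-adjusted life days (QALDs) lost over a period.

    The decrement is the gap between HRQL without and with the condition.
    A negative "with condition" value (a state rated worse than death)
    makes the decrement larger than 1.0.
    """
    decrement = hrql_without - hrql_with
    return decrement * days


# Severe E. coli infection, first year of illness, EQ-5D (footnote 13 values).
qalds_lost = quality_adjusted_days_lost(1.0, -0.11, 365)
qalys_lost = qalds_lost / 365  # the same loss expressed in QALYs

print(round(qalds_lost))     # QALDs lost in the first year
print(round(qalys_lost, 2))  # QALYs lost in the first year
```

Because the decrement exceeds 1.0, the days lost exceed the 365 days in the year, which is the point the footnote makes.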
TABLE A-5 Juice Processing Case Study: QALY Losses, All Cases
Although the table reflects the new cases of illness associated with a one-year decrease in exposure, in some cases the effects of the illness are lifelong, and we use discounting to adjust the value of future year impacts. Discounting the long-term impacts at a 3-percent annual rate, rather than at 7 percent, increases the present value of the results as expected. The relatively large difference in the results occurs because the 3-percent rate raises the contribution of the long-term impacts to the total present value; i.e., it discounts future impacts by a smaller amount. The undiscounted results are even larger, ranging from 2,500 to 3,700 QALYs, because the long-term impacts are not discounted to reflect their timing.
For preventable mortality, the estimates vary across indexes because we compare a “with condition” value of zero to the age-specific estimates of average population health, which differ across indexes (see Table A-2). Application of the QWB results in the lowest estimates of average health over time. Hence it also results in the lowest estimates of QALY losses for fatal cases. If we assess preventable mortality without adjusting the life years lost for HRQL, the two fatal cases averted annually lead to the loss of 84 life years undiscounted; 47 years if discounted at 3 percent, or 27 years if discounted at 7 percent.
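The life-year figures above can be approximated with a simple present-value sketch. It rests on two assumptions that are inferred rather than stated in the text: each of the two fatal cases loses 42 life years (84 divided by 2), and each lost year is discounted from the end of the year in which it would have been lived. The case study's actual calculation may differ.

```python
def discounted_life_years(years_lost, rate):
    """Present value of one life year per year for `years_lost` years.

    Assumption: the year-t life year is discounted by (1 + rate) ** t,
    for t = 1 .. years_lost (end-of-year convention).
    """
    return sum(1.0 / (1.0 + rate) ** t for t in range(1, years_lost + 1))


FATAL_CASES = 2       # fatal cases averted annually (from the text)
YEARS_PER_CASE = 42   # assumed: 84 undiscounted life years / 2 cases

for rate in (0.0, 0.03, 0.07):
    total = FATAL_CASES * discounted_life_years(YEARS_PER_CASE, rate)
    print(f"{rate:.0%}: {total:.0f} life years")
```

Under these assumptions the totals round to the 84, 47, and 27 life years reported in the text.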
In Table A-6, we compare the results of the above analysis to the results of our sensitivity analysis, which assumes that the affected individuals would be in perfect or optimal health (a value of 1.0) in the absence of the pathogen-related illness. This comparison overstates the actual impact of the rule because the affected individuals are unlikely to be in optimal health throughout their lifespan in the absence of these exposures. However, we include this perfect health comparison since it is often found in the literature and underlies the original FDA approach. The table includes the results from FDA’s 2001 analysis, which also applied the QWB in comparison to perfect health.
TABLE A-6 Juice Processing Case Study: Sensitivity Analysis for QALY Losses
Scenario | Discount Rate | EQ-5D (case study expert assessment, median) | HUI-3 (case study expert assessment, median) | SF-6D (case study expert assessment, median) | QWB (case study expert assessment, median) | FDA QWB Results
Total QALY losses compared to average age-adjusted health | 3% | 1,463 | 1,864 | 1,293 | 1,298 | N/A
Total QALY losses compared to average age-adjusted health | 7% | 794 | 1,019 | 706 | 721 |
Total QALY losses compared to perfect health | 3% | 1,659 | 2,121 | 1,563 | 1,700 |
Total QALY losses compared to perfect health | 7% | 882 | 1,136 | 843 | 924 | 888*
N/A = not reported in FDA analysis (FDA, 2001). *Adds life years lost for fatal cases to FDA’s QALY estimate for nonfatal cases.
This table indicates that comparison to perfect health increases the estimates of QALYs across the different indexes, as expected. The difference between average health and perfect health rises with age (see Table A-2), and hence has the largest impact on the results for illnesses with lifelong effects. The differences between the median results from the case study’s expert assignment using the QWB and the original FDA analysis (also based on the QWB) appear to stem largely from differences in the attributes assigned to the individual health endpoints. The FDA results are, however, within the same general range as the other estimates.
Cost-Effectiveness Ratios
Our final step involved reporting the cost-effectiveness ratios described in the Committee’s recommendations (see Chapter 5). We include three of the four recommended ratios because the fourth (comprehensive) ratio is not relevant in this case; the rule does not lead to quantified benefits other than those health risk reductions included in the effectiveness measure.
In these calculations, we use FDA’s estimates of annualized regulatory costs and health treatment cost savings. In both cases, FDA applies a discount rate of 7 percent. While we were able to recalculate the estimate of regulatory costs to reflect a 3 percent discount rate, we lacked the data necessary to recalculate the estimates of medical cost savings. Thus we use the same estimates of medical cost savings under both discounting scenarios, which understates these savings under the 3-percent scenario. In addition, the FDA estimates include medical expenditures only and do not include the other types of health treatment cost savings recommended for inclusion in these calculations. Hence the net costs used in these ratios are higher than they would be if we had been able to follow all of the Committee’s recommendations.
In Table A-7, we first report the costs per life saved and per life year saved, discounted at 3 and 7 percent. The cost estimate in each of these calculations reflects compliance costs only, including both recurring costs and the annualized value of the initial costs.14 Medical cost savings are not considered. We then report the health-benefits-only ratio using each of the alternative approaches to estimating QALY losses; in this case, we net out the medical cost savings from the regulatory costs. In all cases, this exhibit reports QALY losses calculated as a decrement from average population health.
TABLE A-7 Juice Processing Case Study: Cost-Effectiveness Ratios
This table indicates that the costs per life saved and per life year saved are relatively high, because this rule averts only two cases of mortality per year. Once we add in the impacts of the nonfatal effects, as well as the associated medical cost savings, the ratios result in much smaller values. In general, the HUI-3 leads to the lowest cost per QALY, and the SF-6D leads to the highest, although the results for the EQ-5D, the SF-6D, and the QWB are very similar. Because a higher discount rate reduces the impact of future year QALY losses, the costs per QALY are higher under a 7 percent discount rate than under the 3-percent rate. All of these ratios would show lower costs per QALY if the results of our sensitivity analysis were used, because the comparison to perfect health increases the estimates of QALY gains.
FDA’s QWB results lead to estimates of cost-effectiveness within the same general range. FDA’s estimates compare to perfect health and are discounted at 7 percent. If we add the effects of preventable mortality (which FDA excluded from the QALY estimate) to the estimates for nonfatal effects, the result is a value of $26,000 per QALY. This estimate is similar to the cost-effectiveness ratios that result if we use the QALY estimates (from Table A-6) that compare to perfect health discounted at the same rate.
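The two kinds of ratios described above can be sketched as follows. The function names and the dollar and QALY inputs are hypothetical placeholders chosen for round arithmetic; they are not FDA's estimates, which are not reproduced in this appendix summary.

```python
def cost_per_life_saved(compliance_costs, lives_saved):
    """Compliance costs only; medical cost savings are not netted out."""
    if lives_saved <= 0:
        raise ValueError("lives_saved must be positive")
    return compliance_costs / lives_saved


def cost_per_qaly(regulatory_costs, medical_cost_savings, qalys_gained):
    """Health-benefits-only ratio: net costs divided by QALYs gained."""
    if qalys_gained <= 0:
        raise ValueError("qalys_gained must be positive")
    return (regulatory_costs - medical_cost_savings) / qalys_gained


# Hypothetical annualized inputs, for illustration only.
costs = 10_000_000.0
savings = 2_000_000.0

print(cost_per_life_saved(costs, lives_saved=2))          # 5000000.0
print(cost_per_qaly(costs, savings, qalys_gained=400.0))  # 20000.0
```

Adding the nonfatal effects to the denominator and netting the medical cost savings out of the numerator both push the cost per QALY well below the cost per life saved, which is the pattern reported for Table A-7.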
As noted earlier, this analysis does not follow some of the Committee’s recommendations. We did not assess the distributional and ethical implications of these regulations in detail, and the FDA analysis provides only limited information on these impacts. An example of the type of information that could be highlighted in such an assessment is provided in Chapter 4 of this report.
In addition, our analysis does not fully address the uncertainty in these estimates, as required under current government-wide guidance (OMB, 2003a) and as recommended by the Committee. Uncertainty is inherent in all the components of the analysis, regardless of whether the assessment is in the form of a CEA or BCA. Further investigation would be needed to determine which aspects of the analysis are most uncertain and to estimate the extent to which such uncertainty varies depending on the HRQL valuation approach used.
We did, however, explore the experts’ views on the assessment process in a series of brief phone interviews. The experts noted that the assessment was more difficult in cases for which a single endpoint represented an illness that has changing symptoms over time and that varies in its impacts across patients. In some cases, the experts found that the endpoints were not sufficiently distinguished to allow for level differences on the attribute scales, and the indexes included some attributes that appeared irrelevant or were improperly described for these particular health effects. Several experts thought that asking clinical experts to act as proxies for patients was problematic.
Other sources of uncertainty in our HRQL assessment relate to the indexes themselves. For example, the developers of each index calculated relative health state index values using different population surveys (see Table 3-4), and another set of population surveys were the basis for the Committee’s estimates of average age-specific health under each index (see Table A-2). Hence some of the variation in our results may reflect differences in the data sources used rather than solely reflecting differences in the indexes themselves.
Across all of the HRQL approaches, the estimates of the decrements associated with the conditions also may be misstated if a significant portion of those affected are in poorer health than the general population, due, for example, to immune system problems. For these individuals, the difference between “with pathogen-related illness” and “without pathogen-related illness” HRQL may differ from that assumed in our analysis, and the duration of the condition may also vary.
NHTSA CHILD RESTRAINTS REGULATION
Our second case study addressed an NHTSA regulation requiring anchoring systems for child restraints. We selected this regulation to explore issues related to valuing effects on children as well as alternative approaches for assessing the HRQL impacts of injuries. This case study also provided an example of NHTSA’s approach to regulatory analysis, which relies on estimates of ELS to value nonfatal health impacts.
We were unable to use injury data from NHTSA’s analysis of the child restraints rule for this case study, however. NHTSA used very broad injury categories that did not provide the descriptive information needed for characterizing HRQL impacts. Because the estimates of the number and types of injuries used in our analysis are quite different from the estimates in the child restraints rule and reflect a high level of uncertainty, the case study results are not comparable to the results of NHTSA’s original analysis and we do not present cost-effectiveness ratios.
In this case study, we first applied the EQ-5D and the HUI-2, asking medical experts to match the characteristics of each injury to the relevant attribute levels. We then used data from previously completed research to apply the QWB and the FCI to the same set of injuries.
NHTSA Analysis
We began this case study with a review of the analysis NHTSA completed for its child restraints rule (NHTSA, 1999a,b). This rule requires the use of standardized anchor systems in motor vehicles and on child restraints. In its economic analysis, NHTSA quantified the costs and benefits of both rigid and flexible anchor systems, assessing two regulatory options that represented alternate approaches to complying with the final rule and a third option that it ultimately rejected.
To estimate the benefits of the rule, NHTSA combined data on deaths and injuries to children in seat restraints with data on the impacts of child restraint misuse from several sources, focusing on children ages zero to six. NHTSA made several modifications to these data, first adjusting for the number of injuries that would have occurred in the absence of restraints, then estimating the percent of all injuries associated with restraint misuse and the fraction of this misuse that would be eliminated by the anchor rule. The data on injuries and fatalities were reported by KABCO category, which classifies injuries based on the degree of incapacitation (killed (K), incapacitating injury (A), nonincapacitating injury (B), possible injury (C), and no injury (O)). NHTSA converted the estimates from the KABCO categories to the Abbreviated Injury Scale (AIS), using a standard algorithm that reflects the distribution of all crash-related injuries (not solely injuries to restrained children).
The AIS is a simple numerical system for ranking and comparing the severity of injuries based on the probability that the injury could be fatal. A score of 0 indicates that there were no injuries, whereas a score of 6 indicates that the injury was likely to be immediately fatal; intermediate scores of 1 through 5 indicate injuries of increasing threat to life. When multiple injuries occur, they are scored according to the most life-threatening injury; i.e., the Maximum AIS or MAIS. Examples of the types of injuries that fall into each category are provided in Chapter 2, Table 2-6.
To value these injuries, NHTSA applied its ELS approach, which first involves determining the costs and monetized QALY impacts for nonfatal injuries in each AIS category.15 See Box 2-4 for a description of the ELS approach. These monetary estimates are converted to ELS fractions by dividing the value for each injury category by the value of a fatality (estimated by NHTSA as roughly $3 million). These fractions are then multiplied by the number of injuries averted in each category and added to the number of fatalities, to determine the total ELS value for each regulatory option. The ELS values for each AIS category are calculated periodically based on data for all types of crashes nationally, then applied across the subsequent regulatory analyses.16 More information on this approach is provided in Chapter 2.
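The ELS conversion described above can be sketched in a few lines. The per-injury dollar values and injury counts below are hypothetical placeholders, not NHTSA's figures; only the roughly $3 million value of a fatality comes from the text.

```python
FATALITY_VALUE = 3_000_000.0  # "roughly $3 million" per the text


def equivalent_lives_saved(fatalities_averted, injuries_averted, injury_values):
    """Total ELS = fatalities + sum over AIS categories of fraction * injuries.

    injuries_averted: {AIS category: number of injuries averted}
    injury_values:    {AIS category: dollar value per injury}
    The ELS fraction for a category is its value divided by FATALITY_VALUE.
    """
    total = float(fatalities_averted)
    for ais, count in injuries_averted.items():
        fraction = injury_values[ais] / FATALITY_VALUE
        total += fraction * count
    return total


# Hypothetical values for two AIS categories (illustration only).
hypothetical_values = {1: 15_000.0, 3: 300_000.0}
hypothetical_injuries = {1: 200, 3: 10}

els = equivalent_lives_saved(4, hypothetical_injuries, hypothetical_values)
print(round(els, 2))  # 4 fatalities + 1.0 + 1.0 equivalent lives
```

Dividing annual compliance costs by a total of this kind gives the cost per ELS that NHTSA reports.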
For the child restraints rule, the results imply that, on average, 58 injuries were equivalent to one fatality, given the severity of the injuries produced by NHTSA’s standard conversion formula. The results for each of the options assessed are reported in Table A-8 below; the table suggests that the regulatory options considered led to almost identical ranges of benefits.
NHTSA determined that the national costs of the final rule were most likely to average $152 million annually (in 1996 dollars), with a range from $123 to $167 million. This best estimate reflects the less expensive of the two implementation options (a nonrigid restraint attachment and rigid vehicle anchor). The alternative option permitted (both rigid) was estimated to cost $217 to $256 million annually, while the rejected approach (both nonrigid) was between these two estimates, at $149 to $196 million per year.
NHTSA then calculated the cost-effectiveness of the final rule by dividing the compliance costs by the ELS estimates reported in Table A-8. The results indicated that the costs per ELS ranged from $1.5 to $2.7 million, without discounting. NHTSA also presented several sensitivity analyses, including several that discounted the estimates of equivalent fatalities at different rates. (This discounting reflects the fact that the costs of the rule would be incurred in the year in which the vehicle or car restraint is purchased, while the benefits accrue over the several-year period for which the vehicle or car restraint is used.) At the time that the analysis was completed, OMB recommended application of a 7-percent discount rate, which led to a cost per ELS ranging from $2.1 to $3.7 million.
In this analysis, NHTSA did not report a total dollar value for all of the injuries and fatalities averted by the rule, and hence did not calculate net benefits (benefits minus costs). In more recent analyses, NHTSA has used its estimates of the dollar value of injuries and fatalities in each AIS category (including expenditures and monetized QALY impacts) in both BCA and CEA to determine net benefits as well as cost-effectiveness.
Case Study Analytic Approach
For this case study, the analysis of QALY impacts was more complicated than in the FDA analysis. Our first step involved identifying a readily accessible source of more detailed injury descriptions that could be valued using generic HRQL indexes. Based on advice from NHTSA staff, we relied on data from the agency’s National Automotive Sampling System, Crashworthiness Data System (NASS-CDS) for the years 1999–2003. While this system includes data on thousands of crash victims, only 22 of the sampled cases involved injuries to children in child restraints. These sample cases represent roughly 1,752 cases nationwide (including 160 that are immediately fatal); however, NHTSA staff caution that the standard error associated with extrapolating from this small number of sample cases is quite large. As a result, we did not adjust our estimates for comparability with the estimates of cases averted for the different injury classes in the NHTSA child restraints analysis. (The regulation was not expected to prevent all injuries to restrained children, even after all vehicles and restraints in use are equipped with the anchors.)
TABLE A-8 NHTSA Estimates of Annual Quantified Benefits
The injuries reported for these 22 cases are provided in Table A-9 below. For each case, the table indicates the sample weight, or multiplier, that is applied to the sample values to extrapolate to the national population. In addition, the exhibit indicates the status of the child immediately after the accident and lists the individual injuries incurred. The final column reports the AIS classification for the case; the MAIS for cases with multiple injuries is marked with an asterisk (*).17
TABLE A-9 Child Restraints Case Study: Injuries to Restrained Children, Ages 0–6, 1999–2003
Case Number | Weighting Factor | Injury Description | MAIS
1 | 21.29 | Nonfatal (hospitalized) | 2
2 | 54.34 | Fatal | 3
3 | 18.81 | Nonfatal (hospitalized) | 3
4 | 411.33 | Nonfatal (hospitalized) | 2
5 | 85.65 | Fatal | 6
6 | 3.41 | Fatal | 5
7 | 124.43 | Nonfatal (hospitalized) | 5
8 | 8.39 | Nonfatal (transported and released) | 2
9 | 50.12 | Nonfatal (hospitalized) | 5
10 | 8.24 | Nonfatal (transported and released) | 2
11 | 37.29 | Nonfatal (hospitalized) | 3
12 | 75.56 | Nonfatal (hospitalized) | 2
13 | 16.03 | Fatal | 4
14 | 145.31 | Nonfatal (transported and released) | 2
15 | 1 | Nonfatal (hospitalized) | 2
16 | 128.9 | Nonfatal (hospitalized) | 4
17 | 61.87 | Nonfatal (transported and released) | 2
18 | 85.65 | Nonfatal (hospitalized) | 3
19 | 1 | Fatal | 5
20 | 7.75 | Nonfatal (hospitalized) | 3
21 | 382.64 | Nonfatal (hospitalized) | 2
22 | 23.03 | Nonfatal (hospitalized) | 2
NOTES: Injury descriptions are transferred verbatim from the NHTSA file without editing. NFS = Not further specified; GCS = Glasgow Coma Scale; OIS = Organ Injury Score. *indicates MAIS injury for multiple injury cases that are not immediately fatal. Case 11 included two MAIS 2 injuries; we identify the first (injury 11a) as the MAIS because it generally results in larger HRQL decrements.
SOURCE: NASS-CDS data provided by Jim Simons, NHTSA, December 7, 2004.
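The national totals cited in the text follow from summing the weighting factors in Table A-9. The sketch below transcribes the case numbers, weights, and fatal/nonfatal status from the table; the variable names are illustrative.

```python
# (case number, weighting factor, immediately fatal?) from Table A-9
CASES = [
    (1, 21.29, False), (2, 54.34, True), (3, 18.81, False),
    (4, 411.33, False), (5, 85.65, True), (6, 3.41, True),
    (7, 124.43, False), (8, 8.39, False), (9, 50.12, False),
    (10, 8.24, False), (11, 37.29, False), (12, 75.56, False),
    (13, 16.03, True), (14, 145.31, False), (15, 1.0, False),
    (16, 128.9, False), (17, 61.87, False), (18, 85.65, False),
    (19, 1.0, True), (20, 7.75, False), (21, 382.64, False),
    (22, 23.03, False),
]

# Each weight is the number of national cases one sampled case represents.
national_cases = sum(weight for _, weight, _ in CASES)
national_fatal = sum(weight for _, weight, fatal in CASES if fatal)

print(round(national_cases))  # national cases represented by the sample
print(round(national_fatal))  # immediately fatal cases nationally
```

The sums round to the roughly 1,752 national cases and 160 immediately fatal cases cited in the text. Two cases (4 and 21) account for nearly half of the total weight, which illustrates why the extrapolation carries a large standard error.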
The NASS-CDS data did not include information on the duration of the injury or on life expectancy (NHTSA, 2002b). We estimated duration using the same data sources as used to assess HRQL, as described below. To estimate life expectancy without the injury or fatality, we used conditional survival rates, similar to the approach NHTSA uses in its ELS assessment.18 For the case study, we relied on data on average U.S. mortality rates for each year of life from detailed life tables (CDC, 2002), and calculated the probability of surviving to each year of age conditional on having survived to the previous age. With the exception of the five immediately fatal cases and two of the 17 nonfatal sampled cases, the injuries were not expected to affect life expectancy. For the traumatic brain injury in case 7, we used data from Harrison-Felix et al. (2004) to assess the reduction in life expectancy; for the spinal cord injury in case 9, we relied on data from Frankel et al. (1998).
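A minimal sketch of the conditional survival calculation follows. The mortality rates used here are hypothetical placeholders, not the CDC (2002) life table values, and the function counts only full years survived, a simplification the case study may not share.

```python
def remaining_life_expectancy(annual_mortality_rates):
    """Expected further years of life from a starting age.

    annual_mortality_rates[i] is the probability of dying during year i,
    conditional on having survived to the start of that year. The running
    product of (1 - rate) is the probability of surviving each year; summing
    those probabilities gives the expected number of full years lived.
    """
    survival = 1.0
    expected_years = 0.0
    for rate in annual_mortality_rates:
        survival *= 1.0 - rate   # P(alive at end of this year)
        expected_years += survival
    return expected_years


# Hypothetical rates for illustration: certain survival for two years,
# then a 50 percent chance of dying in each of the next two years.
print(remaining_life_expectancy([0.0, 0.0, 0.5, 0.5]))  # 2.75
```

For the cases with reduced life expectancy, the same function could be applied to mortality rates adjusted with the published estimates for traumatic brain injury or spinal cord injury.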
In completing the analysis, we applied some simplifying assumptions due to data and time constraints. First, we assumed that the average age of the affected children was 3 years, and that they reflected the same gender distribution as the general population of the same age. Although injuries to a newborn could have quite different effects than would the same injury for a 3- or 6-year-old, the sources used in our analysis did not provide information on age-related HRQL differences for young children. Second, we treated these injuries as if they all occurred in a single year, rather than spread out over a 5-year period. We used discounting only to reflect the time value of averting the future year HRQL impacts associated with an injury that occurs in the current year; we did not discount the different years of incidence in the NASS-CDS data set.
Our assessment of HRQL impacts involved the use of four generic indexes. For two of these indexes, the EQ-5D and the HUI-2, we asked five medical experts to match the characteristics of each injury to the relevant index attributes, following a process similar to that applied in the FDA case study. We also asked the experts to assess duration, breaking each case into three time periods: the acute, rehabilitation, and long-term phases. We requested that they assess each injury separately and assess the combined effects of all injuries for each of the multiple injury cases.
For the other two indexes, we used values from previously completed research. For the QWB, we relied on data provided by Troy Holbrook of the University of California, San Diego. Holbrook’s team used patient self-assessments to determine the attributes associated with various injuries, for individuals age 18 or older (Holbrook et al., 1999).19 The resulting HRQL estimates are available by body region and injury severity (based on the six major AIS categories) for four time periods: predischarge, and at 6, 12, and 18 months. We matched these data with the body regions and AIS scores for each injury in our data set, focusing on the injury identified as the MAIS in multiple injury cases. We applied the Holbrook predischarge values to the hospitalization period, then applied the 6-month values from discharge (or injury date, if not hospitalized) to 6 months, the 12-month values from 6 to 12 months, and the 18-month values from 12 months through the remaining lifespan.
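The mapping of the four Holbrook time points onto the injury timeline can be sketched as below. The HRQL values, hospital stay, and remaining lifespan are hypothetical placeholders, and the sketch uses a single constant baseline and no discounting, whereas the case study compares to age-specific average health and discounts future years.

```python
def qwb_qaly_loss(baseline, values, hospital_days, remaining_years):
    """Undiscounted QALY loss from period-specific "with injury" HRQL values.

    values: dict with keys "predischarge", "m6", "m12", "m18".
    Assumes the hospital stay is shorter than 6 months and that
    remaining_years (measured from the injury date) is at least 1.
    """
    hospital_years = hospital_days / 365.0
    if hospital_years >= 0.5 or remaining_years < 1.0:
        raise ValueError("sketch assumes stay < 6 months and >= 1 year of life")
    periods = [
        (values["predischarge"], hospital_years),  # hospitalization
        (values["m6"], 0.5 - hospital_years),      # discharge to 6 months
        (values["m12"], 0.5),                      # 6 to 12 months
        (values["m18"], remaining_years - 1.0),    # 12 months onward
    ]
    return sum((baseline - hrql) * years for hrql, years in periods)


# Hypothetical HRQL values for one injury (illustration only).
example_values = {"predischarge": 0.5, "m6": 0.8, "m12": 0.9, "m18": 1.0}
example_loss = qwb_qaly_loss(1.0, example_values, hospital_days=0,
                             remaining_years=10.0)
print(round(example_loss, 3))
```

Because the 18-month value is carried through the remaining lifespan, even a small residual decrement at 18 months can dominate the lifetime total.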
The fourth index used was the FCI, which is currently being developed (with NHTSA support) to measure the impacts of nonfatal injuries on functional status. It differs from the other indexes in that it is not intended to reflect all aspects of HRQL. Furthermore, it is not yet widely validated or used. Ellen MacKenzie of Johns Hopkins University provided predicted 12-month FCI scores based on the AIS descriptions for each injury contained in our database. For most of the nonfatal injuries, the scores indicated that the individual would have returned to normal functioning at the 12-month mark; functional limitations persisted in only 5 of the 17 sampled cases with nonfatal injuries. MacKenzie reported FCI values for each individual injury in each of these five cases. Values for multiple injury cases were based on the worst score in each domain across all of the injuries incurred.
Because the FCI only provides 12-month scores at this point in its development, we did not use it to assess lifetime impacts. Instead, we compared the 12-month FCI values to the values for the same time under the EQ-5D, HUI-2, and QWB. For each index, we use the “with injury” values that reflect comparison to perfect health (a value of 1.0), because average population values were not available for the FCI.
Estimates of QALY Gains
The first step in applying the above indexes involved determining HRQL with the injuries. As discussed above, for the EQ-5D and HUI-2 we asked medical experts to determine the duration and attribute descriptions that best matched the likely impacts of each of the injuries listed in Table A-9. To better understand the variability in these estimates, the Committee commissioned a statistical analysis of the expert ratings to determine the extent of agreement both within and across the different indexes (Mason, 2005). The results indicated that, while the experts differed in the attributes they selected, the extent of these differences was not significantly affected by which index was used. The major differences in the results were due to the varying estimates of duration rather than to the differences in the estimates of HRQL in each injury phase.
For the QWB, the attribute data were provided by adult patients and reflected all aspects of their health, not simply the effects of the injury. Inspection of the resulting HRQL estimates suggests that the QWB results appear more uniform across cases than the estimates using the other indexes. At least some of this difference results from disparities in the data sources; for example, the QWB data set includes injury cases that may vary less in severity than the NASS-CDS cases assessed by the experts (e.g., it includes only hospitalized patients), and it used a more aggregated categorization scheme (i.e., classification by body part and AIS rather than by individual injury).
The next step in the analysis involved estimating the QALY losses that could be avoided if all of these injuries were averted by a hypothetical regulation. This step included (1) determining the decrement from “without condition” health for each phase of each injury; (2) multiplying the decrement by duration and summing across the phases for each injury to estimate QALY losses; and (3) multiplying these per-case values by the sample weights from Table A-9.
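The three steps above can be sketched for a single sampled case. The phase durations, HRQL values, and weight below are hypothetical placeholders rather than expert assignments from the case study, and the sketch omits discounting and age-specific baselines.

```python
def weighted_qaly_loss(baseline, phases, sample_weight):
    """National QALY loss represented by one sampled case.

    phases: list of (hrql_with_injury, duration_in_years), for example the
            acute, rehabilitation, and long-term phases.
    Step 1: decrement = baseline - hrql_with_injury for each phase.
    Step 2: multiply by duration and sum across phases (per-case loss).
    Step 3: multiply the per-case loss by the sample weight.
    """
    per_case = sum((baseline - hrql) * years for hrql, years in phases)
    return per_case * sample_weight


# Hypothetical case: acute, rehabilitation, and long-term phases.
hypothetical_phases = [(0.5, 0.1), (0.8, 0.4), (1.0, 70.0)]
hypothetical_loss = weighted_qaly_loss(1.0, hypothetical_phases,
                                       sample_weight=100.0)
print(round(hypothetical_loss, 2))
```

A case with full long-term recovery contributes little per case, yet a large sample weight can still make it a visible share of the national total.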
Table A-10 provides the resulting estimates of total QALY losses for the EQ-5D, HUI-2, and QWB, assuming that normal health (in the absence of the injury) would equal average population health for an individual of the same age, and applying both 7 and 3 percent discount rates. Undiscounted, the results range from 21,000 to 27,000 QALYs. If we consider only the 160 cases nationally that are immediately fatal, the life-year losses (unadjusted for HRQL) total 12,000 life years undiscounted; 4,800 years if discounted at 3 percent, or 2,400 years if discounted at 7 percent.
TABLE A-10 Child Restraints Case Study: QALY Losses, All Cases
The table indicates that the QALY losses for each endpoint vary in the extent to which they are similar across the three indexes; the QWB results in the largest estimates of total losses, followed by the EQ-5D and then the HUI-2. Not surprisingly, the largest values are generally associated with those fatal cases and severe injuries with the largest sample weights, reflecting their comparatively high per-case values and the number of cases represented nationally.20 However, under the QWB, some of the more minor (e.g., MAIS 2) injuries also have relatively large values, reflecting the lower variability of the QWB estimates, which are magnified in cases where the sample weight is large. Several cases have very small values, usually because they reflect injuries with only short-term impacts. The cases with values of 0 represent those where the experts believed that any impairments would not be noticeable, given the attribute definitions used under the relevant index.
Discounting the long-term impacts at a 3-percent rate, rather than at 7 percent, increases the present value of the totals, as expected. It does not affect the values for the short-term effects because we did not discount the first-year values. Larger differences occur for the long-term impacts because the 3 percent rate increases the contribution of future year effects to the total present value.
Our sensitivity analysis, presented in Table A-11, indicates that comparison to perfect health (a value of 1.0) rather than average health increases the estimates of QALY losses across the different approaches, as expected. This difference is moderated, however, by the fact that we assume that HRQL is 1.0 for the young children considered in this analysis under both the average health and perfect health scenarios. Average HRQL decreases with age (see Table A-2) and hence has the largest impact on the results for those injuries that have lifelong effects. The difference between the average and perfect health results is larger under the QWB because the sensitivity analysis does not include adjustment for the use of adult values, which include decrements unrelated to the injury. In contrast, the expert assignment approach used with the EQ-5D and HUI-2 reflects injuries to children. The results for individual endpoints show similar patterns to the results reported in Table A-10 and continue to be dominated largely by those fatal and severe cases with the highest sample weights.
TABLE A-11 Child Restraints Case Study: Sensitivity Analysis for QALY Losses
Scenario | Discount Rate | Case Study Expert Assessment (median), EQ-5D | Case Study Expert Assessment (median), HUI-2 | Holbrook QWB Results (median)
Total QALY losses compared to average age-adjusted health | 3% | 8,998 | 8,305 | 11,236
Total QALY losses compared to average age-adjusted health | 7% | 4,629 | 4,263 | 5,992
Total QALY losses compared to perfect health | 3% | 9,717 | 9,040 | 19,862
Total QALY losses compared to perfect health | 7% | 4,832 | 4,469 | 9,822
Because of the data limitations discussed earlier, for the FCI we compare the 12-month values to the 12-month values for the other three indexes, rather than using it to assess lifetime effects. We focus on the five cases with injuries that were identified (by MacKenzie) as affecting functioning at the 12-month mark.21 (Under the FCI, the remaining 12 nonfatal injury cases are expected to result in full recovery by this time.) The results of the comparison are provided in Table A-12, based on the perfect health scenario for consistency with the FCI estimates.
The estimates vary in the extent to which they appear consistent across indexes, due to differences between the indexes themselves as well as in the data sources and methods used in the analyses. For example, the characterization of injuries according to the FCI resulted from a more extensive and collaborative expert process than that used by the case study team for the EQ-5D and HUI-2. The FCI and QWB estimates are based on injuries to adults, while the EQ-5D and HUI-2 attribute assignments reflect injuries to children. In addition, the QWB patient assessments do not separate the impact of these injuries from other factors affecting HRQL, and the data are reported for broad injury categories. For all indexes, the estimates used reflect values for adults rather than children.
As noted earlier, we did not compare these results to the results of NHTSA’s regulatory analysis because our estimates of the numbers and types of injuries differ significantly from the data used to support the rule and reflect a very high degree of uncertainty. We were unable to investigate the reasons for the differences between our estimates and those used in the NHTSA analysis, and hence did not attempt to adjust our data to better reflect the rule’s likely impacts.
21 The median results from the expert assessment suggest that six cases would have long-term HRQL effects: the five in Table A-12 plus case number 8.
TABLE A-12 Child Restraints Case Study: HRQL with Injury, 12 Months after Injury
Similar to the FDA case study, the Committee did not conduct a detailed assessment of the distributional and ethical implications of these findings, nor did we formally address the uncertainty in the estimates. However, we believe that one of the major sources of uncertainty in this case study relates to the use of adult health state index values for children. While the use of adult values is often necessitated by limitations in the available data, it raises difficult practical and ethical questions as discussed in more detail in the main text of this report.
We also discussed the attribute assignment process with the experts involved in this case study, who indicated that it was difficult and time consuming due to the need to assess a large number of injuries. Estimating duration was particularly challenging. The experts’ experience with this task suggests that it may be preferable to use estimates from the research literature (similar to the approach used in the other case studies), or to ask the experts to assess HRQL at prespecified intervals (e.g., at 3, 6, 12, and 18 months postinjury). In addition, the experts received relatively little descriptive information and indicated that more information on the injuries would have been helpful. They also observed that medical specialization affects how one classifies health impacts; a larger and more broadly representative panel would have been desirable.
The experts’ comments on the domains and attribute scales used in the different indexes were similar to those raised in the FDA case study, but the problems appeared to be exacerbated by the need to apply the indexes to children. The experts noted that the attribute scales do not provide enough variation within each domain to describe some injuries adequately. In addition, the scales are not always applicable to young children, who are not likely to engage in some of the activities described. Finally, the experts indicated that using these indexes to assess the long-term effects of injuries incurred in childhood is particularly difficult.
EPA NONROAD ENGINE AIR EMISSIONS REGULATION
The third case study was based on an EPA regulation establishing air emissions standards for nonroad engines as well as standards for diesel fuel. This regulation enabled the Committee to explore issues related to valuing the effects of chronic illness and preventable mortality. In addition, it provided insights into the data and methods EPA uses in its analysis of air pollution rules, which account for a sizable fraction of the regulations likely to be affected by the Committee’s recommendations. This case study also provided an example of a rule with quantified nonhealth (visibility) benefits, as well as significant health and environmental benefits that could not be quantified.
In this case study, we considered a subset of the cardiovascular and respiratory effects included in EPA’s analysis, focusing on those endpoints that account for the majority of the monetized benefits of the rule: preventable mortality, chronic bronchitis, and cardiac disease following nonfatal acute myocardial infarction (AMI). For simplicity, we omitted the less significant endpoints from our comparison of HRQL approaches. While these other endpoints involve short-lived events and exacerbations of preexisting illnesses that pose a number of conceptual and analytic challenges, evaluation of their HRQL impacts may be desirable within the framework of a regulatory CEA.
We used three approaches to estimate QALY losses in this case: (1) asking clinical experts to assign EQ-5D attributes; (2) applying EQ-5D index values based on statistical analysis of MEPS data; and (3) transferring estimates from selected studies in the CEA Registry. The approaches vary only in the valuation of nonfatal effects. Under all three approaches, we use identical values for averted mortality, comparing a “with condition” value of zero to the EQ-5D index value that would be otherwise expected at each age over the remaining lifespan.
EPA Analysis
The foundation of our case study was the analysis supporting EPA’s final nonroad diesel rule (EPA, 2004a,b). EPA’s BCA quantified the impact of reduced fine particulate matter (PM) emissions on a number of respiratory and cardiovascular health effects as well as preventable mortality. To predict cases averted and assess benefit values, EPA relied on its BenMAP model (http://www.epa.gov/ttn/ecas/benmodels.html), which it developed to support a wide range of air pollution rules. This model combines estimates from selected epidemiological studies with detailed data on population characteristics and emissions changes to provide both summary and disaggregate estimates of impacts. It also supports probabilistic analysis of uncertainty.
To value averted cases of mortality, EPA applied a range of estimates of the value of statistical life, with a mean of $5.5 million. EPA adjusted these estimates to reflect the effects of real income growth over time and the lag between exposure reduction and reduction in mortality rates. For chronic bronchitis and restricted activity days, EPA adapted dollar values from stated preference studies of individual WTP. For other nonfatal effects, EPA relied on data on the medical costs of illness and lost earnings due to the lack of suitable WTP estimates. EPA also used WTP estimates to value changes in visibility at selected recreational areas.
EPA’s primary estimates of health and other impacts are provided in Table A-13. The table reports annual impacts as of the year 2030, when virtually all engines in use are expected to meet the standards. As indicated by the table, EPA estimates that annual monetized benefits will total $80 billion or $83 billion, depending on which discount rate is used.22 The dollar value of these benefits is determined largely by the impact of averted mortality, which represents over 90 percent of the monetized effects.23
EPA’s cost analysis addressed the short- and long-term impacts of the rule on the costs of producing and operating engines of various types as well as refining and distributing fuel. EPA then used a multimarket model to assess the economic impacts of these cost changes. The results indicated that the social welfare costs of the final rule would total approximately $2.0 billion annually as of 2030.
TABLE A-13 EPA Estimates of Annual Quantified Benefits
The monetized net benefits (benefits minus costs) of the final rule will thus total approximately $78 to $81 billion annually as of 2030, depending on the discount rate used. EPA accompanied these estimates with a discussion of the distribution of the impacts as well as quantified analyses of different sources of uncertainty. Because EPA was not able to quantify a number of other health and ecological benefits associated with reductions in a variety of pollutants, EPA concluded that the monetized benefits may significantly understate the total benefits of the regulations.
Case Study Analytic Approach
This case study follows the same general approach as the other case studies. We began by developing disease descriptions, relying primarily on EPA’s regulatory impact analysis and the epidemiological studies EPA used in its risk assessment (Pope et al., 2002, for preventable mortality; Abbey et al., 1995, for chronic bronchitis; and Peters et al., 2001, for nonfatal AMI). As needed, we supplemented these data with information from a 2004 CEA of a one-microgram reduction in PM prepared by Bryan Hubbell of EPA (which EPA subsequently updated and applied in its Clean Air Interstate Rule (EPA, 2005), as discussed in Chapter 2). While the Hubbell analysis considered a different reduction in pollution levels and used population data for a different year (2000 rather than 2030) than used in the nonroad rule analysis, it reflects the same underlying risk studies and the same general modeling approach.
We used the descriptions from the EPA data sources directly in the case study approaches that relied on existing research; i.e., the MEPS-based EQ-5D and the transfer of values from the Harvard Registry. The expert assignment approach required more detailed condition descriptions. For chronic bronchitis, we described three severity categories and instructed the experts to assume that the patient is in middle age. In our calculations, we assumed the chronic bronchitis would last for the remainder of the affected individuals’ lifespan but did not consider its effects on life expectancy nor model the likely worsening of symptoms over time.24 Our assessment of life expectancy used conditional survival rates as in the NHTSA case study, similar to the approach used in Hubbell (2004) and other EPA analyses.
For AMI, the development of disease descriptions for the expert assignment process was more complicated. First, to assess the likelihood that AMI survivors would develop angina and/or congestive heart failure, we used the approach in Hubbell (2004), which assumed that 10.2 percent of survivors would experience congestive heart failure and angina, 9.8 percent would experience congestive heart failure without angina, 40.8 percent would experience angina only, and 39.2 percent would experience neither congestive heart failure nor angina. We then split the cases in each of these categories into severity classes and further subdivided them by whether the age at incidence was above or below 65 years. These steps resulted in 22 subcategories for the post-AMI progression of heart disease.
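As a quick consistency check, the four shares quoted above can be tabulated and applied to a case count. This is a sketch only: the shares are those attributed to Hubbell (2004) in the text, while the `allocate` helper and the count of 1,000 cases are hypothetical and are used purely for illustration.

```python
# Shares of nonfatal AMI survivors by subsequent condition, in percent,
# as quoted in the text from Hubbell (2004).
shares = {
    "chf_and_angina": 10.2,
    "chf_only": 9.8,
    "angina_only": 40.8,
    "neither": 39.2,
}

# The four categories should be exhaustive.
total = sum(shares.values())

# Implied overall shares with congestive heart failure and with angina.
chf_any = shares["chf_and_angina"] + shares["chf_only"]
angina_any = shares["chf_and_angina"] + shares["angina_only"]


def allocate(cases, shares):
    """Split a case count across categories in proportion to the shares."""
    return {name: cases * pct / 100.0 for name, pct in shares.items()}


# Hypothetical number of nonfatal AMI cases, for illustration only.
example = allocate(1000, shares)

print(round(total, 1), round(chf_any, 1), round(angina_any, 1))
print({name: round(n, 1) for name, n in example.items()})
```

The subsequent split by severity class and age group, which yields the 22 subcategories, is not reproduced here because the class definitions are not given in this appendix.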
For these post-AMI disease states, we assumed that cardiac disease would last for the remainder of the affected individuals’ lifespan and again did not model the likely worsening of symptoms over time. We did, however, consider the effects of cardiac disease on life expectancy. We adjusted the population average conditional survival rates using different factors for AMI cases with and without congestive heart failure. Consistent with Hubbell (2004) and EPA’s 2004 regulatory analysis, we assumed that the years lost to preventable mortality from cardiac disease were included in the separate estimates of fatal cases and (to avoid double-counting) did not assess them as part of the cardiac disease scenario. Hence the reduction in HRQL associated with the nonfatal endpoints is assessed only for the affected individuals’ remaining lifespans.
For mortality, no disease descriptions were needed because “with condition” HRQL is zero in all cases. However, assessing PM-related mortality requires addressing a number of other issues. A key question raised in EPA’s analysis is whether the affected individuals would have had the same remaining lifespan as the general population in the absence of the pollution. This issue has been the subject of some debate; however, EPA generally assumes that the distribution of underlying conditions is the same as for the overall population of the same age. (Exposure to PM most affects the risk of death among elderly individuals, age 74 on average, and there is a high prevalence of preexisting heart disease and other illnesses among the general population at this age.) EPA also adjusts for the time lag between pollution reductions and reductions in mortality among the adult population; this adjustment is not made in assessing infant mortality. In general, we followed the same base case assumptions as used in EPA’s primary benefits estimates but did not replicate the sensitivity analyses that EPA reports.
Once we had developed the descriptive information needed for the assessment of each endpoint, we implemented three approaches for estimating “with condition” HRQL. Our first approach, expert assignment, was similar to that used in the FDA and NHTSA case studies. For the EPA study, we asked two groups of experts (six respiratory disease specialists and five cardiologists) to apply the EQ-5D attributes to those endpoints related to their area of specialization.
In the second approach, we applied preliminary estimates of HRQL decrements from a recently developed catalogue of EQ-5D values. This catalogue, since published by Patrick Sullivan and colleagues (2005), uses a population survey (MEPS) to develop EQ-5D index values for a number of chronic conditions. (The catalogue is described in Chapter 3.) The researchers first calculated EQ-5D index values for those respondents reporting each condition, then used regression analysis to determine the marginal impact of the condition of interest alone, separating out the effects of any comorbidities. We used preliminary estimates of these marginal decrements in our analysis for chronic bronchitis, AMI, angina pectoris, and heart failure, based on data provided by Sullivan for each condition (reported by three-digit International Classification of Diseases, Ninth Revision (ICD-9) code), and combined the decrements as needed to reflect each of the health endpoints assessed. These preliminary estimates differ from the updated estimates provided in the published study.
The third approach involved the transfer of estimates from studies selected from the CEA Registry, based on a review conducted by Brauer and Neumann (2005). Brauer and Neumann identified 127 respiratory and cardiovascular health states with index values in the database, focusing on studies published after 1994. They identified those estimates most suitable for application to this case study based on the similarity of the health state and appropriateness of the methodology, as discussed in Chapter 3.
Based on our review of these studies, we identified two that appeared to provide the most suitable estimates for this case study. For chronic bronchitis, we used a Canadian study that compared alternative medications and applied the HUI-3 to develop one-year average HRQL values covering both chronic and acute phases of the condition (Torrance et al., 1999). For post-AMI health states, we used a Dutch study (Oostenbrink et al., 2001) that employed the EQ-5D to estimate HRQL in patients after infrainguinal bypass surgery. This study included HRQL estimates for a subset of patients who suffered an AMI during the follow-up period.25
The two studies selected from the CEA Registry use different indexes, raising questions about the appropriate index to use for preventable mortality. It was difficult to find a justification for using either the EQ-5D (consistent with the AMI study) or the HUI-3 (consistent with the chronic bronchitis study), or for averaging the results under each index to establish “without condition” HRQL. For simplicity and comparability, we applied the same EQ-5D estimates to assess mortality for the CEA Registry analysis as in the other two HRQL approaches used in this case study.
Estimates of QALY Gains
The three approaches applied in this case study addressed different respiratory and cardiovascular endpoints broken out in different ways. Our expert assignment approach used 25 subcategories characterized by severity and symptoms; our application of the MEPS-based catalogue used four ICD codes (one for chronic bronchitis and three for cardiac disease) in various combinations; and our benefits transfer from the CEA Registry studies used one estimate for chronic bronchitis and one estimate for all post-AMI conditions.
In the expert assignment, we found that the results did not always vary across the severity categories. The EQ-5D allows a choice of three attribute levels within each domain. In some cases, individual experts assigned the same attribute levels to cases of differing severities. The assignments also indicated that the experts disagreed about whether certain conditions would impose no, moderate, or severe problems in a particular domain. Where the estimates varied across endpoints, they generally followed the expected pattern, showing increasing problems for cases with increasing severity. Mild cases resulted in median HRQL values close to 1.0, indicating a negligible effect on the quality of life. In contrast, the most severe form of congestive heart failure led to HRQL values close to zero, with median estimates of 0.05 or less. In general, the median values were identical for the two age groups specified in the AMI scenarios (those above and below 65 years).
The QALY estimates varied across the three approaches. In Table A-14, we provide the results for the average age at incidence under each approach, in comparison to both average and perfect health. (The adjustments made in these comparisons are described in the “General Approach” section, above.) While these adjustments seem sensible within the context of each approach, they lead to inconsistencies in the relationships across the results.
As illustrated by the table, for the expert assessment, the “with condition” values (and the decrement from normal health) are consistently lower under the average health scenario than under the perfect health scenario; we applied the same percentage reduction to a lower value (average “without condition” HRQL is less than perfect HRQL). For the MEPS-based EQ-5D catalogue, the “with condition” values are the same under both scenarios, but the decrement is larger under the perfect health scenario and increases with age (because we add the difference between average and optimal health, which grows with age). For the values taken from the CEA Registry studies, which scenario results in larger estimates depends on age, because we anchored the percentage reduction from average population health at the average age of the underlying study samples. The average age in the chronic bronchitis study is 55 years, slightly higher than the average age at incidence used in our analysis (Torrance et al., 1999). For the AMI study, the average age of the study sample is 69 years (Oostenbrink et al., 2001).
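The first two adjustments can be checked against the chronic bronchitis row of Table A-14. The sketch below is one reading of table notes b and c, not the case study team's actual calculation; the function names are hypothetical, and the inputs (0.88, 0.39, and 0.81) are the rounded values reported in the table.

```python
def expert_with_condition(fraction_of_perfect, without_condition):
    """Expert assessment (table note b): 'with condition' HRQL is the same
    fraction of 'without condition' health in both scenarios."""
    return fraction_of_perfect * without_condition


def meps_decrement(decrement_from_average, average_health, versus_perfect):
    """MEPS catalogue (table note c): a fixed numerical decrement from
    average health; for the perfect health comparison, the gap between
    perfect health (1.0) and average health is added to it."""
    gap = (1.0 - average_health) if versus_perfect else 0.0
    return decrement_from_average + gap


# Chronic bronchitis at age 49: average population health is 0.88.
average_health = 0.88

# Lower end of the expert range is 0.39 when compared to perfect health.
expert_vs_average = expert_with_condition(0.39, average_health)

# The MEPS catalogue 'with condition' value is 0.81 in both scenarios.
base_decrement = average_health - 0.81
meps_vs_average = meps_decrement(base_decrement, average_health, False)
meps_vs_perfect = meps_decrement(base_decrement, average_health, True)

print(round(expert_vs_average, 2))
print(round(meps_vs_average, 2), round(meps_vs_perfect, 2))
```

Rounded to two decimals, the expert value reproduces the 0.34 lower bound shown in the table for the average health comparison, and the MEPS decrement grows from 0.07 to 0.19 while the “with condition” value stays at 0.81.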
TABLE A-14 Nonroad Diesel Emissions Case Study: HRQL with Illness, at Average Age of Incidence
Endpoint | Average Age at Incidence | Base Case, Compared to Average Health: Without Condition (a) | Base Case, With Condition: EQ-5D Expert Assessment (b) | Base Case, With Condition: EQ-5D MEPS Catalogue (c) | Base Case, With Condition: Transfer from Selected Studies (d) | Sensitivity Analysis, Compared to Perfect Health: Without Condition (a) | Sensitivity Analysis, With Condition: EQ-5D Expert Assessment (b) | Sensitivity Analysis, With Condition: EQ-5D MEPS Catalogue (c) | Sensitivity Analysis, With Condition: Transfer from Selected Studies (d)
Nonfatal chronic bronchitis | 49 | 0.88 | 0.34–0.88 | 0.81 | 0.82 | 1.00 | 0.39–1.00 | 0.81 | 0.78
Nonfatal acute myocardial infarction | 53 | 0.85 | 0.03–0.85 | 0.70–0.81 | 0.60 | 1.00 | 0.03–1.00 | 0.70–0.81 | 0.58
Nonfatal acute myocardial infarction | 78 | 0.78 | 0.02–0.78 | 0.63–0.74 | 0.55 | 1.00 | 0.03–1.00 | 0.63–0.74 | 0.58
Preventable mortality, adults | 74 | 0.78 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00
Preventable mortality, infants | 0 | 1.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00
NOTES: Ranges reflect the results for the different health state subcategories assessed for each endpoint.
(a) Without condition values for average health are based on the EQ-5D, except for the values for chronic bronchitis under the Harvard Registry approach, which are based on the HUI-3. At age 49, the value for average population health is 0.88 under both indices.
(b) For the expert assignment, “with condition” health is assumed to be the same fraction of average health as of perfect health for all years of age affected.
(c) For the EQ-5D MEPS catalogue, numerical decrements from average health are assumed to be constant across all years of age, and the difference between “without condition” average and perfect health is added to this decrement for the perfect health comparison.
(d) For the transfer from the CEA Registry studies, “with condition” health is assumed to be a constant fraction of “without condition” health; this fraction is calculated based on the average age of the samples used in each study.
SOURCES: Case study team analysis of data from the following sources. Expert assignment: Data provided February to April, 2005. MEPS data: preliminary results provided by Patrick Sullivan, April 4, 2005. CEA Registry: Torrance et al. (1999) and Oostenbrink et al. (2001).
We multiplied the estimates of decrements from “without condition” health by duration (taking life expectancy into account) to determine the QALY losses associated with each nonfatal endpoint as well as with preventable mortality. We report the results of these calculations in Table A-15. The results reflect the losses for all cases, assuming that the health status of affected individuals would be the same as the population average for individuals of the same age in the absence of the pollution-related health effects. These estimates represent the lifetime losses for all cases averted by the annual reduction in pollution levels as of the year 2030, with discounting used to reflect the future-year impacts of the new cases (i.e., their lifetime effects). Undiscounted, the results range from 160,000 to 170,000 QALYs. Without adjustment for HRQL, the life-year losses associated with the cases of preventable mortality (including fatalities for 12,000 adults and 22 infants) total 130,000 life years undiscounted, 93,000 life years if discounted at 3 percent, and 64,000 life years if discounted at 7 percent.
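The multiplication of decrements by duration can be sketched as a survival-weighted, discounted sum over the years following onset. This is an illustrative simplification rather than the case study team's actual calculation: the function name, the annual time step, and every numeric input below are assumptions.

```python
def lifetime_qaly_loss(decrements, survival, rate=0.0):
    """Expected QALY loss for one case.

    decrements[t] -- HRQL decrement from 'without condition' health in year t
    survival[t]   -- probability of being alive in year t (conditional survival)
    rate          -- annual discount rate; year 0 is left undiscounted
    """
    if len(decrements) != len(survival):
        raise ValueError("decrements and survival must have the same length")
    return sum(
        d * s / (1.0 + rate) ** t
        for t, (d, s) in enumerate(zip(decrements, survival))
    )


# Hypothetical nonfatal case: a 0.07 decrement in each of three years,
# weighted by assumed survival probabilities and left undiscounted.
nonfatal = lifetime_qaly_loss([0.07, 0.07, 0.07], [1.0, 0.9, 0.8])

# Hypothetical fatal case: 'with condition' HRQL is zero, so each year's
# decrement equals the 'without condition' value otherwise expected.
fatal = lifetime_qaly_loss([0.78, 0.77, 0.76], [1.0, 0.9, 0.8], rate=0.03)

print(round(nonfatal, 3), round(fatal, 3))
```

Summing such per-case losses over all cases averted would give totals of the kind reported in Table A-15, given age-specific survival rates and HRQL values for each endpoint.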
As shown in Table A-15, the three approaches to estimating HRQL impacts yield differing results. Because the estimates for mortality are identical under all three approaches, these differences are driven by the approaches used to value the nonfatal endpoints. The expert assignment yields values for chronic bronchitis that are more than twice as large as the estimates from the EQ-5D MEPS catalogue or CEA Registry studies. For the AMI endpoints, the CEA Registry study leads to estimates of QALY losses that are greater than the results under the expert assessment or the EQ-5D catalogue, possibly because that study addressed more severe cases than the average post-AMI population. The estimates of the number of cases avoided, age at incidence, and life expectancy are constant across all three approaches; hence these results reflect the differing estimates of the HRQL decrement associated with each condition.
TABLE A-15 Nonroad Diesel Emissions Case Study: QALY Losses, All Cases
Table A-16 provides the estimates of QALY losses that result when the “with condition” HRQL is compared to perfect health rather than to average age-adjusted HRQL. As noted earlier and illustrated in Table A-14, the approach that produces the largest estimates of “with condition” HRQL varies due to the differing adjustments used in these comparisons. As expected, the results are larger in the perfect health comparison because perfect health is represented by a constant value of 1.0 across all years of age, while average health declines with age.
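The size of the difference between the two comparisons can be read directly from Table A-16. The short sketch below computes the ratio of the perfect health total to the average health total for each approach; the dictionary layout and variable names are illustrative choices, while the totals are those reported in the table.

```python
# Total QALY losses from Table A-16 as (average health, perfect health).
table_a16 = {
    "3%": {
        "EQ-5D expert assignment": (119_356, 154_447),
        "EQ-5D MEPS catalogue": (108_837, 186_785),
        "Transfer from selected studies": (114_126, 173_160),
    },
    "7%": {
        "EQ-5D expert assignment": (81_395, 104_666),
        "EQ-5D MEPS catalogue": (74_349, 125_292),
        "Transfer from selected studies": (78_086, 116_638),
    },
}

# Ratio of the perfect health total to the average health total.
ratios = {
    rate: {name: perfect / average for name, (average, perfect) in rows.items()}
    for rate, rows in table_a16.items()
}

for rate, rows in ratios.items():
    for name, ratio in rows.items():
        print(f"{rate}  {name}: {ratio:.2f}")
```

Every ratio exceeds 1, and the ordering of the approaches changes: the expert assignment gives the largest total against average health, whereas the MEPS catalogue gives the largest total against perfect health.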
Cost-Effectiveness Ratios
Our final step involved reporting the four cost-effectiveness ratios discussed in Chapter 5 of this report, based on the data available for this case study. In these calculations, we use EPA’s estimates of annualized regulatory costs, which are reported as $2.0 billion per year regardless of whether a 3 or 7 percent discount rate is used. For health care treatment costs, we use the per-case medical cost estimates for treatment of chronic bronchitis and nonfatal AMIs provided in Hubbell (2004), which yield a total that rounds to $1.1 billion regardless of which discount rate is applied. This estimate will understate total health care cost savings because it excludes other types of costs (such as health care-related time losses) associated with treatment of the conditions.
TABLE A-16 Nonroad Diesel Emissions Case Study: Sensitivity Analysis for QALY Losses
Scenario | Discount Rate | EQ-5D Expert Assignment | EQ-5D MEPS Catalogue | Transfer from Selected Studies
Total QALY losses compared to average age-adjusted health | 3% | 119,356 | 108,837 | 114,126
Total QALY losses compared to average age-adjusted health | 7% | 81,395 | 74,349 | 78,086
Total QALY losses compared to perfect health | 3% | 154,447 | 186,785 | 173,160
Total QALY losses compared to perfect health | 7% | 104,666 | 125,292 | 116,638
In the comprehensive ratio, we net out the value of the benefits not addressed in the effectiveness measure; i.e., the short-lived health impacts and the environmental effects. According to EPA’s analysis, the total value of these additional benefits is about $2.3 billion annually as of the year 2030. In other words, the combined value of these other benefits exceeds the costs of the regulations. Thus netting these benefit values out of the regulatory costs led to negative costs, or savings.
In Table A-17, we report the results for each of the ratios recommended by the Committee. The costs per QALY are less than the costs per life year saved in part because the estimate of costs in the former ratio is lower due to the netting out of medical cost savings. The ratios are within the same order of magnitude across the different approaches used to assess HRQL, and in some cases appear indistinguishable. For the comprehensive ratio, we do not report the results of the calculations because the netting out of other benefits leads to cost savings. All of the cost per QALY estimates would be lower if we used the results of our sensitivity analysis, since the comparison to perfect health yields larger estimates of QALY losses.
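The structure of these ratios can be sketched with the rounded figures quoted in this appendix. The sketch is a simplification under stated assumptions: it assumes that the cost per life year uses gross regulatory costs, that the cost per QALY nets out medical cost savings, and that the comprehensive ratio also nets out other benefits. The exact definitions are those in Chapter 5, and the values computed here are not the entries of Table A-17. The `ratio` helper is hypothetical.

```python
BILLION = 1e9

# Rounded annual figures as of 2030, as quoted in this appendix.
regulatory_cost = 2.0 * BILLION       # EPA annualized regulatory cost
medical_cost_savings = 1.1 * BILLION  # chronic bronchitis and nonfatal AMI treatment
other_benefits = 2.3 * BILLION        # short-lived health and environmental effects

# Effectiveness measures at a 3 percent discount rate.
life_years = 93_000   # life years from preventable mortality
qalys = 119_356       # Table A-16, expert assignment, average health


def ratio(net_cost, effect):
    """Cost per unit of effect, or None when net cost is negative (a saving)."""
    return None if net_cost < 0 else net_cost / effect


cost_per_life_year = ratio(regulatory_cost, life_years)
cost_per_qaly = ratio(regulatory_cost - medical_cost_savings, qalys)
comprehensive = ratio(
    regulatory_cost - medical_cost_savings - other_benefits, qalys
)

print(cost_per_life_year, cost_per_qaly, comprehensive)
```

With these inputs the net cost in the comprehensive ratio is negative, so no ratio is returned, mirroring the decision not to report that ratio.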
Again, this case study does not fully reflect certain of the Committee’s recommendations. While we did not fully assess the distributional or ethical implications of this regulation, Chapter 4 provides an example of a summary of these impacts, and EPA’s analysis provides more detailed information on related topics. In addition, our analysis relies on mean or median values and provides only limited assessment of uncertainty. More extensive uncertainty analysis is required by both the Committee’s recommendations and the existing government-wide guidance. EPA’s BCA provides substantial discussion of this issue, including various assessments of the degree of uncertainty in both the cost and benefit estimates.
In this case study, the experts involved in determining the EQ-5D attributes raised several issues similar to those raised by the experts involved in the FDA and NHTSA studies. These concerns related to the relationship between the disease descriptions and the attribute descriptions, the differences between expert and patient judgments about disease impacts, and the difficulties inherent in considering an “average” or “typical” case rather than an individual patient. As noted earlier, there are a number of steps that analysts can take to develop a more thorough assessment process; e.g., pretesting the approach, working with the experts to ensure that they have a common understanding of the health conditions, index attributes, and the task itself, and following the initial assignment with a process for resolving (or better understanding) any inconsistencies in the results. Relying on patient, rather than expert, assignments was not possible given the time and resources available for this case study, but could significantly alter the findings.
For the other two approaches used in this case study, related uncertainties are discussed in the background documents. The MEPS-based EQ-5D analysis (Sullivan et al., 2005) includes a variety of data that could be used in more formal, quantitative analysis of uncertainty. In applying estimates from the CEA Registry studies, we rely on a single study for each endpoint. However, other studies report varying results for similarly defined health conditions (see Brauer and Neumann, 2005). A more comprehensive approach would consider the full range of values reported, similar, for example, to the approach used in Hubbell (2004).
TABLE A-17 Nonroad Diesel Emissions Case Study: Cost-Effectiveness Ratios
CONCLUSION
These case studies demonstrated that it is possible to apply a number of approaches to assess the cost-effectiveness of economically significant health and safety regulations. While the Committee was not able to conduct new primary research on the HRQL impacts of the health effects considered, we were able to examine the consequences of applying expert judgment processes and information from different types of existing studies. Although more sophisticated application of these approaches is desirable in the context of actual regulatory analyses, all appear feasible and provide information of interest for decision making.
The case studies also aided us in identifying areas where more research would be useful. For example, the experts involved in the assignment process noted that the generic indexes did not always provide attribute descriptions that were applicable to the health conditions being characterized, and better tailored approaches might be desirable. This was particularly true when the indexes were applied to children. In addition, our review of existing studies in the CEA Registry indicated gaps and inconsistencies in the HRQL values currently available for application to regulatory analysis. Meta-analysis or other approaches that combine results of different studies, as well as additional analysis of uncertainties, also could be helpful. In addition, further development of criteria and best practices for transferring estimates from existing studies would be desirable. We also found that the MEPS catalogue used in the EPA case study was quite useful for this sort of analysis; it provides U.S. population health state index values for a variety of conditions encountered in many regulatory analyses.
The case studies suggested that the types of health risk information available to regulatory analysts pose challenges not necessarily present in clinical outcomes studies or medical technology assessments. In particular, regulatory agencies generally work with risk estimates that reflect small changes in the probability of injury, illness, or death spread throughout a large population. This focus on expected or statistical cases often may require assessing HRQL and longevity impacts for an average or typical case (or range of cases) of each condition averted by a rule. While some of the health risk information needed to implement a QALY-based CEA is not needed for a BCA, many agencies have developed this additional data in the context of implementing their own approaches to CEA. We faced the most significant data constraints in the NHTSA case study because of the broad injury categories used by that agency. More detailed data on the injuries averted by a particular rule would allow more accurate assessment of HRQL impacts.
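The idea of expected or statistical cases described above can be illustrated with a minimal sketch. It is not drawn from the case studies: the function names and every number below are hypothetical, chosen only to show how a small change in individual risk, spread over a large population, translates into a count of cases averted and an expected QALY gain for a typical case.

```python
def expected_cases_averted(population: int, risk_reduction: float) -> float:
    """Expected (statistical) cases averted.

    population     -- number of people exposed (hypothetical input)
    risk_reduction -- change in each person's probability of the condition
    """
    return population * risk_reduction


def expected_qaly_gain(cases_averted: float, qaly_loss_per_case: float) -> float:
    """QALYs gained, assuming every averted case is an 'average' case."""
    return cases_averted * qaly_loss_per_case


# Hypothetical inputs, NOT values from any agency analysis.
cases = expected_cases_averted(population=10_000_000, risk_reduction=2e-6)
gain = expected_qaly_gain(cases, qaly_loss_per_case=0.05)
print(f"statistical cases averted: {cases:.1f}")
print(f"expected QALYs gained:     {gain:.2f}")
```

Because the calculation rests on a single typical case, broad condition categories (such as the injury categories noted for the NHTSA case study) feed directly into the per-case QALY loss and therefore into the total.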
The cost-of-illness estimates currently used by the agencies are not entirely compatible with the definition of health treatment costs developed for the reference case by the U.S. Panel on Cost Effectiveness in Health and Medicine (Gold et al., 1996b) and discussed in the Committee’s recommendations. In many cases, these estimates include only direct medical costs. When lost productivity estimates were available, they addressed the long-term impacts of the health condition, not solely the impacts of medical treatment. As noted in the main text of this report, such estimates of lost productivity are likely to double-count impacts included in the effectiveness measure, and hence are not suitable for this type of analysis. Development of standard estimating practices for the health care treatment costs to be used in CEA would be useful.
The case studies also provide examples of the implications of a number of the Committee’s recommendations. For instance, the FDA and EPA rules differ significantly in terms of the importance of preventable mortality to the results. For the EPA rule, which averts a relatively large number of deaths, the cost per life year and cost per QALY gained are much more similar than in the case of the FDA rule, which prevents very few deaths. The EPA rule also illustrates the potential for significant changes in the cost-effectiveness measure when other benefits are considered in a comprehensive ratio. Furthermore, the analyses show the importance of comparing “with condition” values to measures of expected actual “without condition” health; comparisons to perfect health lead to estimates of QALY losses that are misleadingly large in some cases.
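The effect of the comparison baseline can be seen in a small sketch. The HRQL index values and duration below are hypothetical and are not taken from the case studies; the sketch only shows why measuring losses against perfect health (an index of 1.0) gives a larger figure than measuring them against expected "without condition" health.

```python
def qaly_loss(hrql_with: float, hrql_without: float, years: float) -> float:
    """QALY loss for one case: HRQL decrement times duration in years."""
    return (hrql_without - hrql_with) * years


# Hypothetical index values on a 0-1 scale, for illustration only.
with_condition = 0.70      # HRQL while living with the condition
without_condition = 0.85   # expected actual HRQL for similar people without it
duration_years = 2.0

loss_vs_perfect = qaly_loss(with_condition, 1.0, duration_years)
loss_vs_expected = qaly_loss(with_condition, without_condition, duration_years)

print(f"loss vs. perfect health:  {loss_vs_perfect:.2f} QALYs")
print(f"loss vs. expected health: {loss_vs_expected:.2f} QALYs")
```

With these assumed inputs the perfect-health comparison attributes to the condition a decrement that the person would have experienced anyway, which is the overstatement the paragraph describes.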
Finally, we were not able to assess whether alternative HRQL approaches would change regulatory decisions. The final rules used in these case studies lacked information on the impacts of the wide range of regulatory options required by the OMB guidance, so we could not compare the results of different HRQL approaches across regulatory options. However, the cost-per-QALY estimates appear relatively similar across the different HRQL approaches used in the case studies. For example, using a 3 percent discount rate, the range for the health-benefits-only ratio was $13,000 to $18,000 per QALY in the FDA case study, and $7,500 to $8,300 in the EPA case study.
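For readers unfamiliar with how a discounted cost-per-QALY ratio of this kind is formed, the following is a minimal sketch under stated assumptions. It is not the method used in the case studies: the function names and the annual cost and QALY streams are hypothetical, and the only element taken from the text is the 3 percent discount rate.

```python
from typing import Sequence


def present_value(stream: Sequence[float], rate: float) -> float:
    """Discount an annual stream to year 0 (the first element is undiscounted)."""
    return sum(value / (1.0 + rate) ** year for year, value in enumerate(stream))


def cost_per_qaly(costs: Sequence[float], qalys: Sequence[float],
                  rate: float = 0.03) -> float:
    """Ratio of discounted net costs to discounted QALYs gained."""
    discounted_qalys = present_value(qalys, rate)
    if discounted_qalys <= 0:
        raise ValueError("discounted QALY gain must be positive")
    return present_value(costs, rate) / discounted_qalys


# Hypothetical five-year streams, for illustration only.
annual_costs = [2_000_000.0] * 5   # dollars per year
annual_qalys = [150.0] * 5         # QALYs gained per year

ratio = cost_per_qaly(annual_costs, annual_qalys, rate=0.03)
print(f"cost per QALY: ${ratio:,.0f}")
```

When costs and QALYs follow the same time profile, as in this example, the discount rate cancels out of the ratio; it matters when costs are incurred up front and health gains accrue later.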
ACKNOWLEDGMENTS
This appendix represents the collaborative efforts of the IOM Committee members, advisers, consultants, and staff with federal agency staff and consultants. The case studies could not have been completed without the exceptional efforts of a great many people. The goal of these studies was to enhance the Committee’s understanding of current practices and of the issues that arise in applying different measures of benefits in a regulatory context, and they were an important source of information and insights that contributed significantly to our deliberations.
In a very real sense, everyone who contributed to these case studies was a volunteer. The scope of effort to produce these analyses exceeded the time and money originally budgeted for the task, and the case study teams worked beyond all original expectations. The Committee is indebted to all those who have contributed their time and expertise to gather information, explain agency policies and practices, and complete a daunting array of analytic tasks. The Committee thanks the following individuals for their advice, generosity, and hard work.
Juice Processing Regulation Case Study
Lead authors: Lisa A. Robinson, Independent Consultant; Wilhelmine Miller, Institute of Medicine; Robert Black, Independent Consultant.
IOM Committee advisers: Alan Garber (lead); Judith Wagner.
Other advisers: Clark Nardinelli, Food and Drug Administration; Sajal Chattopadhyay, Centers for Disease Control and Prevention.
Contributors: John Anderson, University of California, San Diego; Barbara Altman, National Center for Health Statistics; Fred Angulo, Centers for Disease Control and Prevention; Lawrence Deyton, M.D., Veteran’s Administration; Sherine Gabriel, M.D., Mayo Clinic; Janel Hanmer, University of Wisconsin-Madison; William Lawrence, M.D., Agency for Healthcare Research and Quality; Gwen Wanger, M.D., Beth Israel Deaconess Medical Center.
Expert application of generic indexes: Infectious disease—Claire Panosian, M.D., David Geffen School of Medicine, University of California, Los Angeles (UCLA); David A. Pegues, M.D., David Geffen School of Medicine, UCLA; Matthew Leibowitz, M.D., David Geffen School of Medicine, UCLA; Glenn Mathisen, M.D., Olive View-UCLA Medical Center; Sherwood L. Gorbach, M.D., Tufts New England Medical Center; David R. Snydman, M.D., Tufts New England Medical Center; Mark Holodniy, M.D., Veteran’s Administration Palo Alto Health Care System; Victoria Davey, R.N., M.P.H., U.S. Department of Veterans Affairs. Rheumatology—Lenore Buckley, M.D., Virginia Commonwealth University School of Medicine; Gene G. Hunder, M.D., Mayo Clinic (retired); Eric L. Matteson, M.D., Mayo Clinic College of Medicine; Daniel H. Solomon, M.D., Harvard Medical School; Elizabeth A. Tindall, M.D., Oregon Health and Science University.
Child Restraints Regulation Case Study
Lead authors: Lisa A. Robinson, Independent Consultant; Phaedra Corso, Centers for Disease Control and Prevention; Xiangming Fang, Centers for Disease Control and Prevention; Robert Black, Independent Consultant; Wilhelmine Miller, Institute of Medicine.
IOM Committee advisers: Emmett Keeler (lead); Henry Anderson; Lisa Iezzoni; Alan Krupnick.
Other advisers: Larry Blincoe, National Highway Traffic Safety Administration; Jim Simons, National Highway Traffic Safety Administration; Carmen Brauer, M.D., Harvard School of Public Health.
Contributors: John Anderson, University of California, San Diego; Barbara Altman, National Center for Health Statistics; Nancy Bondy, National Highway Traffic Safety Administration; David Feeny, Kaiser Permanente; Janel Hanmer, University of Wisconsin-Madison; Troy Holbrook, University of California, San Diego; Robert Kaplan, University of California, Los Angeles; William Lawrence, M.D., Agency for Healthcare Research and Quality; Ellen MacKenzie, Ph.D., Johns Hopkins University; Bryce Mason, Rand Corporation; Ted Miller, Pacific Institutes for Research and Evaluation; Ryan Palugod, Institute of Medicine; William Rhoads, Centers for Disease Control and Prevention; Jon Walker, National Highway Traffic Safety Administration.
Expert application of generic indexes: Carmen Brauer, M.D., Harvard School of Public Health; Kristine Campbell, M.D., Children’s Hospital of Pittsburgh; Tim Davis, M.D., Centers for Disease Control and Prevention; Arlene Greenspan, Ph.D., Centers for Disease Control and Prevention; David Mooney, M.D., Children’s Hospital, Boston.
Nonroad Engine Air Emissions Regulation Case Study
Lead authors: Lisa A. Robinson, Independent Consultant; Wilhelmine Miller, Institute of Medicine; Robert Black, Independent Consultant.
IOM Committee advisers: Maureen Cropper (lead); Richard Burnett; James Hammitt; Alan Krupnick.
Other advisers: Carmen Brauer, M.D., Harvard School of Public Health; Bryan Hubbell, U.S. Environmental Protection Agency; Tursynbek Nurmagambetov, Centers for Disease Control and Prevention; Seymour Williams, Centers for Disease Control and Prevention.
Contributors: Adam Atherly, Centers for Disease Control and Prevention; Sarah Brennan, Industrial Economics, Incorporated; Jim DeMocker, U.S. Environmental Protection Agency; Chris Dockins, U.S. Environmental Protection Agency; Janel Hanmer, University of Wisconsin-Madison; Fernando Holguin, Centers for Disease Control and Prevention; William Lawrence, M.D., Agency for Healthcare Research and Quality; Darwin LaBarthe, Centers for Disease Control and Prevention; Jim Neumann, Industrial Economics, Incorporated; Peter Neumann, Harvard School of Public Health; Nathalie Simon, U.S. Environmental Protection Agency; Patrick Sullivan, University of Colorado.
Expert application of generic indexes: Respiratory disease—David M. Mannino, M.D., University of Kentucky School of Medicine; Peter Barkin, M.D., Emerson Hospital; R. Graham Barr, M.D., Presbyterian Hospital, Columbia University; Scott D. Ramsey, M.D., Fred Hutchinson Cancer Research Center; Mark J. Utell, M.D., University of Rochester; Roger Yusen, M.D., Washington University School of Medicine. Cardiovascular disease—Harlan M. Krumholz, M.D., Yale Medical School; Russell V. Luepker, M.D., Mayo Clinic; John Rumsfeld, M.D., University of Colorado; Douglas D. Schocken, M.D., University of South Florida; John Spertus, M.D., University of Missouri-Kansas City.