Suggested Citation:"5 Developing Trusted Clinical Practice Guidelines." Institute of Medicine. 2008. Knowing What Works in Health Care: A Roadmap for the Nation. Washington, DC: The National Academies Press. doi: 10.17226/12038.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

5 Developing Trusted Clinical Practice Guidelines

Abstract: This chapter reviews the current landscape of clinical practice guideline development in the United States, presents the committee recommendations for creating trusted clinical practice guidelines, and describes key challenges in promoting the development and adoption of high-quality guidelines under the aegis of a proposed national clinical effectiveness assessment program ("the Program"). Under the status quo, the processes underlying guideline development are often vulnerable to bias and conflict of interest. Overall, the quality of clinical practice guidelines is often poor. The committee recommends that the Program establish standards for guideline development but also promote voluntary adoption of Program standards by guideline developers. The standards must address the composition of guidelines panels to ensure that guidelines are created by a diversity and balance of competing interests with minimal bias. The standards should also promote objectivity, transparency, and efficiency in guideline development and clear, standardized reporting of clinical recommendations. Groups developing clinical practice guidelines should document their adherence to Program standards and make this documentation publicly available. Individuals and organizations that utilize guideline information would then be in a better position to assess guideline quality and utilize only those guidelines that meet the Program's standards.

The development of clinical practice guidelines for use by practitioners, payers, patients, and others is a key strategy in promoting the use of highly effective clinical services. When they are used, rigorously developed guidelines have the potential to reduce undesirable practice variation, reduce the use of services that are of minimal or questionable value, increase the utilization of services that are effective but underused, and target services to those populations most likely to benefit (Grimshaw and Russell, 1993).

Underlying the effort to produce evidence-based guidelines is a pressing need for trusted information on clinical effectiveness. As described earlier, in recent years there has been a substantial increase in the number of treatment alternatives available to providers and patients, as well as in the volume of studies describing the effectiveness (or ineffectiveness) of those options. This body of evidence has become complex and difficult to manage for most providers. As a result, guidelines have become a key tool for summarizing the available literature and placing it in a format accessible to physicians (Druss and Marcus, 2005).

This chapter has three principal objectives: (1) to review the current landscape of clinical practice guideline development in the United States, (2) to present the committee's recommendations for creating trusted clinical practice guidelines, and (3) to highlight key challenges in promoting the development and adoption of high-quality guidelines under the aegis of a proposed national clinical effectiveness assessment program ("the Program").

BACKGROUND

Clinical practice guidelines attempt to define practices that meet the needs of most patients under most circumstances. They do not attempt to supplant the independent judgment of clinicians in responding to particular clinical situations. Ideally, the specific clinical recommendations that are contained within practice guidelines have been systematically developed by panels of experts who have access to the available evidence, an understanding of the clinical problem and the relevant research methods, and sufficient time to absorb the information and make considered judgments (GRADE Working Group, 2004).
These panels are expected to be objective and to produce recommendations that are unbiased, up-to-date, and free from conflict of interest.

Groups that measure provider performance frequently use adherence to clinical practice guidelines as a basis upon which to evaluate the quality of care, and many payers are now moving toward the use of pay-for-performance strategies that establish differential payments on the basis of adherence to quality measures. In addition to performance-based payment, with the increased use of health information technology and direct decision support at the point of care, guidelines are likely to become increasingly influential in clinical practice (O'Malley et al., 2007).

Perhaps the earliest guidelines produced in the United States were the American Academy of Pediatrics' Redbook of Infectious Diseases, published in the 1930s (American Academy of Pediatrics, 2007). The groups that were among the first to use systematic reviews to support clinical recommendations were the Canadian Task Force on the Periodic Health Examination and the U.S. Preventive Services Task Force (USPSTF) (Fielding and Briss, 2006). The Canadian task force was established in 1976 to make recommendations about the inclusion of preventive services in the periodic health examination; the USPSTF was established in 1984 and also provided prevention-related recommendations for health professionals (Woolf and Atkins, 2001). The American College of Physicians began to publish explicit recommendations based on systematic reviews in 1981 (Eddy, 2005).

In 1989, Congress established the Agency for Health Care Policy and Research (AHCPR) and tasked it with developing clinical practice guidelines, among its other responsibilities. The Institute of Medicine (IOM) noted that this effort was part of a cultural shift: a move away from an unexamined reliance on professional judgment and toward more structured support and accountability for these judgments (1990). Before the move toward evidence-based practice, medical textbooks and articles were filled with thousands of statements and care recommendations that were based solely on the belief of the author or at best a consensus of experts (Eddy, 2005). Evidence-based guidelines initiatives aim to base recommendations on empirical evidence.

Relationship to Systematic Reviews

Clinical guidelines go beyond systematic reviews by recommending what should and should not be done in specific clinical circumstances. Although systematic reviews produce findings about clinical effectiveness, transforming that evidence into specific care recommendations is often challenging.
Given the gaps in information that frequently exist and the variable quality of the information that is available, a key component of guideline development is the establishment of a link between the strength of the clinical recommendation and the quality of the underlying evidence.

Guyatt and colleagues (2006a) argue that one of the first criteria of an effective guideline development process is having two separate grading systems: one for the quality of the evidence and another for the recommendations themselves. The quality of evidence grade reflects the level of confidence that, if the recommendation is followed, the anticipated outcomes will occur. The strength of the recommendation takes into account the balance of the benefits and the harms that are associated with the intervention and the guideline authors' views about the importance of adhering to the recommendation.

Resource Requirements

Guideline production requires a significant commitment by professional societies and others who perform the work, especially if they conduct high-quality systematic reviews themselves. Locating and analyzing all of the available evidence requires substantial skills, resources, and time, and professional groups often lack what is needed to do a credible job (Woolf et al., 1999). The resource demands of conducting a rigorous systematic review often lead guideline developers to revert to short-cuts or processes centered on expert opinion (Browman, 2001). Moreover, a substantial investment in evidence gathering does not guarantee a good return on evidence available to address a question (Ricci et al., 2006). In fact, guideline developers often must reckon with research that is not sufficiently rigorous, yields conflicting results, or does not exist (Cook and Giacomini, 1999). This also contributes to pressures to rely more heavily on professional opinion.

Guideline Developers

As described in Chapter 2, many groups produce clinical practice guidelines and recommendations. The National Guideline Clearinghouse (NGC) currently includes guidelines from approximately 360 organizations (NGC, 2007c). Medical professional societies are the most common sponsors of guidelines. In addition, patient advocacy groups, payers, government agencies, and others in the United States may conduct systematic reviews and develop clinical recommendations. Organizations in other countries also produce guidelines that are available in the United States, including the National Institute for Health and Clinical Excellence (NICE), the Scottish Intercollegiate Guidelines Network (SIGN), and organizations in Australia and Canada.

In the United States, the NGC provides free access to guidelines produced across a range of clinical areas.
The NGC included approximately 650 guidelines in 1999 (O'Connor, 2005) and has grown to nearly 2,200 guidelines today (NGC, 2007b). The website now receives an estimated 1.3 million visits per month. For a guideline to be included on the website, guideline producers are required to demonstrate that they performed a systematic literature search and that they developed, reviewed, or revised the guideline within the last five years (NGC, 2007a). By meeting NGC standards and being admitted to the website, guideline developers are able to improve the dissemination of their products.

Footnote: See Chapter 2 for background on organizations that develop or use clinical practice guidelines.
Footnote: Personal communication, J. Slutsky, Agency for Healthcare Research and Quality, September 4, 2007.

The USPSTF, having been in existence for over 20 years, serves as a model of recommendation development in the United States, especially because of its adherence to detailed methodologies and the restrictions it places on conflicts of interest. Clinicians, health plans, and payers have come to rely on the regular reports from the task force to update their practice, payment, or coverage policies regarding clinical preventive services.

CURRENT LANDSCAPE

Quality of Guidelines

The IOM Committee on Clinical Practice Guidelines defined high-quality guidelines as having a number of attributes, including validity, reliability, reproducibility, clinical applicability and flexibility, clarity, development through a multidisciplinary process, scheduled reviews, and documentation (IOM, 1992). Over time there have been noted improvements in the capacities of some clinical and professional organizations to develop robust, evidence-based guidelines (Jackson and Feder, 1998). Nevertheless, the overall quality of clinical practice guidelines is highly variable, and in fact, the quality is often very poor (Shaneyfelt et al., 1999).

Shaneyfelt and colleagues (1999) assessed 279 guidelines produced from 1985 to 1997 against a set of 25 quality standards. The investigators found that the mean number of quality standards satisfied over that period was 11 (43 percent). For example, less than 10 percent of the guidelines described formal methods of combining scientific evidence and expert opinion. The investigators also evaluated the guidelines according to their specification of purpose (75 percent compliance), definition of the patient population involved (46 percent), pertinent health outcomes (40 percent), method of external review (32 percent), and whether an expiration date or scheduled update was included (11 percent).
Overall, the investigation found significant improvement over time, but each guideline still only met 50 percent of the standards, on average, in 1997 (Shaneyfelt et al., 1999). For some, this variability in guideline quality called for greater transparency in guideline reporting and more rigorous peer review (Cook and Giacomini, 1999).

An evaluation of 86 guidelines developed in 11 countries (which did not include the United States) concluded that the guidelines produced by government-funded agencies and established guideline development programs were of higher quality than guidelines produced by specialty societies (Burgers, 2003). This finding was consistent with the conclusions of Grilli and colleagues (2000), and also with Hasenfeld and Shekelle (2003), who found that the 17 guidelines produced by the AHCPR from 1990 to 1996 were of a substantially higher quality than those subsequently produced by other groups. The authors postulated that the higher-quality scores of guidelines developed by government agencies reflect the fact that the production of high-quality guidelines requires substantial and sufficient resources and that government agencies have more resources available to do the work.

Smaller professional organizations often lack the internal resources, including staff capacity and expertise, required to produce guidelines. This is especially true when the organization produces both the systematic reviews and the guideline recommendations, two tasks requiring different skill sets. Even larger professional organizations can face resource constraints in this area. Some have suggested that, given these resource constraints, government is in the best position to produce clinical practice guidelines (Burgers, 2003; Hasenfeld and Shekelle, 2003).

Many of the criticisms directed at the U.S. system of guideline production in 1990 still apply today (IOM, 1990). These criticisms focused on conflicting clinical recommendations; failure to address certain topics; and incomplete public disclosure of the evidence surveyed, methods used, composition of the panel, and conflicts of interest. In addition, it remains true that, aside from the role that AHRQ plays in populating the NGC website, no independent entity exists in the United States to certify guideline quality or to develop national standards regulating the content or methods of guideline developers. The 1990 IOM report Clinical Practice Guidelines: Directions for a New Program sought to encourage more standardization and consistency in guideline development, and although the quality of clinical practice guidelines has generally improved since then, substantial inconsistencies in the methodologies and reporting language used still exist (Guyatt et al., 2006b; Shiffman et al., 2003).

Quality of Information

The translation of systematic reviews into practice recommendations is not straightforward. The same information can be interpreted in different ways by different panelists, resulting in the provision of different guidance (Burgers and van Everdingen, 2004). Often, even when there is substantial consensus about what the scientific evidence says, there are disagreements about what the evidence means for clinical practice. Conclusions about clinical effectiveness can vary widely as a result of conflicting viewpoints, such as which outcomes are the most important and which course of action is appropriate given that the evidence is imperfect (Atkins et al., 2005b).

This section highlights strategies that guideline developers have used to improve the reliability and trustworthiness of the information that they provide. It also examines methodological approaches, and how groups have sought to ensure objectivity in their procedures. Finally, this section examines assessments of the overall quality of the recommendations currently being made.

Methodological Rigor

Although there have been recent efforts to standardize approaches to guideline development, it is not yet possible to say that guideline development is based on a scientifically validated process. The key challenges stem from the fact that guideline development frequently forces organizations to go beyond what is known from a scientific point of view to make practical recommendations for use in everyday practice. Two examples of such challenges are the approaches to limitations in the evidence base and subjective assessments of the net benefit.

Limitations of the evidence base. The evidence base that supports clinical practice guidelines is often quite limited and guideline developers must often wrestle with what to do when "the irresistible force of the need to offer clinical advice meets with the immovable object of flawed evidence" (Ricci et al., 2006, p. 229). They must consider the best way to address the trade-off between rigor and pragmatism (Browman, 2001), and between adherence to evidence and broader clinical utility (Perlin and Kupersmith, 2007; Stewart et al., 2007). As a result, a consensus of expert opinion among clinical and methodologist panelists often fills in the gaps between areas supported by scientific evidence.

In making their treatment decisions, practicing clinicians might want to place less reliance on guidelines that are based primarily on expert opinion rather than empirical evidence. Often, however, it is not clear which parts of guidelines are evidence-based, and which are not. Many times, when groups incorporate expert opinion, they do not do so in a standardized way (Thomson et al., 1998).
The methods for incorporating opinion into guidelines are less well developed than the methods for incorporating research results, and often they are not made explicit. Disclosing the role of expert opinion is especially important when the data are sparse (Cook and Giacomini, 1999).

When combining a review of research data with practice recommendations, guidelines often do not identify an explicit search strategy used, do not have defined inclusion criteria for selecting eligible studies, and do not assess the findings against consistent methodological standards (Miller and Petrie, 2000). Guidelines such as these often reflect a subjective assessment of the consistency, clinical relevance, and external validity of the available evidence (Ricci et al., 2006).

Subjective assessments of net benefit. The development of clinical recommendations should involve a summary of the harms and benefits of a particular service or intervention. The strength of the recommendation reflects this assessment. Table 5-1 illustrates how the USPSTF addresses net benefit in its strength of recommendation categories. Although some bodies of evidence show a high degree of benefit and few harms, in many cases the benefit and harm seem to be more closely balanced and it is much more difficult to justify a strong recommendation. In situations in which the evidence is of poor quality, it may be difficult to come to an agreement about the balance between the benefits and harms (Atkins et al., 2005a). However, even when the data and the evidence are solid, value judgments come into play when making these assessments (Woolf et al., 1999).

TABLE 5-1 USPSTF Strength of Recommendations

Grade A
Definition: The USPSTF recommends the service. There is high certainty that the net benefit is substantial.
Suggestions for Practice: Offer or provide this service.

Grade B
Definition: The USPSTF recommends the service. There is high certainty that the net benefit is moderate or there is moderate certainty that the net benefit is moderate to substantial.
Suggestions for Practice: Offer or provide this service.

Grade C
Definition: The USPSTF recommends against routinely providing the service. There may be considerations that support providing the service in an individual patient. There is at least moderate certainty that the net benefit is small.
Suggestions for Practice: Offer or provide this service only if other considerations support the offering or providing the service in an individual patient.

Grade D
Definition: The USPSTF recommends against the service. There is moderate or high certainty that the service has no net benefit or that the harms outweigh the benefits.
Suggestions for Practice: Discourage the use of this service.

Grade I
Definition: The USPSTF concludes that the current evidence is insufficient to assess the balance of benefits and harms of the service. Evidence is lacking, of poor quality, or conflicting, and the balance of benefits and harms cannot be determined.
Suggestions for Practice: Read the clinical considerations section of USPSTF Recommendation Statement. If the service is offered, patients should understand the uncertainty about the balance of benefits and harms.

SOURCE: AHRQ (2007).

Rendering judgments about evidence and the subsequent development of appropriate recommendations are complex and the use of some subjectivity in the process is unavoidable (Atkins et al., 2005a). Inconsistent recommendations from different practice guideline development committees often reflect differences in values and tolerances for potential harm. People may perceive the importance of a specific health outcome differently and thus may differ on the point at which the likely benefits of a treatment outweigh the likely harms (IOM, 1990). Guyatt and colleagues (2006a) have indicated that when value or preference judgments are particularly important to the recommendation, guideline development panels should describe the key values attached to these outcomes, and how they influenced the content or strength of the recommendation.

Guideline development panels often do not include patients or consumers as members, and they may not seek patient input when weighing particular health states (Guyatt et al., 2006a). However, some patient and consumer advocacy groups are taking a more prominent role in the evidence-based health care field, and the concept of shared decision making has begun to take hold. The use of decision aids is bringing objective information about benefits and harms directly to patients so that they and their physicians can make informed and appropriate decisions (Weinstein et al., 2007). Shared decision making is often the best approach for elective procedures, for example, in deciding whether an arthritic knee hurts enough to justify the risks of knee replacement.

Addressing Bias

Patients, clinicians, payers, purchasers, and many others rely on having clinical recommendations that are produced in an objective manner. Groups making clinical recommendations have attempted to ensure objectivity in a variety of ways. The following sections examine in detail measures that promote the formation of panels with a balanced composition of members and freedom from conflict of interest.

Panel composition. To protect against a bias in perspective, it is important that guideline development panels include individuals from a range of relevant professional groups. Panels that are composed of members from a single specialty are likely to reach conclusions different from those of panels with multispecialty representation, even when both panels are presented with the same set of evidence (Shekelle et al., 1999). Kahan and colleagues (1996) examined six surgical procedures and found that between 10 and 42 percent of all cases that were deemed appropriate by specialists who performed the procedure were deemed less than appropriate by primary care providers. Murphy and colleagues (1998) found that members of a specialty are more likely to advocate the use of techniques that involve their specialty. Possible explanations for these systematic differences in judgment include superior knowledge, economic self-interest, and inadvertent cognitive bias (Kahan et al., 1996).

To address some of the problems noted, many researchers and others have encouraged the use of balanced, multidisciplinary panels that include representatives from different clinical specialties as well as methodologists and patients. For example, the RAND Corporation-University of California at Los Angeles appropriateness method employs a nine-member multidisciplinary panel to assess the appropriateness of specific interventions for specific indications. These panels include specialists who perform the procedure in question, specialists who do not perform the procedure but have practices in related areas (e.g., noninvasive cardiologists for a coronary arteriography panel), and primary care providers (e.g., internists).

Shekelle and colleagues (1999) have argued that guideline development panels with multidisciplinary representation may produce more reliable results because such a structure can balance the biases of the various individuals on the panel. The IOM Committee to Advise the Public Health Service on Clinical Practice Guidelines found that multidisciplinary participation (1) increases the probability that all relevant scientific evidence will be located and critically evaluated, (2) increases the chances that the committee will address practical problems relating to application of the guidelines, and (3) helps build support among the groups for whom the guideline is intended (IOM, 1990).

However, the specific make-up of guideline development panels often remains unaffected by these findings. Grilli and colleagues (2000) examined the guidelines produced by specialty societies and found that only 28 percent mentioned the inclusion of a panelist of a different specialty.
Others most often invited to participate included epidemiologists or methodologists, primary care physicians, health administrators, and patients or consumer representatives (Grilli et al., 2000). Another study found that only 26 percent of the guidelines examined provided a description of the participants included in the guideline development process along with their areas of expertise (Shaneyfelt et al., 1999).

Conflicts of interest. Actual and perceived conflicts of interest are a major source of concern for stakeholders seeking objective assessments about clinical effectiveness. These conflicts can occur when decision makers (including individual clinicians and clinicians serving on guideline development panels) have a personal stake in the outcome of the decision, such as a potential financial gain or the loss of intellectual standing (i.e., reputation). In the past several years, these types of conflicts of interest among decision makers have come under increasing scrutiny. Because the interpretation of scientific evidence and its translation into clinical decisions often involve the use of a substantial amount of judgment, conflicts of interest add to concerns that bias may be injected into the process.

Footnote: A recently formed IOM Committee on Conflict of Interest in Medical Research, Education, and Practice is studying conflicts of interest in the conduct of medical research, development of practice guidelines, and patient care. A final report is expected in 2009.

One recent survey of physicians found that 94 percent have some type of relationship with the pharmaceutical industry (Campbell et al., 2007). More than one-third reported that they had received reimbursement for costs associated with professional meetings or continuing medical education; and 28 percent reported that they had received payments for consulting, lecturing, or enrolling patients in clinical trials. For many, there is a persistent concern that these relationships have an undue impact on treatment decisions, creating risks for individual patients, and undermining the integrity of the medical profession (Tonelli, 2007).

For example, a recent analysis conducted by the New York Times concluded that from 1997 through 2005, Minnesota physicians who received the most money from makers of atypical antipsychotic drugs were more likely to prescribe the drugs to children (Harris et al., 2007). On average, Minnesota psychiatrists who received at least $5,000 from the makers of atypical antipsychotic drugs from 2000 to 2005 wrote three times as many atypical prescriptions for children as psychiatrists who received less or no money, according to the authors.

In addition, investigators' public positions on drug safety can be associated with their financial relationships with pharmaceutical manufacturers. For example, investigators who supported the use of calcium-channel antagonists were significantly more likely to have a financial relationship with manufacturers, as compared to those who took a neutral or critical position (90, 60, and 37 percent, respectively) (Stelfox, 1998).

Conflict of interest is a problem for guideline developers as well. In a survey of clinical practice guideline authors, 59 percent indicated that they had a relationship with companies whose products were included in the guideline that they authored (Choudhry et al., 2002). Aside from equity interest in the companies being evaluated, other types of conflicts include receipt of royalties, speakers fees, consulting fees, and research funding for unrelated products, in addition to various types of intellectual conflicts of interest.

The public and Congress have become concerned about perceptions of conflict of interest at both the U.S. Food and Drug Administration (FDA) and the National Institutes of Health (NIH). Both agencies have recently promulgated new guidelines that limit the amount of money that external advisory panel members can receive from companies whose products or services may be a focus of their review. In addition, several medical schools have placed restrictions on the access that pharmaceutical sales representatives can have to their students.

Public disclosure is a highly touted remedy for the biases that may be inherent in conflicted relationships (Choudhry et al., 2002; Stelfox, 1998). For example, Boyd and Bero (2006) recommend the use of specific, detailed, and structured (rather than open-ended) forms to solicit as much information as possible about the nature and extent of the conflict. They recommend the disclosure of all financial ties publicly. In cases in which the conflicts appear to be intractable, they recommend that panelists recuse themselves from decisions.

However, Jerome Kassirer (2007), former editor-in-chief of the New England Journal of Medicine, argues that transparency measures are insufficient and may actually be harmful because they divert attention away from the more difficult problem, which is protecting the integrity of medical information. Kassirer maintains that there should be a lower threshold of concern about financial conflicts and that although small numbers of conflicted individuals should be allowed to participate in review panels, they should not be given an opportunity to vote on recommendations pertaining to their financial interests. The challenge, however, is that so many health professionals have conflicts, including those with the greatest expertise.

Professional societies are subject to internal and external pressures to support certain practices. Societies may depend on commercial relationships for operating and educational funds. Moreover, specialty societies engaged in market-based competition with each other may publish guidelines that are intended to help them gain ownership of the specific procedures or treatments (Woolf et al., 1999).
In addition, individual guideline developers may have a substantial economic or professional stake in the intervention being considered. This has the potential to produce recommendations that ignore or minimize harms or that overestimate the benefit of an intervention (Schwartz, 1984).

Pluralistic Approach to Guideline Development

The current approach to developing clinical recommendations in the United States is highly decentralized. Many different organizations participate in the process, which allows broad participation by private stakeholders. In addition, rather than having government serve as the primary financier of guideline development, the current system enables the costs to be spread out among multiple parties.

Although public sector groups that develop guidelines are less central to the guideline development process than they once were, they still play a significant role. The USPSTF continues to produce recommendations for

preventive services that are widely considered to be the "gold standard" for the process of guideline development (Guirguis-Blake et al., 2007). The task force maintains a rigorous process for contracting with evidence-based practice centers (EPCs) to produce systematic reviews and developing practice recommendations; it sets a high standard for other organizations.

The NIH also convenes expert panels to develop clinical recommendations. For example, the National Heart, Lung, and Blood Institute (NHLBI) launched the National Cholesterol Education Program in November 1985 and now sponsors a number of panels that produce guidelines in that area. The NIH Consensus Development Conferences also seek to inform clinical practice and now contract with EPCs for systematic reviews of the evidence, although they do not produce practice guidelines.

Multiple Conflicting Guidelines

One of the challenges inherent in having such a decentralized, pluralistic process is that often multiple groups produce guidelines in the same clinical topic area. These guidelines may duplicate previous work or produce contradictory findings that may remain unresolved (Woolf et al., 1999). Box 5-1 illustrates a case in which two guideline development panels reviewed largely the same bodies of evidence and reached different conclusions about appropriate clinical practice.

The magnitude of this challenge is illustrated by the preponderance of guidelines related to hypertension and stroke. The NGC, for example, includes 471 hypertension guidelines and 276 stroke guidelines (NGC, 2007e,f). Anyone looking to ferret out pertinent information faces a substantial sifting process and challenges in determining which of the guidelines are the most relevant and trustworthy. Although the NGC allows readers to compare the guidelines side by side across a number of dimensions, this feature quickly becomes unwieldy as the number of relevant guidelines increases. In addition, the guidelines differ substantially in the way that they present information, making it difficult for the reader to compare one set of findings directly against another. For example, guidelines employ different rating scales to characterize the quality of the supporting evidence (see below).

BOX 5-1 Conflicting Guidelines for the Treatment of Epilepsy

Separate panels convened in the United States and the United Kingdom looked at the use of new antiepileptic drugs for the treatment of newly diagnosed epilepsy patients. Although both groups supported the efficacy and safety of the new drugs, they diverged on the appropriate management of these cases. The U.S. panel recommended that either the new drugs or the standard drugs be used (depending on the characteristics of the patient), whereas the U.K. panel was more restrictive, recommending that the new drugs be used only in more narrow circumstances (e.g., cases where the older drug is contraindicated). These discrepancies may be partially explained by the limited amount of information available on the new drugs and the different factors considered by the reviewers (e.g., the U.K. review considered cost and quality of life, but the U.S. review did not). It is also likely that more subjective judgments play a role in the recommendation process.

SOURCE: Beghi (2004).

Gap Areas

Despite the overabundance of clinical guidance in some topic areas, little guidance exists in other important areas. The following examples illustrate how gaps in guideline production may occur:

• Some commonly used treatments may not have been examined in systematic reviews, primarily because of a lack of agreement on which professional society "owns" the condition (e.g., treatments for prostate cancer, which may be "owned" by the American Urological Association, the American Society of Clinical Oncology, or the American Society for Therapeutic Radiology and Oncology).
• Researchers may avoid doing reviews of treatments for rare and "orphan" diseases either because the evidence is weak, because no entity is identified as being responsible for developing a guideline, or because there is inadequate financial support to conduct the work.
• Some professional societies may not produce guidelines at all because they do not view it as a part of their mission, or they may release clinical position statements that have very little evidentiary basis.
• Given the speed at which medicine is changing, guideline production by professional societies may lag behind new knowledge and technology.

Efforts to Improve Guidelines

Consensus Building

Recognizing that in some clinical areas multiple organizations may seek to develop guidelines, some groups have developed collaborative activities that promote consensus in clinical practice guidelines. For example, the

American College of Cardiology (ACC) and the American Heart Association (AHA) have jointly produced clinical practice guidelines since the 1980s. Because their guidelines are intended for use by a broad range of health providers, the ACC/AHA writing committees often include representatives of other organizations, including other groups specializing in the cardiovascular field, such as the American College of Chest Physicians, and other specialties such as the American Academy of Family Physicians and the American College of Physicians. In seeking to develop consensus guidelines, the NHLBI's National Cholesterol Education Program has also developed a partnership of multiple stakeholder groups, which in addition to physicians includes patient-focused groups, such as the American Diabetes Association and others.

Voluntary Efforts at Standardization

Organizations that produce guidelines conduct their work and communicate their findings in different ways. Evidence-based guideline producers typically provide summary information about key findings, including the quality of the individual studies included in the assessment, the quality of the overall body of evidence, and the strength of the recommendations. Each of these components can be depicted in a variety of ways by using letters, numbers, symbols, and words (Schünemann et al., 2003). For example, Table 5-2 highlights the grading scales that different organizations use to characterize the same cardiology interventions.

Although the overall quality of clinical practice guidelines has been improved by the efforts that have been made to grade the quality of evidence and the strength of recommendations, according to some the proliferation in the number of grading systems has undermined the value of the grading exercise (Guyatt et al., 2006a). As a result, many people have called for the development of a system that would standardize these grading systems and rating scales.
The use of a common approach to grading the strength of recommendations is considered a mechanism that could facilitate the critical appraisal of a guideline development panel's judgments and aid the interpretation of the benefits and risks of an intervention (Guyatt et al., 2006a; Schünemann et al., 2006). Standardization is likely to be difficult, though, because many organizations have invested considerable time and effort in developing unique rating systems and are reluctant to change (Guyatt et al., 2006b).

A number of national and international programs use or are developing standardized grading scales within their organizations, including the USPSTF, the United Kingdom's NICE, and others (Schünemann et al., 2006). In addition, the major family medicine journals in the United States have created the Strength of Recommendation Taxonomy, which they believe serves the needs of their specialty. Under that system, evidence from individual studies is rated as Level 1, 2, or 3; bodies of evidence are referred to as consistent or inconsistent; and the strength of recommendations is indicated by the letter A, B, or C (Ebell et al., 2004).

TABLE 5-2 Dueling Evidence Hierarchies and Recommendation Grades in Cardiology

Intervention and Organization | Quality of the Evidence | Strength of the Recommendation

Therapy for oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease
American Heart Association | Level B | Class I
Scottish Intercollegiate Guidelines Network | Level 4 | Grade C
American College of Chest Physicians | Grade C+ | Grade 1C+

Implantable Cardioverter-Defibrillator therapy for cardiac arrest due to sustained ventricular fibrillation or ventricular tachycardia
American College of Cardiology/American Heart Association | Level A | Class I
Scottish Intercollegiate Guidelines Network | Level 3/4 | Grade D
European Society of Cardiology | Level B | Class IIa

Carotid endarterectomy for internal carotid artery stenosis or symptomatic stenosis
American College of Cardiology/American Heart Association | Level C | Class IIa
American Academy of Neurology | Class I/II | Level A/B
Veterans Health Administration | Level I | Grade A

SOURCE: NGC (2007d); Schünemann et al. (2003).

In addition to making efforts to reach agreement on grading scales, several groups have sought to standardize guideline development methodologies. Although there is still no internationally accepted standard for guideline development, there have been repeated calls for a "guideline for guidelines" (Guyatt et al., 2006b; Jackson and Feder, 1998; Schünemann et al., 2006; Shaneyfelt et al., 1999; Shekelle et al., 1999; Shiffman et al., 2003).
Among the more prominent efforts to standardize and raise the quality of clinical practice guidelines are the Appraisal of Guidelines Research and Evaluation (AGREE) collaboration and the Conference on Guideline Standardization (COGS). The AGREE collaboration defines the quality of guidelines as "the confidence that the potential biases inherent of guideline development have been addressed adequately and that the recommendations are both internally and externally valid, and are feasible for practice"

(The AGREE Collaboration, 2001). The AGREE instrument for assessing the quality of clinical practice guidelines is a result of an international collaboration that originated in Europe in 1998.

COGS convened in 2002 to define a set of standards for guidelines. Whereas the AGREE standards were developed as a means by which guidelines could be externally assessed after completion, the result of COGS was a tool for guideline developers to use as part of their work to improve the quality of their product. The COGS instrument provides a checklist of components necessary for the evaluation of guideline validity and usability (Shiffman et al., 2003). Both the AGREE instrument and the COGS checklist are included in Appendix F.

Adherence to Guidelines

Overall, the levels of adherence to guidelines and clinical recommendations vary greatly. While some guidelines are widely recognized and used (e.g., recommendations for infant sleeping position), others remain largely unnoticed. In rare instances, guidelines have become the center of media attention and controversy, such as those on mammography screening for breast cancer in women ages 40-49, which differ as to who should receive routine testing.

The rate of uptake of guidelines is increasing but remains quite low. O'Malley and colleagues (2007) found that, over the period 1997 to 2005, the proportion of primary care physicians reporting that guidelines played a significant role in their decision making increased from 16 to 39 percent. Among specialists, these figures increased from 19 to 28 percent. The increases reported in the study were attributed to increasing access to health information technology and a greater link between adherence to guidelines and payment.
Lack of adherence to guidelines reflects the considerable practice variation that exists nationwide and indicates that too often medical practice does not reflect much of what is known about effective clinical care (Reinertsen, 2003; Wennberg, 2004). However, the move toward performance measurement, pay-for-performance strategies, provider efficiency profiling, and electronic decision support is changing this dynamic, promoting greater accountability for treatment decisions.

Limiting Factors

The decision to follow practice guidelines is voluntary, limiting the likelihood of universal adoption; as some have noted, guidelines are only guidelines (Cook and Giacomini, 1999). Limited adherence to guidelines

reflects a number of factors, including a lack of physician awareness, a lack of agreement, and inertia (Cabana et al., 1999). Moreover, there are general concerns regarding the applicability of guidelines at the individual level. Guidelines are meant to define practices that meet the needs of most patients under most circumstances (Hunt et al., 2001). They aggregate the harms and benefits of interventions across a group of patients defined by clinical criteria rather than assessing them for individual patients.

In addition, guidelines often focus on interventions related to a single condition, and individual studies covered by the systematic reviews underpinning the guideline may exclude patients with multiple comorbidities (O'Connor, 2005). Practice guidelines may also apply to only a limited subset of the population and not address the needs of groups such as the elderly (Boyd et al., 2005). And, as described earlier, interpreting multiple guidelines on the same clinical topic may be difficult, especially when there is contradictory guidance.

Local Translation

Tierney (2001) argues that guidelines, no matter how well crafted, must undergo "local translation" to be relevant and consistent with local clinical practice standards. However, this type of translation process may lock in some of the local variation that the guidelines are meant to reduce.

Generally, for guidelines to gain wide acceptance, physicians must accept them as best practice (Ayanian et al., 1998; Fried et al., 2006). Yet physicians often do not agree that the standards being promoted through clinical practice guidelines represent the best course of action for their patients (Cabana et al., 1999). In fact, some physicians have accused guidelines of being invalid, unreliable, and irrelevant (Grilli et al., 2000).
Guideline Updates

Guidelines have limited shelf-lives given the rapid accumulation of new scientific knowledge and changes in practice stemming from new medical technologies and other advances. A review that looked at 17 guidelines published by the AHCPR in the 1990s estimated that about half of the guidelines had become outdated after 5.8 years (Shekelle et al., 2001). The authors concluded that guidelines should be reassessed for their validity every 3 years.

To stay current, the organizations that issue guidelines must monitor the medical literature and be prepared to update the guideline. This standard is currently enforced by the NGC, which will not retain the guidelines in its database unless they have been developed, reviewed, or revised within the last 5 years.
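As a rough illustration of the two intervals mentioned above (reassessment every 3 years, and the NGC's 5-year retention window), the following Python sketch classifies a guideline by the date of its last development, review, or revision. The function name, the status labels, and the use of a 365.25-day year are assumptions made for illustration; they are not drawn from the NGC or from Shekelle et al. (2001).

```python
from datetime import date

# Intervals taken from the text; everything else here is a hypothetical sketch.
REASSESSMENT_YEARS = 3   # reassessment interval suggested by Shekelle et al. (2001)
RETENTION_YEARS = 5      # NGC retention window described in the text


def currency_status(last_reviewed: date, today: date) -> str:
    """Classify a guideline by the time since it was last developed, reviewed, or revised."""
    # Approximate age in years; 365.25 is an assumed average year length.
    age_years = (today - last_reviewed).days / 365.25
    if age_years > RETENTION_YEARS:
        return "outside retention window"
    if age_years >= REASSESSMENT_YEARS:
        return "due for reassessment"
    return "current"


if __name__ == "__main__":
    today = date(2007, 6, 1)
    for reviewed in (date(2006, 1, 1), date(2003, 6, 1), date(2001, 1, 1)):
        print(reviewed.isoformat(), "->", currency_status(reviewed, today))
```

A guideline developer could run a check of this kind periodically, although in practice the decision to update depends on monitoring the literature for new evidence rather than on elapsed time alone.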

CRITICAL PROGRAM CHALLENGES AND RECOMMENDATIONS

Building on the Current System

Efforts to improve the quality and availability of clinical practice guidelines need not involve a wholesale restructuring of the current system. The recommendations proposed by the committee build on the aspects of the current system that are functioning well (including the work of the USPSTF, the ACC/AHA, and others) but seek to raise the standards for producers of clinical practice guidelines overall.

Building on the current system is practical for a number of reasons. First, the experience of the AHCPR in the 1990s exposed the significant political risks involved in establishing government-sponsored clinical practice guidelines. When an AHCPR Patient Outcomes Research Team developed a guideline on the treatment of back pain, an angry group of orthopedic surgeons almost succeeded in convincing Congress to defund the agency (Gray, 1992; Gray et al., 2003). In addition, the private organizations that currently produce guidelines, such as professional societies and others, treasure their autonomy and would likely oppose efforts to reduce their role. Furthermore, guidelines that have the imprimatur of a respected professional society engender trust by the end users (Tunis et al., 1994). Finally, there are some indications that the quality of these guidelines has improved over time (Jackson and Feder, 1998), although data need to be updated. For these reasons, the committee believes that the pragmatic approach, and also the most promising approach, is to build on the current system.

Common Standards

Clinical practice guidelines vary widely in their methodological rigor and protection from bias; however, in the current environment, the organizations and individuals who use guidelines have very limited means to assess their objectivity or accuracy.
The committee recommends several steps to ensure that the information communicated through practice guidelines is trustworthy.

Recommendation: Groups developing clinical guidelines or recommendations should use the Program's standards, document their adherence to the standards, and make this documentation publicly available.

The committee recommends that guideline development organizations adhere to a common set of standards that address the structure, process, reporting, and final product that contains the guidelines. Ensuring adherence to these standards, in part through public disclosure of adherence data,

will increase the quality and accuracy of guidelines, as well as end users' confidence in adopting them. Thus, common standards will contribute to the overall success of the Program.

Although a number of consensus approaches to guideline standardization currently exist, there is not agreement on a single set of standards. The Program should develop (or endorse) standards for creating clinical practice guidelines, either by convening a panel of experts or by commissioning an outside group to perform this work. Below are a number of key standards that the committee believes will be important.

Standards of Critical Importance

Objectivity: Central to the development of effective guidelines is ensuring that the process is performed in an objective and impartial manner and that the conclusions are objective and impartial. Instituting balanced panel participation and governance will help to ensure that the clinical guidance that is produced is trustworthy. A more detailed discussion of the management of bias in guideline production follows later in this chapter.

Transparency: An important mechanism to promote trust is having a process that is open to the public. Conflicts of interest that may exist at the level of the panelist, panel, or sponsoring organization should be publicly disclosed. Deliberations that are open to the public and encourage public participation will ensure that a wide variety of perspectives are considered. Posting draft guidelines for public comment can also help achieve a greater balance of viewpoints.

The methods that the panel employs to gather, assess, and weigh evidence, as well as the mechanism that it uses to grade the strength of recommendations, should be explicitly defined, consistently applied, and available for public review. Of particular salience is the need to standardize, to the extent possible, the methods that the panels apply in the face of insufficient evidence.
Efficiency and timeliness: As the work of guideline producers becomes strategically aligned with national needs for improved information at the point of care, it is crucial that the work of these organizations be carried out in a timely fashion. Currently, patients and providers often make decisions in the absence of guidance in cases in which evidence reviews and practice guidelines have not been completed. Likewise, health plans and purchasers must make rapid coverage decisions regardless of whether or not guidelines are available.

To increase the volume of work that the system can produce overall, guideline developers should take steps to avoid the unnecessary duplication of effort and should deploy limited resources effectively. The Program should play a role in improving coordination among these activities. Improvements in cross-organizational efficiency can help ensure that technologies, procedures, and interventions are evaluated in a more timely fashion.

External review: Peer review conducted by outside experts is an important measure that can help ensure the quality of the guidelines produced. Groups that develop guidelines should institute a peer review process, in addition to allowing stakeholders and the public to review draft guidance. Organizations should adopt processes that include independent oversight of their responses to the peer review comments to ensure that they are responding appropriately to well-supported criticism.

Currency: Guidelines have limited shelf-lives as a result of the expanding evidence base and the corresponding changes in how medicine is practiced. Monitoring the medical literature for new evidence is an obligation of guideline developers. Organizations should not develop guidelines unless they are willing to keep them up to date.

Overlaps and gaps: To meet the needs of key decision makers, guideline producers must be aware that the conclusions offered by various recommending groups may often conflict. Voluntary efforts to promote consensus in specific topic areas should be a high priority. An important role for the Program will be identifying conflicts and convening efforts to resolve them. Groups that develop guidelines should be willing to participate in the reconciliation of their work with that of other groups when the need arises. In addition, processes for identifying and addressing the absence of guidelines for rare diseases and other clinical areas should also be established.
Common Language

The committee believes that a common language that expresses the strength of clinical practice recommendations should be an essential feature of the guideline development and reporting process. The use of a common language for all clinical practice recommendations is an efficient way to communicate the strength of evidence and assist end users with assessing the outputs of the various organizations that produce guidelines. This common language should convey the same information about the strength of evidence irrespective of the clinical service under consideration. In other words, guideline developers must use the same terms to describe the same quality of evidence for all clinical services.

Judgment about the strength of a recommendation derives from consideration of the benefits anticipated if the recommendation is followed and consideration of the potential harms and costs of such adherence. Strong recommendations are made when the benefits clearly exceed the harms or when the harms clearly exceed the benefits. By contrast, lower-level recommendations (sometimes referred to as clinical "options") are made when the balance of the anticipated benefits compared with the anticipated harms and costs is less clear-cut or is essentially equivalent.

As mentioned above, the statement of the strength of the recommendation communicates an expectation regarding adherence. Whereas clinicians should be expected to follow strong recommendations unless a clear and compelling rationale for not doing so is present, patient preferences should also have a substantial role in influencing clinical decision making, and patients may even sometimes choose not to proceed with an intervention that has been found to be strongly beneficial. In addition, pay-for-performance measures should be built from strong recommendations and not clinical options.

The quality of the evidence (based on factors related to minimizing bias such as study design, consistency, and directness of the evidence) helps determine the confidence that should be placed in the balance equation. The guideline developer can confidently and strongly recommend an intervention when it is found to be effective, with minimal adverse effects, in multiple well-designed studies. Under such circumstances, one can be confident of the importance of adherence to the guideline. On the other hand, when high-quality evidence indicates both important benefits and important harms, the recommendation should be made accordingly. Strong recommendations should not be created when the evidence is poor. Nor should high-quality evidence on effectiveness automatically lead to a strong recommendation; potential harm must also be considered.
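The relationship described above between the balance of benefits and harms, the quality of the evidence, and the strength of a recommendation can be sketched as a simple decision rule. The Python example below is only an illustration: the category labels, the function names, and the choice to downgrade to an "option" whenever the evidence is poor are assumptions, not a grading system proposed by the committee or used by any organization.

```python
# Hypothetical labels for the balance of anticipated benefits against harms and costs.
CLEAR_BENEFIT = "benefits clearly exceed harms"
CLEAR_HARM = "harms clearly exceed benefits"
UNCLEAR = "balance unclear or essentially equivalent"


def recommendation_strength(balance: str, evidence_is_poor: bool) -> str:
    """Return an illustrative recommendation strength.

    Assumption: poor evidence never yields a strong recommendation, so it is
    treated here as a clinical "option"; the text does not prescribe this label.
    """
    if balance not in (CLEAR_BENEFIT, CLEAR_HARM, UNCLEAR):
        raise ValueError(f"unknown balance category: {balance!r}")
    if evidence_is_poor or balance == UNCLEAR:
        return "option"
    if balance == CLEAR_BENEFIT:
        return "strong recommendation for"
    return "strong recommendation against"


def suitable_for_pay_for_performance(strength: str) -> bool:
    """Per the text, performance measures should rest on strong recommendations only."""
    return strength.startswith("strong recommendation")


if __name__ == "__main__":
    for balance, poor in [(CLEAR_BENEFIT, False), (CLEAR_BENEFIT, True), (UNCLEAR, False)]:
        strength = recommendation_strength(balance, poor)
        print(balance, "| poor evidence:", poor, "->", strength,
              "| pay-for-performance:", suitable_for_pay_for_performance(strength))
```

A real grading scheme would use more than two evidence levels and would also weigh costs and patient preferences; the sketch only shows why high-quality evidence of effectiveness does not by itself produce a strong recommendation.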
The committee believes that a common language that describes both the quality of the underlying evidence and the strength of recommendations is an important tool for promoting greater consistency among clinical practice guideline developers. This common language will reduce the requirements placed on end users in sorting through and navigating all the various terms, symbols, and expressions that currently exist. An important task for the Program will be to facilitate the process of achieving a common language.

Minimizing Bias Due to Conflicts of Interest

Organizations that produce guidelines convene panels of experts to assess the available evidence and develop clinical recommendations. To produce objective, well-balanced, reliable clinical guidance, guideline developers must address several basic structural issues that, if they are not managed properly, can create the perception, if not the reality, of bias.

These can diminish the value of the clinical information provided and can undermine public confidence in the guidelines overall.

Unbiased Information

Bias can enter into the guideline development process in a number of ways, as illustrated in Box 5-2. These biases can occur at the individual, panel, and organizational levels. Groups and organizations that develop clinical practice guidelines should address each of these to ensure that the end users view their guidelines as credible and trustworthy.

The committee identified and compared three approaches for handling conflicts of interest. The first is a permissive approach by which guideline producers are able to address these issues as they see fit on a case-by-case basis. The current system is relatively permissive in how it handles potential conflicts, although some measures are in place to limit influence, such as restrictions on money or other gifts received from commercial sources that are placed on external advisors by the FDA and the NIH. Investigative reporting conducted by the print media has called into question guideline producers that have received industry financing, and this has raised public awareness. In addition, financial disclosures have increasingly been employed to address conflict of interest, and while this is an important step, its effectiveness may be limited. Moreover, guideline developers do not always disclose potential conflicts and biases that may exist at the individual, panel, or organizational level.

BOX 5-2 Potential Sources of Bias in Clinical Practice Guidelines

• Panelists have material interests in the recommendations, e.g., stock ownership, royalties, or other returns.
• Panelists have indirect financial interests, e.g., they could be paid for the health service under review or receive honoraria for discussing it.
• The panel is primarily made up of individuals from one specialty with only limited participation by other types of providers, patients, plans, methodologists, etc.
• Panelists have intellectual biases, e.g., prior research, strongly held opinions, or professional specialty that might compromise one's objectivity or bring it into question.
• The organization producing the guideline receives funding from companies with a material interest in the recommendations.
• The panel does not allow participation from members of the public.

The second approach is to promote a system that is completely free from conflict of interest. Under such a system, panelists who had received any remuneration from affected industries would be disqualified from serving on guideline development panels. The restriction would also be applied to physicians who receive fee-for-service compensation for doing the procedures being reviewed. Presumably salaried physicians or physicians who do not provide the procedure themselves would qualify as panelists. Intellectual conflicts of interest, such as those relating to professional reputation, would also have to be addressed. At the organizational level, groups sponsored or convened by potentially affected manufacturers (including professional societies, consumer advocacy groups, and others) would not be recognized as appropriate sponsors of clinical practice guidelines.

Given the extent to which these types of conflicts exist in the current environment, the second "pure" model seems largely impractical. Guideline producers require panelists who have expertise and, in today's environment, these experts typically have conflicts of interest. Therefore, the committee identified a third, more pragmatic approach that recognizes that conflicts of interest are likely to persist for members of most guideline production panels, but that a number of steps can be taken to manage these conflicts. These steps include placing limits on the financial remuneration that panelists and organizations receive, balancing the composition of panels, and establishing a transparent process that includes public participation.
Table 5-3 illustrates measures that might be taken under each of the three approaches (permissive, pure, and pragmatic), across the various levels of the guideline production process (individual panelist, entire panel, and sponsoring organization). The permissive and the pure categories are intended to represent the extreme ends of the spectrum.

TABLE 5-3 Measures to Address Conflicts of Interest

Approach | Panelist | Panel | Organization
Permissive | Discretion of guideline producer | Discretion of guideline producer | Discretion of guideline producer
Pure | No remuneration from affected manufacturers; individuals with conflicts restricted to brief panel presentations | Balanced panel participation (various provider/stakeholder types, including plans, patients, and others) | Guideline-producing organization receives no payments from affected manufacturers
Pragmatic | Limited remuneration from affected manufacturers; disclosure of conflicts; publication of disclosed conflicts (transparency); limited voting rights for members with conflicts | Balanced panel participation (various provider/stakeholder types, including plans, patients, and others); open (public) meetings | Guideline-producing organization receives limited payments from affected manufacturers

The committee maintains that the pragmatic approach is the most appropriate course of action, given that the current, more permissive approach provides too few safeguards against conflicts of interest and bias and that the "pure" approach, although it is theoretically desirable, is impractical and would strip too much expertise from the guideline development process. However, the Program, as detailed in Chapter 6, should develop (or endorse) strict standards to protect against bias and ensure that clinical practice guideline producers are adhering to these standards. In particular, the committee identified the following measures as a means of improving the quality of the information provided by guideline developers:

Recommendation: To minimize bias due to conflicts of interest, panels should include a balance of competing interests and diverse stakeholders, publish conflict of interest disclosures, and prohibit voting by members with material conflicts.

Individual Level

At the individual level, guideline developers should vet potential panelists for financial or intellectual biases and should have panelists disclose all financial relationships with commercial companies. They should also reveal any relevant positions that the panelists have advocated publicly. The Program should establish parameters to indicate when personal conflicts of interest are significant enough to warrant disqualification from panel participation or voting.

Panel Level

At the panel level, groups should be multidisciplinary and should include topic experts, generalists, consumers, payers (e.g., health plans), and others.
For example, recommendations on care for children with Attention Deficit Hyperactivity Disorder should be developed with representation from pediatrics, family practice nursing, psychiatry and behavioral medicine, educators, parent organizations, and payers. Organizations should seek to build a broad consensus about the treatment alternatives that fall within the scope of the review. Purchasers and health plan representatives should be included on guideline development panels to moderate any clinical or manufacturer bias in favor of greater service utilization. In general, the panel should represent a balance of competing interests to the greatest extent possible.

Organizational Level

At the organizational level, guideline developers should disclose the monies they have received from affected manufacturers, whether related to the subject of guideline development or given as general contributions. The standards generated by the Program should establish the levels of commercial involvement or support at which organizations should be considered insufficiently protected from commercial bias.

Adherence to Standards

The end users of clinical guidelines and recommendations (physicians, performance measurement groups, health plans, purchasers, patients, policy makers, and others) would benefit from a rigorous set of development standards and a common reporting language that would improve the quality and usability of the guidelines. However, achieving this objective will be difficult. Various groups have developed distinct ways to speak about and assess evidence, and to grade the strength of their recommendations. Moreover, the rigor of their processes is highly variable, and many guideline developers do not have the resources or the ability to meet a set of structure, process, and product standards that are externally imposed. Nevertheless, ensuring the quality and the usability of the information provided in clinical practice guidelines is vital to the performance of the health system, and there is a need to promote compliance with these new guideline standards.

Given the impracticality of centralizing the guideline development process in the U.S. government, the committee believes that building on the current pluralistic system is the most appropriate course of action. The committee proposes that the users of guidelines serve as the primary arbiters of guideline quality, with guideline developers voluntarily providing documentation that will allow end users to make informed judgments.
Recommendation: Providers, public and private payers, purchasers, accrediting organizations, performance measurement groups, patients, consumers, and others should preferentially use clinical recommendations developed according to the Program standards.

The committee envisions that the Program will develop a common reporting mechanism that will enable guideline developers to describe the features of their process. Through a standardized survey instrument, guideline developers will be asked to report on their methodologies and their adherence to the common standards. Guideline-producing organizations, spurred by end users, will want to include this documentation as part of the guideline itself. In addition, the information will be uploaded to a public web page to enable end users to view the extent to which organizations producing guidelines (and the guidelines themselves) adhere to the common standards. If guideline users preferentially adopt guidelines that are developed according to Program standards, guideline producers will be motivated to adhere to those standards and to provide documentation about their processes.

The availability of this information will enable the end users of the guideline material to become more informed about the quality of the information they receive. Although the documentation that guideline developers provide may not be complete, gaps in that information may serve as a red flag for the end users. In addition, through increased transparency and openness in the guideline development process, the accuracy of the information reported will be more easily verified.

The Program may want to consider instituting a certification or accreditation process to ensure that specific guidelines or organizations developing them adhere to specific standards. Such an accreditation or certification process would allow for continued decentralized guideline production.

The end users of guideline information then become the group that holds the guideline developers accountable for their work products. The vision of the committee is that performance measurement groups will primarily rely on the highest-quality information, as indicated by the standards reporting document, and establish measures that will encourage physicians to comply with these high-quality recommendations.
Health plans and purchasers should also be selective, choosing only guidelines that adhere to the standards and basing performance-based programs only on such guidelines. Accreditation groups (e.g., the National Committee for Quality Assurance, the Utilization Review Accreditation Commission, and the Joint Commission) should assess the extent to which the groups that they monitor are relying on the highest-quality clinical practice guideline information. Through this mechanism, the committee believes that improvements in guideline quality and clinical effectiveness information can be achieved.

REFERENCES

The AGREE Collaboration. 2001. The Appraisal of Guidelines for Research and Evaluation (AGREE) instrument. London, UK: The AGREE Research Trust http://www.agreetrust.org/docs/AGREE_Instrument_English.pdf (accessed September 2007).
AHRQ (Agency for Healthcare Research and Quality). 2007. U.S. Preventive Services Task Force ratings http://www.ahrq.gov/clinic/uspstf07/ratingsv2.htm (accessed September 14, 2007).

American Academy of Pediatrics. 2007. History of the Red Book® http://aapredbook.aappublications.org/about/#hist (accessed October 23, 2007).
Atkins, D., P. A. Briss, M. Eccles, S. Flottorp, G. Guyatt, R. T. Harbour, S. Hill, R. Jaeschke, A. Liberati, N. Magrini, J. Mason, D. O'Connell, A. D. Oxman, B. Phillips, H. Schünemann, T. T. Edejer, G. E. Vist, R. D. Williams, and the GRADE Working Group. 2005a. Systems for grading the quality of evidence and the strength of recommendations II: Pilot study of a new system. BMC Health Services Research 5(1).
Atkins, D., J. Siegel, and J. Slutsky. 2005b. Making policy when the evidence is in dispute. Health Affairs 24(1):102-113.
Ayanian, J. Z., M. B. Landrum, S.-L. T. Normand, E. Guadagnoli, and B. J. McNeil. 1998. Rating the appropriateness of coronary angiography: Do practicing physicians agree with an expert panel and with each other? New England Journal of Medicine 338(26):1896-1904.
Beghi, E. 2004. Efficacy and tolerability of the new antiepileptic drugs: Comparison of two recent guidelines. Lancet Neurology 3(10):618-621.
Boyd, C. M., J. Darer, C. Boult, L. P. Fried, L. Boult, and A. W. Wu. 2005. Clinical practice guidelines and quality of care for older patients with multiple comorbid diseases: Implications for pay for performance. JAMA 294(6):716-724.
Boyd, E. A., and L. A. Bero. 2006. Improving the use of research evidence in guideline development: 4. Managing conflicts of interests. Health Research Policy and Systems 4(16).
Browman, G. P. 2001. Development and aftercare of clinical guidelines: The balance between rigor and pragmatism. JAMA 286(12):1509-1511.
Burgers, J. 2003. Characteristics of high-quality guidelines. International Journal of Technology Assessment in Health Care 19(1):148-157.
Burgers, J. S., and J. J. van Everdingen. 2004. Beyond the evidence in clinical guidelines. The Lancet 364(9432):392-393.
Cabana, M. D., C. S. Rand, N. R. Powe, A. W. Wu, M. H. Wilson, P.-A. C. Abboud, and H. R. Rubin. 1999. Why don't physicians follow clinical practice guidelines?: A framework for improvement. JAMA 282(15):1458-1465.
Campbell, E. G., R. L. Gruen, J. Mountford, L. G. Miller, P. D. Cleary, and D. Blumenthal. 2007. A national survey of physician-industry relationships. New England Journal of Medicine 356(17):1742-1750.
Choudhry, N. K., H. T. Stelfox, and A. S. Detsky. 2002. Relationships between authors of clinical practice guidelines and the pharmaceutical industry. JAMA 287(5):612-617.
Cook, D., and M. Giacomini. 1999. The trials and tribulations of clinical practice guidelines. JAMA 281(20):1950-1951.
Druss, B. G., and S. C. Marcus. 2005. Growth and decentralization of the medical literature: Implications for evidence-based medicine. Journal of the Medical Library Association 93(4):499-501.
Ebell, M., J. Siwek, B. D. Weiss, S. H. Woolf, J. Susman, B. Ewigman, and M. Bowman. 2004. Strength of Recommendation Taxonomy (SORT): A patient-centered approach to grading evidence in medical literature. American Family Physician 69(3):548-556.
Eddy, D. M. 2005. Evidence-based medicine: A unified approach. Health Affairs 24(1):9-17.
Fielding, J. E., and P. A. Briss. 2006. Promoting evidence-based public health policy: Can we have better evidence and more action? Health Affairs 25(4):969-978.
Fried, M., M. Farthing, J. Krabshuis, and E. Quigley. 2006. Global guidelines: Is gastroenterology leading the way? Lancet 368(9552):2041-2042.
GRADE Working Group. 2004. Grading quality of evidence and strength of recommendations. BMJ 328(7454):1490.
Gray, B. H. 1992. The legislative battle over health services research. Health Affairs 11(4):38-66.

Gray, B. H., M. K. Gusmano, and S. Collins. 2003. AHCPR and the changing politics of health services research. Health Affairs w3.283.
Grilli, R., N. Magrini, A. Penna, G. Mura, and A. Liberati. 2000. Practice guidelines developed by specialty societies: The need for critical appraisal. Lancet 355:103-106.
Grimshaw, J. M., and I. T. Russell. 1993. Effect of clinical guidelines on medical practice: A systematic review of rigorous evaluations. Lancet 342:1317-1322.
Guirguis-Blake, J., N. Calonge, T. Miller, A. Siu, S. Teutsch, and E. Whitlock for the U.S. Preventive Services Task Force. 2007. Current processes of the U.S. Preventive Services Task Force: Refining evidence-based recommendation development. Annals of Internal Medicine 0000605-200707170-200700170.
Guyatt, G., D. Gutterman, M. H. Baumann, D. Addrizzo-Harris, E. M. Hylek, B. Phillips, G. Raskob, S. Z. Lewis, and H. Schünemann. 2006a. Grading strength of recommendations and quality of evidence in clinical guidelines: Report from an American College of Chest Physicians task force. Chest 129(1):174-181.
Guyatt, G., G. Vist, Y. Falck-Ytter, R. Kunz, N. Magrini, H. Schünemann, and R. Elena. 2006b. An emerging consensus on grading recommendations? ACP Journal Club A8-A9.
Harris, G., B. Carey, and J. Roberts. 2007. Psychiatrists, children and drug industry's role. The New York Times, "Health" http://www.nytimes.com/2007/05/10/health/10psyche.html?ei=5070&en=a90e19408d5df0cd&ex=1188964800&adxnnl=1&adxnnlx=1188845745-KhpszR5rlQnw7Dxp8FR0Pw (accessed May 10, 2007).
Hasenfeld, R., and P. G. Shekelle. 2003. Is the methodological quality of guidelines declining in the U.S.? Comparison of the quality of U.S. Agency for Health Care Policy and Research (AHCPR) guidelines with those published subsequently. Quality and Safety in Health Care 12(6):428-434.
Hunt, S. A., D. W. Baker, M. H. Chin, M. P. Cinquegrani, A. M. Feldman, G. S. Francis, T. G. Ganiats, S. Goldstein, G. Gregoratos, M. L. Jessup, R. J. Noble, M. Packer, M. A. Silver, L. W. Stevenson, R. J. Gibbons, E. M. Antman, J. S. Alpert, D. P. Faxon, V. Fuster, G. Gregoratos, A. K. Jacobs, L. F. Hiratzka, R. O. Russell, and S. C. Smith, Jr. 2001. ACC/AHA guidelines for the evaluation and management of chronic heart failure in the adult: Executive summary a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines (Committee to Revise the 1995 Guidelines for the Evaluation and Management of Heart Failure). Circulation 104(24):2996-3007.
IOM (Institute of Medicine). 1990. Clinical practice guidelines: Directions for a new program. Edited by Field, M. J., and K. N. Lohr. Washington, DC: National Academy Press.
IOM. 1992. Guidelines for clinical practice: From development to use. Edited by Field, M. J., and K. N. Lohr. Washington, DC: National Academy Press.
Jackson, R., and G. Feder. 1998. Guidelines for clinical guidelines. BMJ 317(7156):427-428.
Kahan, J. P., R. E. Park, L. L. Leape, S. J. Bernstein, L. H. Hilborne, L. Parker, C. J. Kamberg, D. J. Ballard, and R. H. Brook. 1996. Variations by specialty in physician ratings of the appropriateness and necessity of indications for procedures. Medical Care 34(6):512-523.
Kassirer, J. P. 2007. Chapter 7: Medicine's obsession with disclosure of financial conflicts: Fixing the wrong problem. In Science and the media: Delgado's brave bulls and the ethics of scientific disclosure. Edited by Snyder, P. J., L. Mayes, and D. Spencer. San Diego, CA: Academic Press, an imprint of Elsevier, Inc.
Miller, J., and J. Petrie. 2000. Development of practice guidelines. Lancet 355(9198):82-83.
Murphy, M. K., N. A. Black, D. L. Lamping, C. M. McKee, C. F. B. Sanderson, J. Askham, and T. Marteau. 1998. Consensus development methods, and their use in clinical guideline development. Health Technology Assessment 2(3):i-88.
NGC (National Guideline Clearinghouse). 2007a. About inclusion criteria http://www.guideline.gov/about/inclusion.aspx (accessed August 17, 2007).

NGC. 2007b. Guideline index http://www.guideline.gov/browse/guideline_index.aspx (accessed September 14, 2007).
NGC. 2007c. NGC browse: organizations http://www.guideline.gov/browse/browseorgsbyLtr.aspx?Letter=* (accessed June 2, 2007).
NGC. 2007d. Search for cardiology http://www.guideline.gov/search/searchresults.aspx?Type=3&txtSearch=cardiology&num=500 (accessed July 11, 2007).
NGC. 2007e. Search for hypertension http://www.guideline.gov/search/searchresults.aspx?Type=3&txtSearch=hypertension&num=500 (accessed August 12, 2007).
NGC. 2007f. Search for stroke http://www.guideline.gov/search/searchresults.aspx?Type=3&txtSearch=stroke&num=500 (accessed August 12, 2007).
O'Connor, P. J. 2005. Adding value to evidence-based clinical guidelines. JAMA 294(6):741-743.
O'Malley, A. S., H. H. Pham, and J. D. Reschovsky. 2007. Predictors of the growing influence of clinical practice guidelines. Journal of General Internal Medicine 22(6):742.
Perlin, J. B., and J. Kupersmith. 2007. Information technology and the inferential gap. Health Affairs 26(2):w192-w194.
Reinertsen, J. L. 2003. Zen and the art of physician autonomy maintenance. Annals of Internal Medicine 138(12):992-995.
Ricci, S., M. G. Celani, and E. Righetti. 2006. Development of clinical guidelines: Methodological and practical issues. Neurological Sciences 27(3):S228-S230.
Schünemann, H. J., D. Best, G. Vist, A. D. Oxman, and the GRADE Working Group. 2003. Letters, numbers, symbols and words: How to communicate grades of evidence and recommendations. Canadian Medical Association Journal 169(7):677-680.
Schünemann, H. J., A. Fretheim, and A. D. Oxman. 2006. Improving the use of research evidence in guideline development: 9. Grading evidence and recommendations. Health Research Policy and Systems 4(21).
Schwartz, J. S. 1984. The role of professional medical societies in reducing practice variations. Health Affairs 3(2):90-101.
Shaneyfelt, T. M., M. F. Mayo-Smith, and J. Rothwangl. 1999. Are guidelines following guidelines?: The methodological quality of clinical practice guidelines in the peer-reviewed medical literature. JAMA 281(20):1900-1905.
Shekelle, P. G., S. H. Woolf, M. Eccles, and J. Grimshaw. 1999. Clinical guidelines: Developing guidelines. BMJ 318(7183):593-596.
Shekelle, P. G., E. Ortiz, S. Rhodes, S. C. Morton, M. P. Eccles, J. M. Grimshaw, and S. H. Woolf. 2001. Validity of the Agency for Healthcare Research and Quality clinical practice guidelines: How quickly do guidelines become outdated? JAMA 286(12):1461-1467.
Shiffman, R. N., P. Shekelle, M. Overhage, J. Slutsky, J. Grimshaw, and A. M. Deshpande. 2003. Standardized reporting of clinical practice guidelines: A proposal from the Conference on Guideline Standardization. Annals of Internal Medicine 139(6):493-500.
Stelfox, H. T. 1998. Conflict of interest in the debate over calcium-channel antagonists. New England Journal of Medicine 338(2):101-106.
Stewart, W. F., N. R. Shah, M. J. Selna, R. A. Paulus, and J. M. Walker. 2007. Bridging the inferential gap: The electronic health record and clinical evidence. Health Affairs 26(2):w181-w191.
Thomson, R., H. McElroy, and M. Sudlow. 1998. Guidelines on anticoagulant treatment in atrial fibrillation in Great Britain: Variation in content and implications for treatment. BMJ 316(7130):509-513.
Tierney, W. M. 2001. Improving clinical decisions and outcomes with information: A review. International Journal of Medical Informatics 62:1-9.
Tonelli, M. R. 2007. Conflict of interest in clinical practice. Chest 132(2):664-670.

Tunis, S. R., R. S. A. Hayward, M. C. Wilson, H. R. Rubin, E. B. Bass, M. Johnston, and E. P. Steinberg. 1994. Internists' attitudes about clinical practice guidelines. Annals of Internal Medicine 120(11):956-963.
Weinstein, J. N., K. Clay, and T. S. Morgan. 2007. Informed patient choice: Patient-centered valuing of surgical risks and benefits. Health Affairs 26(3):726-730.
Wennberg, J. E. 2004. Perspective: Practice variations and health care reform: Connecting the dots. Health Affairs var.140.
Woolf, S. H., and D. Atkins. 2001. The evolving role of prevention in health care: Contributions of the U.S. Preventive Services Task Force. American Journal of Preventive Medicine 20(3, S1):13-20.
Woolf, S. H., R. Grol, A. Hutchinson, M. Eccles, and J. Grimshaw. 1999. Clinical guidelines: Potential benefits, limitations, and harms of clinical guidelines. BMJ 318(7182):527-530.


Knowing What Works in Health Care: A Roadmap for the Nation (2008)

Chapter: 5 Developing Trusted Clinical Practice Guidelines
