Trustworthy Clinical Practice Guidelines: Challenges and Potential
Abstract: This chapter examines ongoing challenges surrounding the current clinical practice guideline (CPG) development process that diminish the quality and trustworthiness of guidelines for clinicians and the public. These challenges include limitations in the scientific evidence on which CPGs are based, lack of transparency of development groups’ methodologies, questions about how to reconcile conflicting guidelines, and conflicts of interest among guideline development group members and funders. The committee explored the literature devoted to empirical assessments of guideline development methodologies, and an array of guideline quality appraisal instruments. Although guideline quality has improved over the past several decades, improvement has been too slow, and the quality of many guidelines remains subpar. Furthermore, past and current quality appraisal instruments do not sufficiently address all components of the guideline development process, particularly the rating of evidence quality and recommendation strength, nor are they intended for prospective application to the development of high-quality, trustworthy guidelines.
Efforts to promote high-quality development of clinical practice guidelines (CPGs) have met challenges and controversy. The following section describes issues undermining the trustworthiness and impact of CPGs (illustrated in the case studies presented in Boxes 3-1, 3-2, and 3-3), including many associated with the guideline development process. These issues include limitations in the scientific evidence on which CPGs are based; lack of transparency of development groups’ methodologies, especially in deriving recommendations and determining their strength; conflicting guidelines; and challenges of conflict of interest (COI). Additional factors threatening CPG trustworthiness and influence are reflected in tensions between guideline developers and users over balancing the desire for evidence-based recommendations with clinicians’ desire for guidance in clinical situations in which great uncertainty exists. Resource limitations in guideline development and updating present further challenges to the promise of high-quality, effective guidelines. Overall, though researchers have reported empirical evidence of modest gains in guidelines’ quality, there is substantial room for improvement (Shaneyfelt and Centor, 2009). The committee did not identify comprehensive and adequate standards for development of unbiased, scientifically valid, and trustworthy CPGs. Hence, the committee formulated and proposed new standards for developing trustworthy CPGs, as explained in the following two chapters.
DEVELOPMENT OF EVIDENCE-BASED CPGs
Concerns Regarding Bias, Generalizability, and Specificity
Appreciation for evidence-based medicine has grown over the past several decades, due in part to increased interest in and funding for clinical practice research, and improvements in associated research methodologies. However, many CPG experts and practicing clinicians increasingly regard the scientific evidence base with suspicion for a variety of reasons, including gaps in evidence, poor-quality research and systematic reviews, biased guideline developers, and the dominance of industry-funded research and guideline development. A 2005 study found that industry sponsored approximately 75 percent of clinical trials published in The Lancet, New England Journal of Medicine, and Journal of the American Medical Association (The House of Commons Health Committee, 2005). Two-thirds of this industry-sponsored published research is conducted directly by profit-making research companies and one-third by academic medical centers. Furthermore, even high-quality commercial clinical investigations (e.g., those included in Cochrane Reviews) are 5.3 times more likely to endorse their sponsors’ products than non-commercially funded studies of identical products (Als-Nielsen et al., 2003). Much of this industry-sponsored research is conducted for Food and Drug Administration approval of medications. Approval requires careful evaluation and demonstrated efficacy for given indications. However, there are important limitations on the meaning of such approval for clinical practice. Because preapproval studies designed by a drug’s manufacturer often follow patients for relatively brief periods of time, involve comparatively small numbers of younger and healthier patients than the drug’s target population, may rely on comparison with placebo only, and often use surrogate endpoints, the value of these studies for the development of useful CPGs can be severely limited (Avorn, 2005).
Guideline developers and users emphasize that guideline recommendations should be based on only the most methodologically rigorous evidence, whether in the form of randomized controlled trials (RCTs) or observational research, an area in which current guidelines often fall short (Coates, 2010; Koster, 2010). However, even when studies are considered to have high internal validity, they may not be generalizable to or valid for the patient population of guideline relevance. Randomized trials commonly underrepresent important subgroups, including those with comorbidities, older persons, racial and ethnic minorities, and low-income, less educated, or low-literacy patients. Many RCTs and observational studies fail to include such “typical patients” in their samples; even when they do, there may not be sufficient numbers of such patients to assess them separately, or the subgroups may not be properly analyzed for differences in outcomes. Investigators often require that patients have new disease onset, have no or few comorbid conditions, and/or be relatively young and sociodemographically homogeneous (Brown, 2010). A 2007 evaluation of the quality of evidence underlying therapeutic recommendations for cardiovascular risk management found that only 28 percent of 369 recommendations (in 9 national guidelines) were supported by high-quality evidence. The most frequent reason for downgrading the quality of RCT-based evidence was concern about extrapolating from highly selected RCT populations to the general population (McAlister et al., 2007). Failure to include major population subgroups in the evidence base thwarts our ability to develop clinically relevant, valid guidelines (Boyd, 2010). A 2005 study found that seven of nine guidelines studied did not modify or discuss the applicability of recommendations for older patients with multiple morbidities (Boyd et al., 2005).
Lack of Transparency in Recommendations’ Derivation
A second criticism of the current state of CPG development is lack of transparency in deriving and rating the strength of recommendations. Representatives from Kaiser Permanente and Partners Healthcare in Massachusetts, who evaluate and use guidelines in patient care, noted that the major weaknesses of CPGs were wide variation in the transparency of guideline development processes and products and omission of any description of processes for consensus-based recommendations (particularly when evidence is absent or poor). The 2006 investigation by Connecticut’s attorney general into the Infectious Diseases Society of America’s Lyme Disease Guidelines (Box 3-1) is illustrative. Although commentaries have described this case as a “politicization of professional practice guidelines” (Kraemer and Gostin, 2009, p. 665), with the attorney general “[substituting] his judgment for that of medical professionals” (Ferrette, 2008, p. 2), it highlights the need for standardization and transparency in all aspects of systematic data collection and review, committee administration, and guideline development, so that these issues do not detract from the science. Guideline development groups (GDGs) must be aware of the many, varied observers who will scrutinize their development processes, particularly when their recommendations are likely to be controversial.
Although certain empirical evidence indicates guideline developers mostly have adopted the practice of rating the strength of evidence and recommendations (158, or 77 percent, of the 204 developers represented in the National Guideline Clearinghouse, or NGC, use some sort of rating scheme), roughly 70 percent (142 of 204 developers) do not identify the origins of their rating schemes and appear to be using schemes unique to their organizations (Coates, 2010). Although many GDGs claim that their recommendations are informed by a systematic review of the evidence, few include the details of their evidence reviews in their guidelines, leaving many users skeptical of their claims. Furthermore, a large percentage of guidelines submitted to the NGC are “vague” and “ambiguous” and lacking in “explicit recommendations” (Coates, 2010). Even given a diversity of backgrounds and perspectives (i.e., guideline methodologists from medical specialty societies, practicing clinicians, payers, and representatives from integrated health systems), the committee found broad consensus among stakeholders urging guideline developers to articulate clearly the full evidentiary rationale in support of recommendations, as well as the methods used to derive recommendation strength (Bottles, 2010; Coates, 2010; Jacques, 2010; Koster, 2010).
BOX 3-1
Infectious Diseases Society of America Lyme Disease Guidelines (2006)
In a fall 2006 practice guideline, the Infectious Diseases Society of America (IDSA) addressed the controversial diagnosis of chronic Lyme disease: “There is no convincing biologic evidence for the existence of symptomatic chronic B. burgdorferi infection among patients after receipt of recommended treatment regimens for Lyme disease. Antibiotic therapy has not proven to be useful and is not recommended for patients with chronic (≥ 6 months) symptoms after recommended treatment regimens for Lyme disease (E-1).” Here, E denotes a recommendation strongly against an action and 1 refers to evidence from one or more properly randomized, controlled trials (Wormser et al., 2006, p. 1094).
Concerned the new IDSA guidelines would impact insurance reimbursements, advocacy groups immediately objected, citing concerns about the IDSA guideline development group’s bias and an incomplete review of the data (Johnson and Stricker, 2009). In November 2006, Connecticut Attorney General Richard Blumenthal, who was personally active in support of Lyme disease advocates (McSweegan, 2008), launched an antitrust investigation of the IDSA, alleging that the broad ramifications of its guidelines require it to use a fair, open development process free from conflicts of interest (Johnson and Stricker, 2009).
At the culmination of its investigation, the Connecticut Attorney General’s Office (AGO) questioned the objectivity of the process by which the guideline review committee was selected, the lack of opportunity for interested third parties to provide input, and conflicts of interest of committee members—despite disclosure in the guideline document (Connecticut Attorney General’s Office, 2008). In addition, the AGO expressed concern that several IDSA Committee members had concomitantly served on another panel, for the American Academy of Neurology, which discussed and issued a related “Practice Parameter” about chronic Lyme disease in 2007 (Halperin et al., 2007). Refuting these claims, the IDSA maintained that committee members were chosen based on clinical and scientific expertise and that the guideline represented a thorough, peer-reviewed analysis of all available literature and resources (IDSA, 2008; Klein, 2008). Although the guideline does not describe processes for committee selection and guideline development, the document did grade both the strength of its recommendations and evidence quality using a standard scale.
Following nearly 18 months of investigation and $250,000 in legal fees (Klein, 2008), the IDSA entered into a non-punitive agreement with the Attorney General’s Office, voluntarily committing to a one-time structured review of its 2006 guidelines to, according to IDSA President Dr. Donald Poretz, “put to rest any questions about them” (IDSA, 2008, online). In summer 2009, the new committee, with the oversight of a jointly appointed ombudsman, met to gather additional evidence for its guideline review, and shortly thereafter unanimously agreed to uphold the 2006 guideline recommendations (IDSA, 2010).
To many CPG users, one of the most pressing problems in the current CPG landscape is the existence of conflicting guidelines within many disease categories. The Centers for Medicare & Medicaid Coverage and Analysis Group director told the committee, “We are also challenged with dueling guidelines across specialties. What do you do when you have the interventional radiologist versus the surgeon versus medical management?” (Jacques, 2010). For example, in 2008 the U.S. Preventive Services Task Force (USPSTF) and a joint panel of the American Cancer Society, the U.S. Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology (ACS-MSTF-ACR) published colorectal cancer screening guidelines with divergent recommendations on screening objectives and detection modalities (described in Box 3-2) within 6 months of one another. This example illustrates how the composition and interests of a GDG may impact its decision making. The ACS-MSTF-ACR guideline development group, composed primarily of gastroenterologists and radiologists, placed higher priority on newer tests most often utilized by the specialties represented on the panel, while the USPSTF, composed exclusively of generalists and methodologists, recommended otherwise. Consequently, some outside observers worry that the recommendations of these groups are predictable, based on the committee members’ interests rather than the evidence. No system is currently accepted for achieving consensus among conflicting sets of guidance. Conflicting guidelines most often result when evidence is weak; when developers differ in their approach to evidence reviews (systematic vs. nonsystematic) or in evidence synthesis or interpretation; and/or when developers have varying assumptions about intervention benefits and harms. Conflict of interest (discussed more fully below) may also play a role, and value judgments inevitably influence translation of scientific evidence to clinical recommendations (IOM, 2009). The NGC has identified at least 25 different conditions in which conflicting guidelines exist (Coates, 2010).

BOX 3-2
Colorectal Cancer Screening Guidelines (2008)

In 2008, independent colorectal cancer screening guidelines were published by the U.S. Preventive Services Task Force (USPSTF) and a joint panel of the American Cancer Society, the U.S. Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology (ACS-MSTF-ACR). Although published within 6 months of each other, the two guidelines offer divergent recommendations about the goals of screening as well as the use of specific diagnostic modalities (Goldberg, 2008; Pignone and Sox, 2008). The ACS-MSTF-ACR joint guideline’s support for newer technologies such as stool DNA testing and CT colonography, as well as its prioritization of “structural” diagnostic modalities such as colonoscopy, contrasts with the USPSTF’s statement, which did not recommend their use, resulting in confusion among physicians and patients (Goldberg, 2008).

Differences in development methodologies and committee composition likely contribute to the divergence (Imperiale and Ransohoff, 2010; Pignone and Sox, 2008). To inform its work, the USPSTF drew on the findings of a commissioned systematic review and benefit/risk simulation modeling (Pignone and Sox, 2008). The USPSTF methods were predefined, rigorous, and quantitative, and they separated the systematic review process from that of guideline development (Imperiale and Ransohoff, 2010). However, Pignone and Sox (2008, p. 680) describe “some surprising choices” and missing analyses (e.g., cost per quality-adjusted life year [QALY]) in the data modeling. Of the USPSTF’s processes, ACS panelist Tim Byers noted in The Cancer Letter: “Even though they say this is a systematic review and it’s quantitative and it’s all very orderly, some of those key aspects are judgment calls” (Goldberg, 2008, p. 3). In the joint ACS-MSTF-ACR guideline, the panel only briefly describes its evidence review method and offers no insight into its consensus-building process. Imperiale and Ransohoff (2010) report that “the process of evidence review was not clearly separated from the process of guidelines-making” and that “no pre-stated process [was] used to translate evidence into recommendations, nor was the strength of recommendations graded” (Imperiale and Ransohoff, 2010, p. 5). The joint ACS-MSTF-ACR guideline document codifies two guiding principles that informed its recommendations: (1) the importance of one-time test sensitivity (e.g., a requirement that a test achieve > 50 percent sensitivity with a single use), given poor adherence to lower sensitivity program approaches, and (2) the primacy of colon cancer prevention in screening efforts (Levin et al., 2008). Commentaries on the guideline raise concerns about oversimplifications inherent in these decisions (Imperiale and Ransohoff, 2010) and note that this is the only guideline in which the American Cancer Society has adopted and expressed such guiding principles (Goldberg, 2008).

The USPSTF panel was composed of generalist physicians and methodologists (Imperiale and Ransohoff, 2010); the ACS-MSTF-ACR committee consisted of medical specialists and experts in the fields of radiology, gastroenterology, and oncology (Bottles, 2010; Goldberg, 2008). Bernard Levin, a member of the joint panel, remarked in The Cancer Letter, “It is extremely hard to bring disparate professional groups together, to have them operate totally out of objectivity, not because they are bad people, but because they see the world through different lenses. Everyone, in some respects, has their vested interests” (Bottles, 2010; Goldberg, 2008, p. 3; Jacques, 2010). Such sentiments have been echoed in multiple commentaries relating to clinical practice guidelines, with authors recognizing that bias extends beyond financial interests to include intellectual and emotional interests as well (Lederer, 2007). As of March 2010, no updates had been made to the guidelines of either organization.
Conflict of Interest
Conflict of interest among guideline developers continues to be a worrisome area for guideline users. Public forum testimony to the committee indicated that COI is particularly concerning to many types of stakeholders. One example that captured media and public attention involves the direct financial or research ties that development panelists had with the drug manufacturer that funded the National Kidney Foundation’s Kidney Disease Outcomes Quality Initiative anemia management guidelines (depicted in Box 3-3). Recent research findings provide further evidence of the pervasiveness of COI in guideline development. Choudhry et al. (2002) surveyed 100 individual authors across 37 guidelines and found that 87 percent had a financial relationship with industry and 59 percent had financial relationships with companies whose products were considered in a guideline (of the 59 percent, 64 percent received speaking honoraria and 38 percent were company employees or consultants). The majority of respondents reported no discussion or disclosure of financial relationships with industry among panel participants during the guideline development process (Choudhry et al., 2002). A 2008 analysis of NGC guideline summaries found that 47 percent indicated “Not stated” in response to a financial disclosure/conflict of interest query. The proportion of summaries including information on financial relationships or COI increased from just over 20 percent to approximately 50 percent from 1999 to 2006 (Tregear, 2007). Chapter 4 discusses strategies for managing COI used by organizations such as the American College of Cardiology and American Heart Association (ACC/AHA), the American Thoracic Society, the USPSTF, and the American College of Physicians, and recommends a practice standard.

BOX 3-3
National Kidney Foundation’s Kidney Disease Outcomes Quality Initiative Anemia Management Guidelines (2006)

As one of Medicare’s largest pharmaceutical expenses—costing $1.8 billion in 2007 (USRDS, 2009)—erythropoietin has attracted widespread attention (Steinbrook, 2007). Recombinant erythropoietin stimulates receptors in the bone marrow, increasing red blood cell production and offering a “natural” treatment for anemia (low blood hemoglobin), a common consequence of chronic kidney disease (CKD) (NKF, 2006).

In 2006, when the National Kidney Foundation (NKF) published a new series of Kidney Disease Outcomes Quality Initiative (KDOQI) guidelines, the ideal hemoglobin target for CKD was unclear. The prior KDOQI guidelines, published in 2001, had recommended a range of 11–12 g/dL, striking a balance between the improved quality of life and medical benefits resulting from correction of very low hemoglobin levels and the uncertain value of raising hemoglobin levels more significantly (NKF, 2001). The 16-person 2006 KDOQI Anemia Work Group, citing “insufficient evidence” to produce a guideline, instead issued a clinical practice recommendation about the upper limit of its hemoglobin range: “In the opinion of the Work Group, there is insufficient evidence to recommend routinely maintaining [hemoglobin] levels at 13 g/dL or greater in (Erythropoiesis Stimulating Agents) ESA-treated patients” (NKF, 2006, p. S33). As Coyne (2007b, p. 11) wrote, in the 2006 KDOQI guidelines, “the upper hemoglobin limit was increased to 13 g/dL, despite the lack of sufficient evidence that a hemoglobin target of 12 to 13 g/dL is as safe or results in a significant increase in the quality of life compared with 11 to 12 g/dL.”

According to the Work Group, the widened target (now 11–13 g/dL) would be more practical for physicians and patients, although the Work Group cautioned against the medical risks of routinely exceeding the recommended upper bound (NKF, 2006). The guideline document further described a need for additional data and referenced two recently completed, applicable randomized controlled trials (RCTs) whose data had not yet been used in guideline development because they had not yet been published (NKF, 2006).

A 2006 Cochrane review—published after the guidelines, but based on the same literature available to the KDOQI panel—found no all-cause mortality benefit to raising hemoglobin levels to ≥ 13.3 g/dL, compared with 12 g/dL (Strippoli et al., 2006). Within the year, data from two large-scale RCTs and a meta-analysis demonstrated no cardiovascular benefit and an increased risk of adverse events and all-cause mortality associated with maintenance of CKD patients at levels between 12 and 16 g/dL (Drueke et al., 2006; Phrommintikul et al., 2007; Singh et al., 2006). Seeing the discrepancy between these data and the recently released KDOQI guidelines, several critics questioned the timing of the new KDOQI guideline release, the “rules of evidence” used by the KDOQI Work Group, and the significant industry sponsorship and conflicts of interest of Work Group members (Coyne, 2007a,b; Ingelfinger, 2007; Steinbrook, 2006).

Specifically, Coyne (2007a) questioned the KDOQI Anemia Work Group’s decision to release the updated guidelines in early 2006, in the absence of new definitive data, especially when results from two highly applicable, large-scale RCTs were expected shortly. In addition, in an editorial accompanying the publication of the two RCTs in the New England Journal of Medicine, Remuzzi and Ingelfinger (2006, p. 2144) noted that the NKF–KDOQI guidelines were “not based on persuasive randomized, controlled trials,” and other authors questioned the Work Group’s decision not to review unpublished data and abstracts in the review process (Coyne, 2007a; IOM, 2008). Coyne (2007a) also raised significant concerns about conflicts of interest among the NKF’s KDOQI Anemia Work Group because the guidelines were published bearing the logo of Amgen (a major U.S. manufacturer of erythropoietin and the KDOQI’s “founding and principal sponsor”) on the front cover. The majority of panelists had direct financial or research-based ties with erythropoietin manufacturers or marketers (Steinbrook, 2006), and the Wall Street Journal reported significant financial support for the committee’s work by Amgen (Armstrong, 2006).

In a defense of the guideline development process, Van Wyck et al. (2007, p. 8) emphasized the “scientific and methodological rigor” of the guideline development, including standardization of evidentiary review, an intensive internal and public two-stage review process, and full conflict-of-interest disclosure and formal restrictions on members’ contacts with sponsors during guideline development.

In light of the new evidence, the KDOQI guidelines were reissued in 2007, recommending an upper-bound hemoglobin target of 13 g/dL (Levin and Rocco, 2007).
Funding and Resource Limitations
Funding and resource limitations remain top concerns for many CPG developers, as reported to the committee. According to a 2003 international survey of guideline developers, the average budget for a single guideline developed in the United States was $200,000, not including dissemination costs, which could reach an additional $200,000 per guideline (Burgers et al., 2003a). Guideline developers experience many resource and time constraints, and many entities cannot afford to undertake in-depth evidence syntheses. Additionally, obtaining non-conflicted sources of funding for development and updating of CPGs remains a major challenge: maintaining autonomy from industry funding, securing funding for staff and other support, and finding public grant opportunities are prevailing difficulties (Coates, 2010; Fochtmann, 2010).
Contrasting Viewpoints on Scope and Purpose
Guideline developers’ and users’ views on the purpose and scope of guidelines may create tensions over the best way to approach development. Certain developers advocate restricting guideline development and recommendations to clinical domains associated with strong evidence, which would result in far fewer guidelines (Lewis, 2010). By contrast, many medical specialty society members and payers seek guidance, particularly in contentious clinical areas, even when evidence is scarce (Milliman Inc., 2009). Some guideline developers have attempted to accommodate both needs by distinguishing CPG recommendations based on high-quality evidence from statements about practice based on expert opinion. However, this approach often has led to confusion among users who are unaware of the varying systems for evidence rating, and some developers continue to struggle with how to differentiate low- to mid-level evidence-based recommendations from those of high quality without compromising guideline usability.
A second tension reflects debate about whether guidelines should detail recommendations for the full continuum of care for a condition or set of conditions, or focus on a few key recommendations that are well supported by evidence, easy to implement, and perhaps able to be translated into quality measures. Guidelines traditionally have followed the former model, with developers often taking pride in the level of detail and work associated with their products. However, time and resource constraints are significant factors because comprehensive documents take years to complete and costs are high (Lewis, 2010).
EMPIRICAL ASSESSMENT OF GUIDELINE DEVELOPMENT METHODOLOGY
A literature devoted to assessing the methodological quality of CPG development reveals uneven progress over the past 25 years. A 1999 study by Shaneyfelt and colleagues evaluated 279 CPGs from 69 developers, published from 1985 to 1997. The authors found the methodological quality of guidelines improved from 1985 to 1997, as the overall percentage of quality indicators (located in Appendix C) satisfied by all developers increased from 36.9 to 50.4 percent. The greatest advance occurred in development and format (41.5 percent satisfied quality criteria before 1990, 55.9 percent after 1995), and there was little change in evaluation of evidence (34.6–36.1 percent from before 1990 to after 1995). Modest improvements were found in formulation of recommendations (42.8–48.4 percent). A relatively low percentage of guideline developers described formal methods of combining scientific evidence and expert opinion (less than 10 percent) or specified how evidence was identified (less than 20 percent). Moreover, at least one quarter failed to cite any literature basis. Finally, while 89.6 percent specified patient or practice characteristics justifying acceptance or rejection of recommendations, only 21.5 percent detailed the precise role of patient preferences in decision making, and only 6 percent of guidelines described values applied by developers in formulating recommendations (Shaneyfelt et al., 1999).
Similarly, Grilli and colleagues (2000) examined the methodological quality of 431 guidelines developed by medical specialty societies between 1988 and 1998. Overall, most guidelines did not meet their three assessment criteria: 67 percent reported no description of stakeholders, 88 percent did not report search strategies for published studies, and 82 percent did not explicitly rate the strength of recommendations. All three criteria were met by only 22 guidelines (5 percent). However, guidelines improved over time with regard to providing search information (from 2 to 18 percent) and explicit grading of evidence (from 6 to 27 percent). The authors concluded that despite evidence of moderate progress, the quality of practice guidelines developed by specialty societies remained unsatisfactory (Grilli et al., 2000).
A 2003 study by Hasenfeld and Shekelle compared the methodological quality of 17 guidelines published by the Agency for Health Care Policy and Research (AHCPR) to subsequent non-AHCPR guidelines published in the same topical areas. The authors state, “In contrast to the findings of Shaneyfelt et al. and Grilli et al. that the methodological quality of guidelines has been improving over time, we found that newer guidelines in the same topic areas as the AHCPR guidelines were sharply and disturbingly poorer in methodological quality than the AHCPR guidelines” (Hasenfeld and Shekelle, 2003, p. 433). Using the Appraisal of Guidelines Research & Evaluation (AGREE) instrument (found in Appendix C), the authors found that, overall, AHCPR guidelines met the most standards, scoring 80 percent or more on 24 of 30 criteria. Non-AHCPR guidelines (updates and adaptations of AHCPR guidelines) scored 80 percent on 14 and 11 of 30 criteria, respectively. All 17 AHCPR guidelines used both multidisciplinary panels and systematic reviews of the literature; by comparison, guidelines updated and adapted by non-AHCPR entities used multidisciplinary panels and systematic reviews 40 and 60 percent of the time, respectively. However, the AHCPR guidelines had lower scores on several criteria: none of the AHCPR guidelines applied and described formal methods of combining evidence or expert opinion (though non-AHCPR guidelines fared only slightly better at 3 percent), only 2 (12 percent) AHCPR guidelines specified expiration dates compared with 12 (40 percent) non-AHCPR guidelines, and finally, 1 (6 percent) AHCPR guideline discussed the role of value judgments in formulating recommendations while 6 (20 percent) of the non-AHCPR guidelines did so (Hasenfeld and Shekelle, 2003).
A 2009 study examining 12 years of Canadian guideline development, dissemination, and evaluation drew similar conclusions of uneven progress. After evaluating 730 guidelines from 1994 to 1999 and 630 from 2000 to 2005, Kryworuchko and colleagues (2009) concluded that over time, developers were more likely to use and publish computerized literature search strategies, and reach consensus via open discussion. Unfortunately, developers were less likely to support guidelines with literature reviews. Kryworuchko et al. concluded that “Guidelines produced more recently in Canada are less likely to be based on a review of the evidence and only half discuss levels of evidence underlying recommendations” (Kryworuchko et al., 2009, p. 1).
Finally, in a 2009 follow-up article to Shaneyfelt and colleagues’ 1999 study, Shaneyfelt and Centor lamented the current state of guideline development, specifically the overreliance on expert opinion and inadequate management of COI, with some examples drawn from recent ACC/AHA guidelines (Shaneyfelt and Centor, 2009). Some of Shaneyfelt and Centor’s conclusions were refuted by past chairs of the ACC/AHA Task Force on Practice Guidelines, who defended the methodological quality of the Task Force’s guidelines and their development policies (Antman and Gibbons, 2009).
STANDARDIZING GUIDELINE DEVELOPMENT QUALITY APPRAISAL
Some studies have demonstrated that clinical practice guidelines can improve care processes and patient outcomes (Fervers et al., 2005; Ray-Coquard et al., 1997; Smith and Hillner, 2001). When rigorously developed, CPGs have the power to translate the complexity of scientific research findings and other evidence into recommendations for clinical care action (Shiffman et al., 2003). However, CPG development is fraught with challenges. Certain characteristics of guidelines can play a vital part in guideline effectiveness (Grol et al., 2005). CPGs of high methodological rigor can enhance healthcare quality (Grimshaw et al., 2004), and low-quality guidelines may degrade it (Shekelle et al., 2000). Although experimental demonstrations are not available to suggest that provision of formal development guidance leads to improved quality of care, observational evidence indicates that CPGs produced within a structured environment, in which a systematic procedure or “Guidelines for Guidelines” is available to direct production, are more likely to be of higher quality (Burgers et al., 2003b; Schünemann et al., 2006). More specifically, non-standardized development results in substantial, troubling variation in clinical recommendations (Beck et al., 2000; Schünemann et al., 2006). Furthermore, guideline development methodology has been shown to be enhanced by appraisal instruments; this has been explained in part by their service as “aide-memoire(s)” for guideline developers (Cluzeau and Littlejohns, 1999).
Hence, the accepted notion is that standards regarding quality should guide CPG development (Feder et al., 1999; Shaneyfelt et al., 1999). Calls are increasing for international standards to hasten rigorous CPG development and appraisal (Grilli et al., 2000; Grol et al., 2003; Shaneyfelt et al., 1999; Shaneyfelt and Centor, 2009). The definition of quality guidelines put forth by the AGREE Collaboration is as follows: “the confidence that the potential biases inherent in guideline development have been addressed adequately and that the recommendations are both internally and externally valid (i.e., supported by evidence and applicable to target populations), and are feasible for practice” (AGREE, 2001, p. 2). This definition has been commonly adopted in the scientific literature (Burgers et al., 2003b; Grol et al., 2003). Although uniformly endorsed standards for quality CPG development do not yet exist, there is widespread agreement regarding basic elements of guideline quality (Schünemann et al., 2006; Shaneyfelt et al., 1999; Turner et al., 2008; Vlayen et al., 2005). This agreement is reflected across multiple, varied sources, including detailed procedures for guideline development or “handbooks” produced by governments (AHRQ, 2008; New Zealand Guidelines Group, 2001; NICE, 2009; SIGN, 2008); professional organizations such as ACC/AHA, American College of Chest Physicians, and American Thoracic Society (ACCF and AHA, 2008; Baumann et al., 2007; Schünemann et al., 2009); and individual leaders in the field (Rosenfeld and Shiffman, 2009).
Overall, development handbooks address the following central elements of the guideline development process: establishment of a multidisciplinary guideline development group, consumer involvement, identification of clinical questions or problems, systematic searches and appraisal of research evidence, procedures for drafting recommendations, external consultation, and ongoing review and update (Turner et al., 2008).
Moreover, a number of taxonomies have been devised for the purposes of guideline methodology quality appraisal and/or improved reporting of guideline development processes (AGREE, 2003; Brouwers et al., 2010; Cluzeau et al., 1999; IOM, 1992; Shaneyfelt et al., 1999; Shiffman et al., 2003). The IOM published the first CPG appraisal instrument (found in Appendix C) in its 1992 report
Guidelines for Clinical Practice: From Development to Use. In a 2005 systematic review of CPG appraisal instruments, Vlayen and colleagues reported that since 1995, 22 appraisal tools have been designed in 8 countries: 6 in the United States, 5 in Canada, 4 in the United Kingdom, 2 each in Australia and Italy, and 1 each in France, Germany, and Spain. Eleven of these instruments are based on the original IOM tool, while several others arose from Hayward et al. (1993) or Cluzeau (1999) (both described in Appendix C) (Cluzeau et al., 1999; Hayward et al., 1993). The tools vary in number (3–52) of guideline attributes considered, availability and form (qualitative or numeric) of scoring systems, and whether they have been subject to validation.
The majority of CPG appraisal tools have been published in peer-reviewed journals (Vlayen et al., 2005). In essence, they share many commonalities captured within the generic AGREE instrument (and the updated AGREE II, which contains the same domains [Brouwers et al., 2010]), which has been widely adopted (Rosenfeld and Shiffman, 2009) and measures the following domains and dimensions of quality development:
Explicit scope and purpose: The overall objective(s), clinical questions, and target population are explicated.
Stakeholder involvement: Patient(s) are involved in guideline development and all audiences are defined clearly and involved in pilot-testing.
Rigor of development: Recommendations are linked explicitly to supporting evidence and there is discussion of health benefits or risks; recommendations are reviewed externally before publication, and the development group provides details of updating.
Clarity of presentation: Recommendations are not ambiguous and do consider different possible options; key recommendations are easily identified; and a summary document and patient education materials are provided.
Applicability: Organizational changes and cost implications of applying recommendations and review criteria for monitoring guidelines use are explicated.
Editorial independence: Views or interests of the funding body have not influenced final recommendations; members of the guideline group have declared possible conflicts of interest. (Increased detail of the AGREE instrument and its relatives is provided within Appendix C). The IOM has asserted that each such attribute “affects the likelihood that guidelines will be perceived as trustworthy and useable
or the probability that they will, if used, help achieve the desired health outcomes” (Graham et al., 2000).
As discussed earlier in this chapter, the methodological quality of CPGs has been unreliable and has advanced unsatisfactorily for decades, despite the existence of guideline appraisal tools such as AGREE. Specifically, empirical evidence indicates that the quality of guidelines’ development processes suffers from numerous weaknesses across the established quality domains described above (Grilli et al., 2000; Hasenfeld and Shekelle, 2003; Kryworuchko et al., 2009; Shaneyfelt et al., 1999). Furthermore, existing CPG development appraisal instruments do not capture all relevant quality domains. For example, in Vlayen’s systematic review, one quarter of instruments omitted transparency and external review. One quarter excluded certain dimensions of derivation and rating of recommendations, such as patient preferences and patient exclusions. One quarter neglected to address updating, and one quarter failed to include the multidisciplinary composition of a guideline development team. Approximately 85 percent did not capture implementation feasibility, and more than 40 percent excluded details of recommendation articulation such as wording clarity. With the exception of the Conference on Guideline Standardization (COGS), none of the tools specifies numeric appraisal of the evidentiary foundations for clinical recommendations (Vlayen et al., 2005). Graham’s complementary review of guideline appraisal instruments asserts that, overall, there appears to be little evidence underlying the inclusion of most theoretical domains reflected in the instruments; direct empirical underpinnings are omitted from accompanying documentation (Graham et al., 2000).
It is important to underscore that this body of guideline appraisal tools overwhelmingly focuses on process and format. Only a small number attend to particulars of guideline clinical content and clinical value (e.g., quality of evidence and strength of recommendations) (Graham et al., 2000). Moreover, as appraisal and reporting tools, they are designed for retrospective assessment and documentation of released guidelines rather than prospective application to development of high-quality, trustworthy CPGs; no agreed-on standard has been put forth for such prospective use (Shiffman et al., 2003). As further elaboration, COGS is limited to reporting of the guideline development process, as its authors attest: “[COGS] is not intended to dictate a particular guideline development methodology . . . we believe that COGS can be used most effectively to identify necessary components that should be documented in guidelines but should not be used (alone) to judge guideline quality or adequacy” (Shiffman et al., 2003, pp. 495–496). The AGREE instrument is the only instrument to have been validated, and it has gained wide acceptance; however, it assesses only the quality of reporting and the quality of “some aspects of recommendations” (AGREE, 2001, p. 2), and, like its peers, it falls short specifically with regard to evidence quality appraisal (Vlayen et al., 2005). “The AGREE instrument is designed to assess the process of guideline development and how well it is reported. It does not assess the clinical content of the guideline nor the quality of evidence that underpins the recommendations” (AGREE, 2003, p. 18).
Complementing empirical study of the validity of quality appraisal tools and description of their adoption is a literature devoted to the validity of individual quality components within these tools. This work sheds further light on the nuances of deficiencies inherent in the state of the art of CPG development, with emphatic attention given to subtle dimensions (e.g., explicit scope and purpose, applicability, editorial independence) and operational details of methodological quality criteria, including conflict of interest, the role of judgment in deriving recommendations, prioritization of recommendations, development group composition, patient-centeredness (including patient preferences and comorbidity), and implementation feasibility (Choudhry et al., 2002; Graham et al., 2000; Grol et al., 2003; Guyatt et al., 2010; Shaneyfelt and Centor, 2009; Sniderman and Furberg, 2009).
The standards for development of trustworthy CPGs delineated in the chapters to follow arose from the committee’s investigation of the evidence bases synthesized above. Full details of the spectrum of research methods supporting the committee’s standards setting are contained in Chapter 1.
REFERENCES

ACCF and AHA (American College of Cardiology Foundation and American Heart Association). 2008. Methodology manual for ACCF/AHA guideline writing committees. In Methodologies and policies from ACCF/AHA Taskforce on Practice Guidelines. ACCF and AHA.
AGREE (Appraisal of Guidelines for Research & Evaluation). 2001. Appraisal of Guidelines for Research & Evaluation (AGREE) Instrument. The AGREE Collaboration. www.agreecollaboration.org (accessed November 10, 2010).
AGREE. 2003. Development and validation of an international appraisal instrument for assessing the quality of clinical practice guidelines: The AGREE project. Quality and Safety in Health Care 12(1):18–23.
AHRQ (Agency for Healthcare Research and Quality). 2008. U.S. Preventive Services Task Force procedure manual. AHRQ Publication No. 08-05118-EF. http://www.ahrq.gov/clinic/uspstf08/methods/procmanual.htm (accessed February 12, 2009).
Als-Nielsen, B., W. Chen, C. Gluud, and L. L. Kjaergard. 2003. Association of funding and conclusions in randomized drug trials: A reflection of treatment effect or adverse events? JAMA 290(7):921–928.
Antman, E. M., and R. J. Gibbons. 2009. Clinical practice guidelines and scientific evidence. JAMA 302(2):143–144; author reply 145–147.
Armstrong, D. 2006. Medical journal spikes article on industry ties. The Wall Street Journal, December 26, B1.
Avorn, J. 2005. Powerful medicines: The benefits, risks, and costs of prescription drugs. Rev. and updated, 1st Vintage Books ed. New York: Vintage Books.
Baumann, M. H., S. Z. Lewis, and D. Gutterman. 2007. ACCP evidence-based guideline development: A successful and transparent approach addressing conflict of interest, funding, and patient-centered recommendations. Chest 132(3):1015–1024.
Beck, C., M. Cody, E. Souder, M. Zhang, and G. W. Small. 2000. Dementia diagnostic guidelines: Methodologies, results, and implementation costs. Journal of the American Geriatrics Society 48(10):1195–1203.
Bottles, K. 2010. Institute for Clinical Systems Improvement (ICSI). Presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, January 11, 2010, Washington, DC.
Boyd, C. 2010. CPGs for people with multimorbidities. Presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, January 11, 2010, Washington, DC.
Boyd, C. M., J. Darer, C. Boult, L. P. Fried, L. Boult, and A. W. Wu. 2005. Clinical practice guidelines and quality of care for older patients with multiple comorbid diseases: Implications for pay for performance. JAMA 294(6):716–724.
Brouwers, M. C., M. E. Kho, G. P. Browman, J. S. Burgers, F. Cluzeau, G. Feder, B. Fervers, I. D. Graham, J. Grimshaw, S. E. Hanna, P. Littlejohns, J. Makarski, and L. Zitzelsberger. 2010. AGREE II: Advancing guideline development, reporting and evaluation in health care. Canadian Medical Association Journal 182(18):E839–842.
Brown, A. 2010. Clinical practice guidelines: Implications for vulnerable patients: Development of geriatric diabetes guidelines. Paper presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, January 11, 2010, Washington, DC.
Burgers, J., R. Grol, N. Klazinga, M. Makela, J. Zaat, and AGREE Collaboration. 2003a. Towards evidence-based clinical practice: An international survey of 18 clinical guideline programs. International Journal for Quality in Health Care 15(1):31–45.
Burgers, J. S., F. A. Cluzeau, S. E. Hanna, C. Hunt, and R. Grol. 2003b. Characteristics of high-quality guidelines: Evaluation of 86 clinical guidelines developed in ten European countries and Canada. International Journal of Technology Assessment in Health Care 19(1):148–157.
Choudhry, N. K., H. T. Stelfox, and A. S. Detsky. 2002. Relationships between authors of clinical practice guidelines and the pharmaceutical industry. JAMA 287(5):612–617.
Cluzeau, F. A., and P. Littlejohns. 1999. Appraising clinical practice guidelines in England and Wales: The development of a methodologic framework and its application to policy. Joint Commission Journal of Quality Improvement 25(10):514–521.
Cluzeau, F. A., P. Littlejohns, J. M. Grimshaw, G. Feder, and S. E. Moran. 1999. Development and application of a generic methodology to assess the quality of clinical guidelines. International Journal for Quality in Health Care 11(1):21–28.
Coates, V. 2010. National Guidelines Clearinghouse/ECRI Institute. Paper presented at the IOM Committee on Developing Standards for Trustworthy Clinical Practice Guidelines meeting, January 11, 2010, Washington, DC.
Connecticut Attorney General’s Office. 2008. Attorney general’s investigation reveals flawed Lyme disease guideline process, IDSA agrees to reassess guidelines, install independent arbiter. Hartford, CT: Connecticut Attorney General’s Office.
Coyne, D. W. 2007a. Influence of industry on renal guideline development. Clinical Journal of the American Society of Nephrology 2(1):3–7.
Coyne, D. W. 2007b. Practice recommendations based on low, very low, and missing evidence. Clinical Journal of the American Society of Nephrology 2(1):11–12.
Drueke, T. B., F. Locatelli, N. Clyne, K. Eckardt, I. C. Macdougall, D. Tsakiris, H. Burger, A. Scherhag, and the CREATE Investigators. 2006. Normalization of hemoglobin level in patients with chronic kidney disease and anemia. New England Journal of Medicine 355(20):2071–2084.
Feder, G., M. Eccles, R. Grol, C. Griffiths, and J. Grimshaw. 1999. Clinical guidelines: Using clinical guidelines. BMJ 318(7185):728–730.
Ferrette, C. 2008. Lyme disease expert defends research. The Journal News, May 6, 1A.
Fervers, B., J. S. Burgers, M. C. Haugh, M. Brouwers, G. Browman, F. Cluzeau, and T. Philip. 2005. Predictors of high quality clinical practice guidelines: Examples in oncology. International Journal for Quality in Health Care 17(2):123–132.
Fochtmann, L. 2010. American Psychiatric Association. Presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, January 11, 2010, Washington, DC.
Goldberg, P. 2008. Pragmatists vs. purists: Colon cancer screening guideline triggers debate. The Cancer Letter 34(7):1–5.
Graham, I. D., L. A. Calder, P. C. Hébert, A. O. Carter, and J. M. Tetroe. 2000. A comparison of clinical practice guideline appraisal instruments. International Journal of Technology Assessment in Health Care 16(4):1024–1038.
Grilli, R., N. Magrini, A. Penna, G. Mura, and A. Liberati. 2000. Practice guidelines developed by specialty societies: The need for a critical appraisal. The Lancet 355(9198):103–106.
Grimshaw, J., M. Eccles, and J. Tetroe. 2004. Implementing clinical guidelines: Current evidence and future implications. The Journal of Continuing Education in the Health Professions 24(Suppl 1):S31–S37.
Grol, R., F. A. Cluzeau, and J. S. Burgers. 2003. Clinical practice guidelines: Towards better quality guidelines and increased international collaboration. British Journal of Cancer 89(Suppl 1): S4–S8.
Grol, R., M. Wensing, and M. Eccles. 2005. Improving patient care: The implementation of change in clinical practice. Oxford: Elsevier Butterworth Heinemann.
Guyatt, G., E. A. Akl, J. Hirsh, C. Kearon, M. Crowther, D. Gutterman, S. Z. Lewis, I. Nathanson, R. Jaeschke, and H. Schünemann. 2010. The vexing problem of guidelines and conflict of interest: A potential solution. Annals of Internal Medicine 152(11):738–741.
Halperin, J. J., E. D. Shapiro, E. Logigian, A. L. Belman, L. Dotevall, G. P. Wormser, L. Krupp, G. Gronseth, and C. T. Bever, Jr. 2007. Practice parameter: Treatment of nervous system Lyme disease (an evidence-based review): Report of the Quality Standards Subcommittee of the American Academy of Neurology. Neurology 69(1):91–102.
Hasenfeld, R., and P. G. Shekelle. 2003. Is the methodological quality of guidelines declining in the U.S.? Comparison of the quality of U.S. Agency for Health Care Policy and Research (AHCPR) guidelines with those published subsequently. Quality and Safety in Health Care 12(6):428–434.
Hayward, R. S. A., M. C. Wilson, S. R. Tunis, E. B. Bass, H. R. Rubin, and R. B. Haynes. 1993. More informative abstracts of articles describing clinical practice guidelines. Annals of Internal Medicine 118(9):731–737.
The House of Commons Health Committee. 2005. The influence of the pharmaceutical industry. Vol. 1. London, U.K.: The Stationery Office Limited. http://www.parliament.the-stationery-office.co.uk/pa/cm200405/cmselect/cmhealth/42/42.pdf (accessed April 8, 2010).
IDSA (Infectious Diseases Society of America). 2008. Agreement ends Lyme disease investigation by Connecticut attorney general. Infectious Diseases Society of America.
IDSA. 2010. News release: Special review panel unanimously upholds Lyme disease treatment guidelines: April 22, 2010. http://www.idsociety.org/Content.aspx?id=16556 (accessed November 1, 2010).
Imperiale, T. F., and D. F. Ransohoff. 2010. Understanding differences in the guidelines for colorectal cancer screening. Gastroenterology 138(5):1642–1647.
Ingelfinger, J. R. 2007. Through the looking glass: Anemia guidelines, vested interests, and distortions. Clinical Journal of the American Society of Nephrology 2(3):415–417.
IOM (Institute of Medicine). 1992. Guidelines for clinical practice: From development to use. Edited by M. J. Field and K. N. Lohr. Washington, DC: National Academy Press.
IOM. 2008. Knowing what works in health care: A roadmap for the nation. Edited by J. Eden, B. Wheatley, B. McNeil and H. Sox. Washington, DC: The National Academies Press.
IOM. 2009. Conflict of interest in medical research, education, and practice. Edited by B. Lo and M. J. Field. Washington, DC: The National Academies Press.
Jacobs, A. 2010. American College of Cardiology and American Heart Association. Presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, January 11, 2010, Washington, DC.
Jacques, L. B. 2010. Centers for Medicare & Medicaid Services (CMS). Presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, January 11, 2010, Washington, DC.
Johnson, L., and R. B. Stricker. 2009. Attorney general forces Infectious Diseases Society of America to redo Lyme guidelines due to flawed development process. Journal of Medical Ethics 35(5):283–288.
Klein, J. O. 2008. Danger ahead: Politics intrude in Infectious Diseases Society of America guideline for Lyme disease. Clinical Infectious Diseases 47(9):1197–1199.
Koster, M. A. 2010. Technology assessment and guidelines unit: Kaiser Permanente Southern California. Paper presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, January 11, 2010, Washington, DC.
Kraemer, J. D., and L. O. Gostin. 2009. Science, politics, and values: The politicization of professional practice guidelines. JAMA 301(6):665–667.
Kryworuchko, J., D. Stacey, N. Bai, and I. Graham. 2009. Twelve years of clinical practice guideline development, dissemination and evaluation in Canada (1994 to 2005). Implementation Science 4(1):49.
Lederer, E. D. 2007. Development of clinical practice guidelines: Are we defining the issues too narrowly? Clinical Journal of the American Society of Nephrology 2(2):207.
Levin, A., and M. Rocco. 2007. KDOQI clinical practice guideline and clinical practice recommendations for anemia in chronic kidney disease: 2007 update of hemoglobin target. American Journal of Kidney Diseases 50(3):471–530.
Levin, B., D. A. Lieberman, B. McFarland, R. A. Smith, D. Brooks, K. S. Andrews, C. Dash, F. M. Giardiello, S. Glick, T. R. Levin, P. Pickhardt, D. K. Rex, A. Thorson, S. J. Winawer, the American Cancer Society Colorectal Cancer Advisory Group, the US Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology Colon Cancer Committee. 2008. Screening and surveillance for the early detection of colorectal cancer and adenomatous polyps, 2008: A joint guideline from the American Cancer Society, the US Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology. CA: A Cancer Journal for Clinicians 58(3):130–160.
Lewis, S. Z. 2010. American College of Chest Physicians. Paper presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, January 11, 2010, Washington, DC.
McAlister, F., S. van Diepen, R. Padwal, J. Johnson, and S. Majumdar. 2007. How evidence-based are the recommendations in evidence-based guidelines? PLoS Medicine 4(8):1325–1332.
McSweegan, E. 2008. Lyme disease and the politics of public advocacy. Clinical Infectious Diseases 47(12):1609–1610.
Milliman Inc. 2009. Milliman care guidelines. http://www.milliman.com/expertise/healthcare/products-tools/milliman-care-guidelines/ (accessed June 9, 2009).
New Zealand Guidelines Group. 2001. Handbook for the preparation of explicit evidence-based clinical practice guidelines. http://www.nzgg.org.nz (accessed August 26, 2009).
NICE (National Institute for Health and Clinical Excellence). 2009. Methods for the development of NICE public health guidance, 2nd ed. London, UK: NICE.
NKF (National Kidney Foundation). 2001. Guidelines for anemia of chronic kidney disease. National Kidney Foundation, Inc. http://www.kidney.org/PROFESSIONALS/kdoqi/guidelines_updates/doqiupan_ii.html (accessed January 21, 2011).
NKF. 2006. KDOQI clinical practice guidelines and clinical practice recommendations for anemia in chronic kidney disease. American Journal of Kidney Diseases 47(5, Suppl 3):S11–S145.
Phrommintikul, A., S. J. Haas, M. Elsik, and H. Krum. 2007. Mortality and target haemoglobin concentrations in anaemic patients with chronic kidney disease treated with erythropoietin: A meta-analysis. The Lancet 369(9559):381–388.
Pignone, M., and H. C. Sox. 2008. Screening guidelines for colorectal cancer: A twice-told tale. Annals of Internal Medicine 149(9):680–682.
Ray-Coquard, I., T. Philip, M. Lehmann, B. Fervers, F. Farsi, and F. Chauvin. 1997. Impact of a clinical guidelines program for breast and colon cancer in a French cancer center. JAMA 278(19):1591–1595.
Remuzzi, G., and J. R. Ingelfinger. 2006. Correction of anemia—payoffs and problems. New England Journal of Medicine 355(20):2144–2146.
Rosenfeld, R., and R. N. Shiffman. 2009. Clinical practice guideline development manual: A quality-driven approach for translating evidence into action. Otolaryngology–Head & Neck Surgery 140(6, Suppl 1):1–43.
Schünemann, H. J., A. Fretheim, and A. D. Oxman. 2006. Improving the use of research evidence in guideline development: Guidelines for guidelines. Health Research Policy and Systems 4:13.
Schünemann, H. J., M. Osborne, J. Moss, C. Manthous, G. Wagner, L. Sicilian, J. Ohar, S. McDermott, L. Lucas, and R. Jaeschke. 2009. An official American Thoracic Society policy statement: Managing conflict of interest in professional societies. American Journal of Respiratory and Critical Care Medicine 180(6):564–580.
Shaneyfelt, T. M., and R. M. Centor. 2009. Reassessment of clinical practice guidelines: Go gently into that good night. JAMA 301(8):868–869.
Shaneyfelt, T., M. Mayo-Smith, and J. Rothwangl. 1999. Are guidelines following guidelines? The methodological quality of clinical practice guidelines in the peer-reviewed medical literature. JAMA 281:1900–1905.
Shekelle, P. G., R. L. Kravitz, J. Beart, M. Marger, M. Wang, and M. Lee. 2000. Are nonspecific practice guidelines potentially harmful? A randomized comparison of the effect of nonspecific versus specific guidelines on physician decision making. Health Services Research 34(7):1429–1448.
Shiffman, R. N., P. Shekelle, J. M. Overhage, J. Slutsky, J. Grimshaw, and A. M. Deshpande. 2003. Standardized reporting of clinical practice guidelines: A proposal from the conference on guideline standardization. Annals of Internal Medicine 139(6):493–498.
SIGN, ed. 2008. SIGN 50: A guideline developer’s handbook. Edinburgh, Scot.: Scottish Intercollegiate Guidelines Network.
Singh, A. K., L. Szczech, K. L. Tang, H. Barnhart, S. Sapp, M. Wolfson, D. Reddan, and the CHOIR Investigators. 2006. Correction of anemia with epoetin alfa in chronic kidney disease. New England Journal of Medicine 355(20):2085–2098.
Smith, T. J., and B. E. Hillner. 2001. Ensuring quality cancer care by the use of clinical practice guidelines and critical pathways. Journal of Clinical Oncology 19(11):2886–2897.
Sniderman, A. D., and C. D. Furberg. 2009. Why guideline-making requires reform. JAMA 301(4):429–431.
Steinbrook, R. 2006. Haemoglobin concentrations in chronic kidney disease. The Lancet 368(9554):2191–2193.
Steinbrook, R. 2007. Medicare and erythropoietin. New England Journal of Medicine 356(1):4–6.
Strippoli, G. F., S. D. Navaneethan, and J. C. Craig. 2006. Haemoglobin and haematocrit targets for the anaemia of chronic kidney disease. Cochrane Database of Systematic Reviews (4):CD003967.
Tregear, M. 2007. Guideline heterogeneity; workshop—part 2. Paper presented at The Fourth Annual Guidelines International Network meeting, August 22–25, 2007, Toronto, Canada.
Turner, T., M. Misso, C. Harris, and S. Green. 2008. Development of evidence-based clinical practice guidelines (CPGs): Comparing approaches. Implementation Science 3:45.
USRDS (United States Renal Data System). 2009. USRDS 2009 annual data report: Atlas of chronic kidney disease and end-stage renal disease in the United States. Bethesda, MD: National Institute of Diabetes and Digestive and Kidney Diseases.
Van Wyck, D., K. U. Eckardt, K. Uhlig, M. Rocco, and A. Levin. 2007. Appraisal of evidence and control of bias in the Kidney Disease Outcomes Quality Initiative guideline development process. Clinical Journal of the American Society of Nephrology 2(1):8–10.
Vlayen, J., B. Aertgeerts, K. Hannes, W. Sermeus, and D. Ramaekers. 2005. A systematic review of appraisal tools for clinical practice guidelines: Multiple similarities and one common deficit. International Journal for Quality in Health Care 17(3):235–242.
Wormser, G. P., R. J. Dattwyler, E. D. Shapiro, J. J. Halperin, A. C. Steere, M. S. Klempner, P. J. Krause, J. S. Bakken, F. Strle, G. Stanek, L. Bockenstedt, D. Fish, J. S. Dumler, and R. B. Nadelman. 2006. The clinical assessment, treatment, and prevention of Lyme disease, human granulocytic anaplasmosis, and babesiosis: Clinical practice guidelines by the Infectious Diseases Society of America. Clinical Infectious Diseases 43(9):1089–1134.