Healthcare decision makers in search of the best evidence to inform clinical decisions have come to rely on systematic reviews (SRs) of comparative effectiveness research (CER) to learn what is known and not known about the potential benefits and harms of alternative drugs, devices, and other healthcare services. An SR is a scientific investigation that focuses on a specific question and uses explicit, prespecified scientific methods to identify, select, assess, and summarize the findings of similar but separate studies. It may include a quantitative synthesis (meta-analysis), depending on the available data. Although the importance of SRs is increasingly appreciated, the quality of published SRs is variable and often poor. In many cases, the reader is unable to judge the quality of an SR because the methods are poorly documented, and even if methods are described, they may be used inappropriately, for example, in meta-analyses. Many reviews fail to assess the quality of the underlying research and also neglect to report funding sources. A plethora of conflicting approaches to evidence hierarchies and grading schemes for bodies of evidence is a further source of confusion.
In the 2008 report, Knowing What Works in Health Care: A Roadmap for the Nation, the Institute of Medicine (IOM) recommended that methodological standards be developed for both SRs and clinical practice guidelines (CPGs). The report was followed by a congressional mandate in the Medicare Improvements for Patients and Providers Act of 2008 for two follow-up IOM studies: one to develop standards for conducting SRs, and the other to develop standards for CPGs. This is the report of the IOM Committee on Standards for Systematic Reviews of Comparative Effectiveness Research. A companion report by the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines is being released in conjunction with this report.
The charge to this IOM committee was twofold: first, to assess potential methodological standards that would assure objective, transparent, and scientifically valid SRs of CER and, second, to recommend a set of methodological standards for developing and reporting such SRs (Box S-1). The boundaries of this study were defined in part by the work of the companion CPG study. The SR committee limited its focus to the development of SRs. At the same time, the CPG committee worked under the assumption that guideline developers have access to and use high-quality SRs (as defined by the standards recommended in this report).
Charge to the Committee on Standards for Systematic Reviews of Comparative Effectiveness Research
An ad hoc committee will conduct a study to recommend methodological standards for systematic reviews (SRs) of comparative effectiveness research (CER) on health and health care. The standards should ensure that the reviews are objective, transparent, and scientifically valid, and require a common language for characterizing the strength of the evidence. Decision makers should be able to rely on SRs of comparative effectiveness to determine what is known and not known and to describe the extent to which the evidence is applicable to clinical practice and particular patients. In this context, the committee will:
This report presents methodological standards for SRs that are designed to inform everyday healthcare decision making, especially for patients, clinicians and other healthcare providers, and developers of CPGs. The focus is on the development and reporting of comprehensive, publicly funded SRs of the comparative effectiveness of therapeutic medical or surgical interventions. The recent health reform legislation underscores the imperative for establishing standards to ensure the highest quality SRs. The Patient Protection and Affordable Care Act of 2010 (ACA) created the nation’s first nonprofit, public–private Patient-Centered Outcomes Research Institute (PCORI). PCORI will be responsible for establishing and implementing a research agenda, including SRs of CER, to help patients, clinicians and other healthcare providers, purchasers, and policy makers make informed healthcare decisions. As this report was being developed, planning for PCORI was underway. An initial task of the newly appointed governing board of the institute is to establish a standing methodology committee charged with developing and improving the science and methods of CER.
The IOM committee undertook its work with the intention of informing the PCORI methodology committee’s own standards development. The IOM committee also views other public sponsors of SRs of CER as key audiences for this report, including the Agency for Healthcare Research and Quality (AHRQ) Effective Health Care Program, the Centers for Medicaid and Medicare Coverage Advisory Committee, the Drug Effectiveness Review Project, the National Institutes of Health (NIH), the Centers for Disease Control and Prevention (CDC), and the U.S. Preventive Services Task Force.
PURPOSE OF SETTING STANDARDS
Organizations establish standards to set performance expectations and to promote accountability for meeting these expectations. For SRs in particular, the principal objective of setting standards is to minimize bias in identifying, selecting, and interpreting evidence. For the purposes of this report, the committee defined an SR “standard” as a process, action, or procedure that is deemed essential to producing scientifically valid, transparent, and reproducible SRs. A standard may be supported by scientific evidence, by a reasonable expectation that the standard helps achieve the anticipated level of quality in an SR, or by the broad acceptance of the practice by authors of SRs.
The evidence base for many of the steps in the SR process is sparse, especially with respect to linking characteristics of SRs to clinical outcomes, the ultimate test of quality. The committee developed its standards and elements of performance based on available research evidence and expert guidance from the AHRQ Effective Health Care Program; the Centre for Reviews and Dissemination (CRD) (University of York, United Kingdom); the Cochrane Collaboration; the Grading of Recommendations Assessment, Development, Evaluation (GRADE) Working Group; and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses group (PRISMA).
The committee faced a difficult task in proposing a set of standards where, in general, the evidence is thin and expert guidance varies. Yet the evidence that is available does not suggest that high-quality SRs can be done quickly and cheaply. SRs conducted with methods prone to bias often reach misleading conclusions, leading to clinical advice that may in the end harm patients. All of the committee’s recommended standards are based on current evidence, expert guidance, and thoughtful reasoning, and are actively used by many experts, and thus are reasonable “best practices” for reducing bias and for increasing the scientific rigor of SRs of CER. However, all of the recommended standards must be considered provisional pending better empirical evidence about their scientific validity, feasibility, efficiency, and ultimate usefulness in medical decision making.
The committee recommends 21 standards with 82 elements of performance, addressing the entire SR process, from the initial steps of formulating the topic, building a review team, and establishing a research protocol, to finding and assessing the individual studies that make up the body of evidence, to producing qualitative and quantitative syntheses of the body of evidence, and, finally, to developing the final SR report. Each standard is articulated in the same format: first, a brief statement of the step in the SR process (e.g., in Chapter 3, Standard 3.1. Conduct a comprehensive systematic search for evidence) followed by a series of elements that are essential components of the standard. These “elements” are steps that should be taken for all publicly funded SRs of CER.
Collectively, the standards and elements present a daunting task. Few, if any, members of the committee have participated in an SR that fully meets all of them. Yet the evidence and experience are strong enough that it is impossible to ignore these standards or hope that one can safely cut corners. The standards will be especially valuable for SRs of high-stakes clinical questions with broad population impact, where the use of public funds to get the right answer justifies careful attention to the rigor with which the SR is conducted. Individuals involved in SRs should be thoughtful about all of the standards and elements, using their best judgment if resources are inadequate to implement all of them, or if some seem inappropriate for the particular task or question at hand. Transparency in reporting the methods actually used and the reasoning behind the choices is among the most important of the standards recommended by the committee.
Initiating the SR Process
The first steps in the SR process define the focus and methods of the SR and influence its ultimate utility for clinical decisions. Current practice falls far short of expert guidance; well-designed, well-executed SRs are the exception. (Note that throughout this report reference to “expert guidance” refers to the published methodological advice of the AHRQ Effective Health Care Program, CRD, and the Cochrane Collaboration.) The committee recommends eight standards for initiating the SR process, minimizing potential bias in the SR’s design and execution. The standards address the creation of the SR team, user and stakeholder input, managing bias and conflict of interest (COI), topic formulation, and development of the SR protocol (Box S-2). The SR team should include individuals with appropriate expertise and perspectives. Creating a mechanism for users and stakeholders—consumers, clinicians, payers, and members of CPG panels—to provide input into the SR process at multiple levels helps to ensure that the SR is focused on real-world healthcare decisions. However, a process should be in place to reduce the risk of bias and COI from stakeholder input and in the SR team. The importance of the review questions and analytic framework in guiding the entire review process demands a rigorous approach to formulating the research questions and analytic framework. Requiring a research protocol that prespecifies the research methods at the outset of the SR process helps to prevent the effects of author bias, allows feedback at an early stage in the SR, and tells readers of the review about protocol changes that occur as the SR develops.
Finding and Assessing Individual Studies
The committee recommends six standards for identifying and assessing the individual studies that make up an SR’s body of evidence, including standards addressing the search process, screening and selecting studies, extracting data, and assessing the quality of individual studies (Box S-3). The objective of the SR search is to identify all the studies (and all the relevant data from the studies) that may pertain to the research question and analytic framework. The search should be systematic, use prespecified search parameters, and access an array of information sources that provide both published and unpublished research reports. Screening and selecting studies should use methods that address the pervasive problems of SR author bias, errors, and inadequate documentation of the study selection process in SRs. Study methods should be reported in sufficient detail so that searches can be replicated and appraised. Quality assurance and control are essential when data are extracted from individual studies from the collected body of evidence. A thorough and thoughtful assessment of the validity and relevance of each eligible study helps ensure scientific rigor and promote transparency.
Standard 2.6 Develop a systematic review protocol
Standard 2.7 Submit the protocol for peer review
Standard 2.8 Make the final protocol publicly available, and add any amendments to the protocol in a timely fashion
Recommended Standards for Finding and Assessing Individual Studies
Standard 3.1 Conduct a comprehensive systematic search for evidence
Standard 3.2 Take action to address potentially biased reporting of research results
Standard 3.3 Screen and select studies
Standard 3.4 Document the search
Standard 3.5 Manage data collection
Standard 3.6 Critically appraise each study
Synthesizing the Body of Evidence
The committee recommends four standards for the qualitative and quantitative synthesis and assessment of an SR’s body of evidence (Box S-4). The qualitative synthesis is an often undervalued component of an SR. Many SRs lack a qualitative synthesis altogether or simply recite the facts about the studies without examining them for patterns or characterizing the strengths and weaknesses of the body of evidence as a whole. If the SR is to be comprehensible, it should use consistent language to describe the quality of evidence for each outcome and incorporate multiple dimensions of study quality. For readers to have a clear understanding of how the evidence applies to real-world clinical circumstances and specific patient populations, SRs should describe, in easy-to-understand language, the clinical and methodological characteristics of the individual studies, including their strengths and weaknesses and their relevance to particular populations and clinical settings. They should also describe how flaws in the design or execution of the individual studies could bias the results. The qualitative synthesis is more than a narrative description or set of tables that simply detail how many studies were assessed, the reasons for excluding other studies, the range of study sizes and treatments compared, or the quality scores of each study as measured by a risk of bias tool.
Recommended Standards for Synthesizing the Body of Evidence
Standard 4.1 Use a prespecified method to evaluate the body of evidence
Standard 4.2 Conduct a qualitative synthesis
Standard 4.3 Decide if, in addition to a qualitative analysis, the systematic review will include a quantitative analysis (meta-analysis)
Standard 4.4 If conducting a meta-analysis, then do the following:
NOTE: The order of the standards does not indicate the sequence in which they are carried out.
Meta-analysis is the statistical combination of results from multiple individual studies. Many published meta-analyses have combined the results of studies that differ greatly from one another. The assumption that a meta-analysis is an appropriate step in an SR should never be made. The decision to conduct a meta-analysis is neither purely analytical nor statistical in nature. It will depend on a number of factors, such as the availability of suitable data and the likelihood that the analysis could inform clinical decision making. Ultimately, authors should make this subjective judgment in consultation with the entire SR team, including both clinical and methodological perspectives. If appropriate, the meta-analysis can provide reproducible summaries of the individual study results and offer valuable insights into the patterns in the study results. A strong meta-analysis features and clearly describes its subjective components, scrutinizes the individual studies for sources of heterogeneity, and tests the sensitivity of the findings to changes in the assumptions, the set of included studies, the outcome metrics, and the statistical models.
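As an illustration only, the statistical combination and sensitivity testing described above can be sketched in a few lines of Python. The report does not prescribe a particular model; this sketch assumes a simple inverse-variance fixed-effect model, uses Cochran's Q and the I-squared statistic as one common way to examine heterogeneity, and reruns the pooling with each study left out as one possible sensitivity check. The function names and the input numbers are hypothetical and are not taken from the report.

```python
import math


def pool_fixed_effect(effects, variances):
    """Pool study effects with inverse-variance weights (fixed-effect model).

    effects:   per-study effect estimates (e.g., log odds ratios)
    variances: per-study sampling variances of those estimates
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if df > 0 and q > 0 else 0.0
    return {"pooled": pooled, "se": se, "q": q, "i2": i2}


def leave_one_out(effects, variances):
    """Re-pool with each study removed in turn (a simple sensitivity check)."""
    pooled_estimates = []
    for i in range(len(effects)):
        remaining_effects = effects[:i] + effects[i + 1:]
        remaining_variances = variances[:i] + variances[i + 1:]
        result = pool_fixed_effect(remaining_effects, remaining_variances)
        pooled_estimates.append(result["pooled"])
    return pooled_estimates


if __name__ == "__main__":
    # Hypothetical inputs for illustration; not data from any real review.
    example_effects = [0.10, 0.35, 0.22, 0.60]
    example_variances = [0.02, 0.05, 0.01, 0.08]
    print(pool_fixed_effect(example_effects, example_variances))
    print(leave_one_out(example_effects, example_variances))
```

A large I-squared value or a pooled estimate that shifts markedly when one study is removed would be a signal to revisit whether pooling is appropriate at all, which remains a judgment for the full SR team rather than an output of the code.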
The Final Report
Authors of all publicly sponsored SRs should produce a detailed final report. The committee recommends three standards for producing the SR final report: (1) documenting the SR process; (2) responding to input from peer reviewers, users, and stakeholders; and (3) making the final report publicly available (Box S-5). The committee’s standards for documenting the SR process drew heavily on the PRISMA checklist. The committee recommends adding items to the PRISMA checklist to ensure that the report of an SR describes all of the steps and judgments required by the committee’s standards (Boxes S-2, S-3, and S-4).
The evidence base supporting many elements of SRs is incomplete and, for some steps, nonexistent. Research organizations such as the AHRQ Effective Health Care Program, CRD, and the Cochrane Collaboration have published standards, but none of these are universally accepted and consistently applied during planning, conducting, reporting, and peer review of SRs. Furthermore, the SR enterprise in the United States lacks both adequate funding and coordination; many organizations conduct SRs, but do not typically work together. Thus, the committee concludes that improving the quality of SRs will require not only improving the science supporting the steps in the SR process (Boxes S-2, S-3, and S-4), but also providing a more supportive environment for the conduct of SRs. The committee proposes a framework for improving the quality of the science underpinning SRs and supporting the environment for SRs. The framework has several broad categories: strategies for involving the right people, methods for conducting reviews, methods for synthesizing and evaluating evidence, and methods for communicating and using results.
The standards and elements form the core of the committee’s conclusions, but the standards themselves do not indicate how the standards should be implemented, nor do the standards address issues of improving the science for SRs or for improving the environment that supports the development and use of an SR enterprise. In consequence, the committee makes the following two recommendations:
Recommendation 1: Sponsors of SRs of CER should adopt appropriate standards for the design, conduct, and reporting of SRs and require adherence to the standards as a condition for funding.
SRs of CER in the United States are now commissioned and conducted by a vast array of private and public entities, some supported generously with adequate funding to meet the most exacting standards, others supported less generously so that the authors must make compromises at every step of the review. The committee recognizes that its standards and elements are at the “exacting” end of the continuum, some of which are within the control of the review team whereas others are contingent on the SR sponsor’s compliance. However, high-quality reviews require adequate time and resources to reach reliable conclusions. The recommended standards are an appropriate starting point for publicly funded reviews in the United States (including PCORI, federal, state, and local funders) because of the heightened attention and potential clinical impact of major reviews sponsored by public agencies. The committee also recognizes that authors of SRs supported by public funds derived from nonfederal sources (e.g., state public health agencies) will see these standards as an aspirational goal rather than as a minimum requirement. SRs that significantly deviate from the standards should clearly explain and justify the use of different methods.
Recommended Standards for Reporting Systematic Reviews
Standard 5.1 Prepare final report using a structured format
Standard 5.2 Peer review the draft report
Standard 5.3 Publish the final report in a manner that ensures free public access
Recommendation 2: The Patient-Centered Outcomes Research Institute and the Department of Health and Human Services (HHS) agencies (directed by the secretary of HHS) should collaborate to improve the science and environment for SRs of CER. Primary goals of this collaboration should include
Developing training programs for researchers, users, consumers, and other stakeholders to encourage more effective and inclusive contributions to SRs of CER;
Systematically supporting research that advances the methods for designing and conducting SRs of CER;
Supporting research to improve the communication and use of SRs of CER in clinical decision making;
Developing effective coordination and collaboration between U.S. and international partners;
Developing a process to ensure that standards for SRs of CER are regularly updated to reflect current best practice; and
Using SRs to inform priorities and methods for primary CER.
This recommendation conveys the committee’s view of how best to implement its recommendations to improve the science and support the environment for SRs of comparative effectiveness research, which is clearly in the public’s interest. PCORI is specifically named because of its statutory mandate to establish and carry out a CER research agenda. As noted above, it is charged with creating a methodology committee that will work to develop and improve the science and methods of SRs of CER and to regularly update such standards. PCORI is also required to assist the Comptroller General in reviewing and reporting on compliance with its research standards, the methods used to disseminate research findings, the types of training conducted and supported in CER, and the extent to which CER research findings are used by healthcare decision makers. The HHS agencies are specifically named because AHRQ, NIH, CDC, and other sections of HHS are major funders and producers of SRs. In particular, the AHRQ Evidence-based Practice Center (EPC) program has been actively engaged in coordinating high-quality SRs and in developing SR methodology. The committee assigns these groups responsibility and accountability for coordinating and moving the agenda ahead.
The committee found compelling evidence that having high-quality SRs based on rigorous standards is a topic of international concern, and that individual colleagues, professional organizations, and publicly funded agencies in other countries make up a large proportion of the world’s expertise on the topic. Nonetheless, the committee followed the U.S. law that brought this report into being, which suggests a management approach appropriate to the U.S. environment. A successful implementation of the final recommendation should result in an enterprise in the United States that participates fully and harmonizes with the international development of SRs, serving in some cases in a primary role, in others as a facilitator, and in yet others as a participant. The new enterprise should recognize that this cannot be entirely scripted and managed in advance—structures and processes must allow for innovation to arise naturally from those individuals and organizations in the United States already fully engaged in the topic.