Suggested Citation:"5 Checklists and Guidelines." National Academies of Sciences, Engineering, and Medicine. 2020. Enhancing Scientific Reproducibility in Biomedical Research Through Transparent Reporting: Proceedings of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/25627.


5 Checklists and Guidelines

Highlights of Key Points Made by Individual Speakers

• Checklists can improve transparent reporting and impact research practice, but endorsement by journals is insufficient. Checklists should be mandatory and compliance must be monitored even though this approach adds burden on authors, and is resource intensive for journals to implement. (Macleod, Swaminathan)
• Not all checklist items are relevant for all conditions, and there is often a lack of agreement by checklist assessors when evaluating compliance of a manuscript. (Macleod)
• Checklist items should be prioritized and pilot tested to determine whether they are meaningful for the end users. (Goodman, Silberberg)
• Although transparent reporting occurs at the end of the research process, there is a need to improve the rigor of research from the start. (Coller, Swaminathan)
• Few institutions provide formal training in the design and conduct of research. Developing a free, comprehensive, modular, adaptable, and upgradable educational resource would eliminate the need for institutions to invest time and resources in creating their own. (Silberberg)

PREPUBLICATION COPY—Uncorrected Proofs

• Grassroots initiatives and communities of champions are needed to support culture change by all stakeholders. (Coller, Silberberg)
• Evaluating research quality initiatives is important to show that an intervention is achieving the intended outcome. (Macleod)

Among the best practices discussed during the second panel session were guidelines and checklists (see Chapter 4). In this session, panelists delved further into the practical application and effectiveness of guidelines and checklists for enhancing transparent reporting of biomedical research (see Box 5-1 for the corresponding workshop session objectives).

Sowmya Swaminathan, head of editorial policy and research integrity at Nature Research, and Malcolm Macleod discussed the impacts of several current checklists and provided an overview of the Minimum Standards Working Group’s development and pilot testing of the materials, design, analysis, and reporting (MDAR) framework and checklist. Shai Silberberg discussed the uptake and effectiveness of checklists and strategies for improving adherence. The session was moderated by Barry Coller, physician in chief, vice president for medical affairs, and David Rockefeller Professor at The Rockefeller University.

BOX 5-1
Workshop Session Objectives

• Discuss journal and funder assessments of researchers’ adherence to transparent reporting guidelines, including discussion of the effectiveness of checklists.
  ° Highlight empirical assessments of checklist application from funders, journals, and researchers; and
  ° Consider practical application and effectiveness of checklists and guidelines to encourage or require transparent reporting of preclinical biomedical research.
• Discuss how funders could require thoughtful discussion in grant applications of how uncertainties will be evaluated, along with any relevant issues regarding replicability and computational reproducibility (Reproducibility and Replicability in Science Recommendation 6-9).
• Discuss how journals and scientific societies could disclose their policies relevant to achieving reproducibility and replicability, and how journals could be encouraged to set and implement desired standards of reproducibility and replicability and adopt policies to reduce the likelihood of non-replicability (Reproducibility and Replicability in Science Recommendation 6-7).

SOURCE: Workshop agenda (available in Appendix C), September 25, 2019.

To open the panel session, Coller described an early example of the successful use of a checklist from The Checklist Manifesto by Atul Gawande (2009). In 1935, the U.S. Army held a competition to award a contract for production of a new long-range bomber. Boeing’s entry, the Model 299 (later designated the B-17), was superior to the other entries in key areas such as design, payload capacity, and performance. However, during an evaluation flight, the anticipated winner of the contract climbed, stalled, and crashed, killing the pilot and a crewmember. The investigation concluded that the crash was the result of “pilot error due to an unprecedented complexity” of the plane, Coller said. The experienced pilot had failed to release the lock on the elevator and rudder controls, and it was said at the time that the Model 299 was “too much plane for one man to fly,” Coller relayed. Boeing lost the contract and came close to bankruptcy.

Still interested in the technology, the Army purchased several Model 299s and worked with test pilots to improve safety. As the original test pilot was highly trained, it was concluded that additional training was not the answer. The solution they reached, Coller said, was to create a concise, step-by-step checklist for takeoff, landing, and taxiing that would fit on an index card. Ultimately, nearly 13,000 B-17 bombers were built, and pilots logged 1.8 million miles without any further accidents.

Coller listed several of the lessons learned about flying a B-17 safely in 1935 and adapted them to performing and reporting science in 2019. First, he said the alignment of incentives for flying the plane safely is absolute because not flying safely can result in death. For performing and reporting science, the alignment of incentives is “more nuanced and subtle.” Coller described the complexity of flying a plane safely in 1935 as analogue (e.g., dials, binary switches), while science today exists in the digital world. Although pilots needed to make many decisions to fly the B-17 safely, the number of decisions was finite; however, he said the number of decisions involved in performing and reporting science today is “virtually infinite.” A B-17 pilot’s dependence on others involved a limited team, while performing and reporting science depends on a greatly expanded universe of others. Finally, Coller said, the dependence on “black boxes” by pilots in 1935 was finite and he noted they could actually “kick the tires.” In science, what happens in the black boxes can be vital, and it is increasingly difficult to know the quality (e.g., an error in one line of code in one algorithm can have far-reaching effects if that algorithm is used widely).¹

¹ A black box in the context of the sciences refers to part of a process or pathway between the inputs and the outputs for which the mechanisms are unknown or are not well understood by the user.

CHECKLIST IMPLEMENTATION BY LIFE SCIENCE JOURNALS: TOWARD MINIMUM REPORTING STANDARDS FOR RESEARCH

Sowmya Swaminathan, Head, Editorial Policy and Research Integrity, Nature Research
Malcolm Macleod, Professor of Neurology and Translational Neuroscience, University of Edinburgh

Swaminathan and Macleod described several examples of checklist initiatives leading up to the creation of the Minimum Standards Working Group, a group of journal editors and experts on reproducibility that has developed minimum standards for reporting in life sciences.²

As background, Macleod shared an example of how poor preclinical study quality can lead to bias in published animal studies, resulting in serious implications for translation to clinical trials. The neuroprotective drug, NXY-059, was shown to be efficacious in animal studies, but the drug was ineffective in a large clinical trial. A systematic review of the published animal studies revealed that, although the overall animal data supported the efficacy conclusion, the majority of the studies did not report randomization, blinded conduct of the experiment, and blinded outcome assessment. The few studies that were of high quality (randomized and blinded) reported significantly lower treatment efficacy (Macleod et al., 2008).

To understand the scale of the problem, Macleod and colleagues assessed the publications included in the Research Assessment Exercise, which evaluated the quality of research at five leading UK institutions. More than 1,000 publications involving animal research were assessed for their reporting of the four key items recommended by Landis and colleagues as the minimum necessary for transparent reporting: blinding, reporting inclusion and exclusion criteria, randomization, and sample size calculation (Landis et al., 2012). Macleod found that less than 20 percent reported blinding, 10 percent reported inclusion and exclusion criteria, 15 percent of the papers reported randomization, and 2 percent reported power calculations. Overall, he said, 68 percent of the papers assessed reported none of these elements, and one paper reported doing all of them. These types of examples have led to a range of initiatives to improve research, including various guidelines and checklists.

² The formation of the working group is described here: https://osf.io/preprints/metaarxiv/9sm4x (accessed December 14, 2019).

Checklists as a Solution

Nature Journals Reporting Checklist for Life Science Papers

In 2013, Nature Research announced the implementation of measures to improve the reporting of life science research published in its journals.³ A key component of this initiative, Swaminathan said, was the development of a reporting checklist that authors are now required to include with their manuscript submission.⁴ The checklist helps to facilitate more complete reporting of study details, establishes expectations for reporting of statistics, and provides journal policies on the sharing of data and code. The author’s completed checklist is provided to the peer reviewers and journal editors who monitor compliance.

A 2017 survey of authors who had published in Nature journals found that 83 percent of respondents said “the checklist had significantly improved reporting of statistics within papers published in Nature journals,” Swaminathan said. Respondents also perceived improved reporting of reagents and animal models, and increased data deposition in public repositories (see Figure 5-1, panel A). Although the primary goal of implementing the checklist was to improve reporting quality in published papers, Swaminathan said that it was hoped that it might also raise awareness and impact research practice. In this regard, 78 percent of respondents said they continue to use the checklist to some extent in their own work, regardless of planned journal submission (see Figure 5-1, panel B).⁵

FIGURE 5-1 Impact of checklist on published papers and research practice.
SOURCES: Swaminathan presentation, September 25, 2019, from the 2017 survey of published Nature journal authors (Nature Research, 2018, and footnote 5).

³ See https://www.nature.com/news/polopoly_fs/1.12852!/menu/main/topColumns/topLeftColumn/pdf/496398a.pdf (accessed November 20, 2019).
⁴ Available at https://media.nature.com/full/nature-assets/ncomms/authors/ncomms_lifesciences_checklist.pdf (accessed November 20, 2019).
⁵ For complete data and related materials, see https://figshare.com/articles/Nature_Reproducibility_survey_2017/6139937 (accessed November 20, 2019).

Summarizing the experience with the Nature Research journals checklist, Swaminathan said the use of checklists can improve reporting standards and impact research practice, but she emphasized that checklists need to be mandatory and compliance must be monitored. She acknowledged that mandates pose additional burdens for authors, and monitoring compliance is resource intensive. In addition, researchers must contend with a wide diversity of policies from their institutions, funders, and publishers. Journals are “at the end of the process,” she said, and achieving a broad shift in research practice will require initiatives targeting the beginning, within laboratories and academic institutions.

Nature Publishing Group Quality in Publication (NPQIP) Study

Another study, described by Macleod, assessed the impact of the Nature Research reporting checklist for life science papers. The study evaluated reporting quality in published papers that had been submitted after the policy requiring checklist completion was implemented by Nature, compared with reporting quality in publications that had been submitted to Nature journals before policy implementation, and also to similar papers published in other (non-Nature) journals. Macleod reported that there were substantial increases in reporting of all four of the items identified by Landis (blinding, reporting inclusions and exclusions, randomization, and sample size calculation) after the requirement for checklist submission was implemented by Nature Publishing Group (NPQIP Collaborative Group, 2019). NPQIP demonstrates that “a checklist, on its own, is not enough,” Macleod said.
Minimum Standards Working Group

The Minimum Standards Working Group includes editors and experts in reproducibility from Nature Research, the Public Library of Science, Science/American Association for the Advancement of Science, Cell Press, eLife, Wiley, the Center for Open Science, and the University of Edinburgh. The aim of the working group was to “improve transparency and reproducibility by defining minimum reporting standards in life sciences,” which Swaminathan said includes biological, biomedical, and preclinical research. She added that the working group, assembled in 2017, was inspired by the success of the International Committee of Medical Journal Editors in influencing clinical trial reporting and the impact of the Consolidated Standards of Reporting Trials (CONSORT) checklist.

The working group consulted with external experts and stakeholders and referenced existing journal checklists and policy frameworks (including the Nature Research checklist, EQUATOR Network guidelines, and TOP guidelines mentioned in Appendix B, and others). The work was also informed by meta-research on the implementation of checklists and the National Academies consensus study reports Reproducibility and Replicability in Science (NASEM, 2019) and Open Science by Design (NASEM, 2018).

The working group issued the following three key outputs:

• A minimum standards framework, which establishes minimum expectations of transparency across the core areas of materials, design, analysis, and reporting;
• A minimum standards checklist, which is an implementation tool to facilitate compliance with the framework; and
• An elaboration document, which provides context for the minimum standards framework and guidance for using the checklist.

The three documents have been publicly released as the MDAR Framework, the MDAR Checklist for Authors, and the MDAR Framework and Checklist Elaboration Document, and Swaminathan encouraged participants to provide feedback.⁶ The target audiences for the deliverables are journals and publishing platforms, as well as research institutions, funders, and other stakeholders, she said. The framework and checklist are broadly applicable across the research life cycle, from study design and grant submission through to manuscript submission, peer review, and publication, and are also intended as a teaching tool.
MDAR Framework Elements

Swaminathan elaborated on the four reporting categories of the MDAR framework, listing the key elements that the working group identified for each:

• “Materials: biological reagents, lab animals, model organisms, animals in the field, unique specimens
• Design: study/experimental design, protocols, statistics, methodologies, dual-use research consent
• Analysis: data, code, statistics as relevant to analysis
• Reporting: discipline-specific guidelines and standards.”

The framework also discusses two levels of reporting, the “minimum” required level and a recommended “best practice” level, both of which include information on the accessibility and unambiguous identification of the elements reported. By adopting the MDAR framework, Swaminathan said, a stakeholder is committing to incorporate the minimum standards into their policies.

⁶ The three key outputs of the Minimum Standards Working Group are available at https://osf.io/xfpn4, https://osf.io/bj3mu (accessed November 20, 2019), and https://osf.io/xzy4s, respectively (accessed November 20, 2019).

MDAR Checklist Pilot Testing

Author and Editor Perceptions Survey

The first objectives of the MDAR pilot test were to collect authors’ and editors’ perceptions of the checklist (e.g., usefulness, accessibility, missing elements, impact on manuscript processing times). Surveys were done of editors from 13 journals,⁷ and of 211 authors completing checklists (see Figure 5-2). Swaminathan summarized that the majority of authors found the checklist tool to be helpful, with 44 percent of authors responding “very helpful” and 36 percent responding “somewhat helpful.”⁸ The majority of editors also found the MDAR checklist helpful, with 31 percent of editors responding “very helpful” and 53 percent responding “somewhat helpful.”

FIGURE 5-2 Editor and author experiences with the MDAR checklist.
NOTE: MDAR = materials, design, analysis, and reporting.
SOURCES: Swaminathan presentation, September 25, 2019. This figure is taken from the presentation, “Summary results of author and editor responses. MDAR working group, September 2019,” available at https://osf.io/znq64 (accessed December 14, 2019).

⁷ BMC Microbiology, Ecology & Evolution, eLife, EMBO journals, Epigenetics, F1000R, Molecular Cancer Therapeutics, Microbiology Open, PeerJ, PLOS Biology, PNAS, Science, Scientific Reports.
⁸ Complete data from the author and editor surveys are available at https://osf.io/gqsmp (accessed November 20, 2019).

Evaluation of the MDAR Checklist Experience

Macleod described another evaluation of 289 manuscripts submitted to the same 13 journals. On average, editors spent 24 minutes assessing checklist performance. Only 15 of the 42 items on the checklist were relevant for more than 50 percent of the manuscripts. Macleod noted that this is not unexpected for an overarching guideline. For example, he said, a checklist item about plant-based research would not be relevant to studies that do not involve plants.

Editors assessed the relevance of checklist items in the areas of materials, design, analysis, and reporting, and the extent to which they believed authors had complied. Macleod observed that there were many areas where assessors believed an item was highly relevant but determined that few authors were reporting it, and, conversely, areas where items were deemed irrelevant but were highly reported.

Eighty-nine of the manuscripts were evaluated by two independent assessors, and agreement was determined using Kappa statistics (which, Macleod explained, subtract chance agreement). Assessors agreed on the relevance of some checklist items, but for others “the agreement was not much better than [it] is by chance alone,” Macleod said. Similarly, for checklist items that both assessors agreed were relevant, there was not necessarily agreement on whether the manuscript had met the guideline criteria for reporting of the item. He noted that the confidence intervals for the Kappa scores were wide.
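As an illustration of what "subtracting chance agreement" means, the following is a minimal sketch that assumes the two-rater form known as Cohen's kappa; the proceedings do not state which Kappa variant the evaluation used. The function name `cohen_kappa` and the ten ratings are hypothetical and are not data from the MDAR pilot.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (illustrative only)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: probability that both raters choose the same label if
    # each labels independently at their own observed rates.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings: two assessors judging whether one checklist item is
# relevant ("R") or not relevant ("N") for ten manuscripts.
assessor_1 = ["R", "R", "R", "R", "R", "N", "N", "N", "N", "N"]
assessor_2 = ["R", "R", "R", "R", "N", "R", "N", "N", "N", "N"]
kappa = cohen_kappa(assessor_1, assessor_2)
print(round(kappa, 2))  # prints 0.6
```

In this made-up example the assessors agree on 8 of 10 manuscripts, but half of that agreement would be expected by chance, so the statistic is 0.6 rather than 0.8. With few manuscripts per item, a small number of disagreements shifts the value substantially, which is consistent with the wide confidence intervals noted above.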
MDAR Experience Summary and Next Steps

Macleod summarized that “authors and editors seem to like the checklist and find it useful,” and “the time taken to check performance is short.” He suggested that spending more time per checklist might have resulted in greater agreement among the dual assessors. Not all checklist items are relevant at all times, and a “dynamic checklist” that offers fields relevant for a particular journal, for example, might be useful. Agreement between assessors was limited, and confidence intervals were wide, but the areas of disagreement should be highlighted as areas to focus on with regard to clarity of the checklist item and the information provided in the elaboration and explanation document.

The next steps in the development of the MDAR checklist, Macleod and Swaminathan said, will be a consultation with key stakeholders and interested individuals, and revisions of the checklist and supporting materials based on the feedback.

APPROACHES TO IMPROVE ADHERENCE TO CHECKLISTS AND GUIDELINES⁹

Shai Silberberg, Director of Research Quality, National Institute of Neurological Disorders and Stroke

By definition, Silberberg said, a guideline is “a general rule, principle, or piece of advice.” In other words, a guideline is a suggestion to be taken into consideration for the longer term, he said, and guidelines are not generally effective at changing behavior. A checklist, by definition, is “a list of items required, things to be done, or points to be considered, used as a reminder.” Checklists are for the specific task at hand, he said.

Less Is More

A systematic review published in 2012 compared the completeness of reporting of randomized controlled trials in journals that had endorsed CONSORT versus those that had not (Turner et al., 2012). The most significant difference found was for the checklist item allocation concealment, which was reported adequately in 45 percent of the trials in the journals that endorsed CONSORT versus 22 percent of the trials in non-endorsing journals. While doubling the reporting rate is an impressive improvement, Silberberg pointed out that more than 50 percent of the trials published in the CONSORT-endorsing journals did not follow the guidelines. He suggested that the length of the checklist is a contributing factor to this noncompliance. Allocation concealment is very important for reducing selection bias, but it is just one of 38 items on the 2010 CONSORT. When faced with a long checklist, Silberberg said, important items can be overlooked.

“Less can be more,” Silberberg said, and he discussed the need to stage priorities. The CONSORT checklist was implemented more than two decades ago, and yet complete reporting of trial information is still lacking. He suggested a more concise checklist, “a minimum set of items that are the most crucial not to ignore.” After researchers are trained in the highest priority elements and have adopted the desired behaviors, they should then implement the next set of priority items, and so forth, continuing to stage introduction over the longer term, he suggested.

⁹ Silberberg stated that the opinions expressed in his presentations are his own and are not official opinions of the National Institutes of Health.

As recommended by Landis and colleagues, “At a minimum, studies should report on sample size estimation, whether and how animals were randomized, whether investigators were blind to the treatment, and the handling of data” (Landis et al., 2012, p. 187). Silberberg said there is a high risk of bias associated with these items and they form the foundation of a rigorous study. He emphasized that, if the data are of poor quality, then all that follows (analysis, reporting) is of little value.

Shared Responsibility for Cultural Change

Silberberg noted that the Landis publication on transparent reporting was an output of a 2012 National Institute of Neurological Disorders and Stroke (NINDS) stakeholder workshop on improving the design and reporting of animal studies (Landis et al., 2012). Another conclusion of the 2012 NINDS workshop, he said, was that all stakeholders share responsibility. He observed that the discussions at this National Academies workshop have also emphasized the roles of all stakeholders, and he encouraged participants to ask themselves what they can contribute to creating change and promoting transparent reporting.

The need for change in the research culture has been raised throughout this National Academies workshop, Silberberg said, and changing the culture requires education and a change to the incentive structure. NINDS convened a workshop in 2018 to evaluate the extent to which scientists receive formal training in the design and conduct of research.¹⁰ A survey of 41 institutions with NINDS-funded training grants found that only 5 offered a full-length course on the principles of rigorous research, and Silberberg said that all of the elements of rigorous research cannot be covered in one course. Other institutions provided lectures (17) or mini-courses (2), but 12 provided no formal training (and 5 did not respond to the survey).

In considering why so few institutions provide formal training in research principles, Silberberg suggested that building an educational program “from scratch” is difficult and requires a significant investment of time, knowledge and expertise, motivation, and funding. He proposed the development of a free educational resource that institutions could use for their own training programs, eliminating the need to invest energy and resources in creating their own. The resource would be comprehensive, modular, adaptable, and upgradable.

¹⁰ See https://www.ninds.nih.gov/News-Events/Events-Proceedings/Events/Visionary-Resource-Instilling-Fundamental-Principles-Rigorous (accessed November 20, 2019).

Creating Communities of Champions

Silberberg summarized the recommendations from the 2018 NINDS workshop. First, “an effective educational platform should target all career stages.” Silberberg noted that the first panel of this National Academies workshop emphasized the importance of targeting trainees and early career scientists in particular. He added that many senior investigators also need training on the principles of rigorous research.

Next, a culture change is needed at all levels of academic, publishing, and funding organizations. This includes a change to the incentive structure.

Third, “academic institutions need to play a proactive role in changing the culture,” Silberberg said. He noted that institutions are often missing from the discussions of research culture, and tenure, promotion, and hiring committees within institutions play a central role in the research culture.

Attendees at the 2018 NINDS workshop suggested that to achieve these goals, a “grassroots effort” is needed, Silberberg said. They called for “the establishment of communities of champions within and across institutions to share resources, change culture, and support better training at all academic levels.”

Each stakeholder organization has champions for culture change, Silberberg said. To bring them together, NINDS has created a mechanism on its website for champions of rigorous research practices to self-identify and connect with others in their institution.[11] NINDS is currently considering how best to support interactions of these communities of champions with others regionally, nationally, and even globally.

In closing, Silberberg shared an example of how communities of champions can foster improved transparency of presentations at scientific meetings (Silberberg et al., 2017). A short conference talk or poster does not generally allow for sufficient depth of information for attendees to have a sense of the rigor of the work. One approach to increasing transparency, Silberberg said, is to provide more detail in figures (e.g., individual datapoints, total number of samples). Another approach, he said, is to add symbols or “rigor emojis” to the figures to indicate that the study was, for example, randomized, blinded, or confirmatory (see Figure 5-3).

[11] See https://www.ninds.nih.gov/Current-Research/Trans-Agency-Activities/Rigor-Transparency/RigorChampionsAndResources (accessed November 20, 2019).

FIGURE 5-3 Proposed strategy to improve transparent reporting in conference posters.
SOURCES: Silberberg presentation, September 24, 2019; Silberberg et al., 2017.

DISCUSSION

Motivating Action: Champions for Culture Change

During the discussion, panelists expanded on the topic of the need for champions of culture change, including the need for grassroots efforts and what motivates stakeholders to take action.

Institutional Leadership

Coller observed that, although the focus of the workshop is transparent reporting, it has been noted throughout the discussions that reporting is the end of the process, and there should be more attention to improving the rigor of research from the start. He asked panelists what institutional leaders should be doing to promote rigorous science. He supported the concept of a community of champions, as discussed by Silberberg, and proposed the creation of a “research integrity advocate” that would be comparable to the research participant advocate position established at NIH-funded clinical research centers. The research integrity advocate could be the institutional champion tasked with promoting culture change, he said.

Macleod noted that many of the institutions that have established practices to promote a rigorous research culture have been highly motivated by the need to address an incident of research malpractice. He suggested that focusing efforts primarily on addressing misconduct and preventing falsification, fabrication, and plagiarism is the wrong approach, and institutions should instead emphasize improving research performance broadly for the benefit of all.

Silberberg countered the comment that reporting is the end of the process. Science is a continuous cycle, he said, and journals might be at the end of one round, but they are the beginning of the next round in that publications are then used to justify the next grant proposal or the next research plan. He agreed that journals are not responsible for enforcing rigorous science, but they have a role to play in increasing transparency. As an example, he mentioned the checklist required by Nature journals that allows peer reviewers to better assess the rigor of the study being reported. He suggested that major stakeholders are often hesitant to be the first to take action because of the potential financial ramifications. For example, university leadership will continue to push investigators to publish frequently in high-profile journals, potentially at the expense of rigor, because that is what is valued and rewarded by grant reviewers. He reiterated the need for champions and grassroots initiatives to push for change by all the major stakeholders.

Grassroots Stakeholder Efforts

As an example of a grassroots approach to championing rigorous science, Macleod mentioned the UK Reproducibility Network. The network includes self-organized local groups of early career researchers who connect for mentoring and journal clubs that promote openness and reproducibility; stakeholders (e.g., journals, funders); and academic institutions. Macleod added that academic institutions seeking to join must formally commit to promoting rigorous research and must appoint an Academic Lead for Research Improvement, which is a senior-level position. Joining the UK Reproducibility Network, he said, provides a mechanism for local communities of early career researchers and their institutions to commit to creating an improved research culture without waiting to be motivated by a public misconduct scandal.

Kelly Dunham, senior manager for strategic initiatives at the Patient-Centered Outcomes Research Institute (PCORI), described the Ensuring Value in Research Funders Forum as an example of a grassroots stakeholder effort.[12] The forum, established in 2016, includes about 40 funding organizations and has issued a consensus statement and guiding principles intended to maximize the probability of impact of the research funded, she explained. The forum is working to “characterize what good practice looks like,” she said, and to develop minimum standards and recommendations for the larger biomedical research ecosystem. She said there is occasional pushback from some funders that a particular approach is not practicable, and she emphasized the importance of sharing examples of successes and learning from each other. Dunham said funders can play a role in ensuring the results of the studies they fund are made publicly available, and PCORI has taken on the responsibility of providing transparent and well-documented results to the public. PCORI has a process of peer review for all of its funded research and publicly releases lay summaries and clinical summaries of studies.

Cindy Sheffield, project manager of the Alzheimer’s Disease Preclinical Efficacy Database (AlzPED) for NIH, referred participants to AlzPED, which she said currently includes about 900 articles “that have been evaluated for 24 elements of experimental design.”[13] A goal of AlzPED is to create awareness and work toward changing the culture. She said that although they cannot evaluate rigor and transparency quantitatively, they do check and record in the database whether the elements of design are reported or not.

The Role of the Investigator

Swaminathan observed that stakeholders are increasingly aware of the problem of irreproducibility and seem interested in taking action. She noted that a survey of researchers found that they believe researchers are responsible for addressing issues of reproducibility, but a supportive institutional infrastructure (e.g., training, mentoring, funding, publishing) is needed.
Silberberg said a senior scientist can have difficulty accepting that the work they have done over the past several decades might not have been of the highest quality. Coller added that having buy-in from senior investigators is essential to effect culture change, and senior investigators are needed as champions as well. Training grants are important, he said, but institutional culture is not defined by trainees and early career investigators.

Arturo Casadevall called on the scientific elite to step up and require rigorous research. “Most scientists today want to do rigorous good science, and the problem is they are caught in a system in which they are not judged by it,” he said. As discussed, much of the effort to change the culture of research has been from editors and grassroots efforts, he said, and few leading scientists have taken a stand on this issue. He lamented that the most respected scientific leaders “are often the most silent,” and are hesitant to criticize the system through which they have come up.

[12] See https://sites.google.com/view/evir-funders-forum/home (accessed November 20, 2019).
[13] See https://alzped.nia.nih.gov (accessed November 20, 2019).

Cross-Sector Coordination

John Gardenier agreed with the emphasis on changing the research culture and noted in particular the need to address the impact of silos in research, including the potential for conflicting information being published by different scientific disciplines. Macleod said he had experience with different disease research communities each asserting that the others had issues with research rigor, but they did not. He said he encourages them to do a systematic review of the quality of reporting in their field, and many come to the conclusion that they do, in fact, have a problem.

Checklists and Study Design

Swaminathan said there is still a lack of general consensus regarding what is a “good study.” She said she and others believe the four items recommended by Landis and colleagues “not only should be reported, but should be incorporated into study design.” Most studies, however, do not incorporate these elements, and if these elements are reported in a publication, it is generally due to an enforcement and compliance mechanism. Coller agreed and observed that there is often agreement on what constitutes “bad science” and perhaps driving consensus on what is bad research form is a place to start.

Veronique Kiermer said that the ultimate goal is conducting well-designed studies, and that addressing study design through the implementation of a reporting checklist is a “very convoluted” approach.
However, she was impressed by Swaminathan’s data that showed researchers were continuing to use the reporting checklists in their ongoing work, suggesting that checklists do have an educational aspect. Macleod said institutions can assess how their research measures up against a checklist, then work to improve in areas that are deficient, and reward the investigators who contribute to that improvement with promotion and tenure.

Steven Goodman said the checklists discussed are not user friendly. He supported the concept of prioritizing key items and proposed pilot testing checklists to determine what might be useful for end users.

Another problem, as illustrated by dual evaluation of the MDAR checklist discussed by Macleod, is the lack of agreement by checklist assessors on the compliance of a manuscript. Swaminathan emphasized the need for better coordination of concepts and language across the different stakeholders and different stages of the research process. She noted that a goal of the Minimal Standards Working Group was to establish a minimum standard that would be applicable across the research life cycle. Macleod noted that there have been efforts to coordinate the language between MDAR and the Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines so that relevant sections are interoperable, and that tools are in development to automate assessment of checklists. He added that disagreement among assessors and peer reviewers on whether a paper meets a particular standard is often related more to the assessor’s lack of understanding of the concepts (e.g., the unit of assessment, biological versus technical replicates) than the language used in the checklist instrument. Swaminathan agreed and said in implementing the Nature Research checklist, for example, they found that authors conflated the experimental unit with the number of times an experiment had been replicated, presented aggregate data from multiple experiments as if from a single experiment, and confused technical and biological replicates. The extent to which this occurred varied by field, more so in fields “that are inherently qualitative and descriptive, but that as science has evolved, have been forced into a quantitative mold.”

Assessment and Accountability

Yarimar Carrasquillo observed that the discussions have focused on training for students and early career investigators, and checklists for reporting studies, and suggested that institutions and funding agencies need to also implement checkpoints between those stages. Just because training requirements have been met does not mean investigators continue to practice the principles.
For example, institutions and funding agencies could assess and report whether trainees are actually conducting experiments that incorporate the four items recommended by Landis.

Silberberg said the NIH peer-review process now requires investigators to discuss the rigor of the prior research they are citing as key support of their application. As researchers often cite their own prior work, this necessitates that they acknowledge the shortcomings. He reiterated that research is cyclical, and researchers will come to understand that it is to their advantage to conduct rigorous studies that they can then cite in their next grant application.

Carrasquillo emphasized the need for accountability, and proposed a quantitative approach, with the results taken into account in funding renewal and promotion evaluations. For example, investigators could be required to report what percentage of a laboratory’s studies included blinding, inclusion–exclusion criteria, randomization, and sample size calculation, and could note the reason why an element might not have been done for a particular study. Silberberg noted the limitations of a quantitative approach, for example, not all experiments can or should be blinded, and the quantitative aspect is lost once investigators can offer an explanation for each study that does not comply. He added that different standards would need to apply to exploratory versus hypothesis testing or confirmatory studies. Coller pointed to the importance of mentorship and the need to take into account “the subtle distinctions” that do not fit into a cell on a spreadsheet, but that are important in the evaluation process.

Goodman proposed evaluating scientists and institutions at both ends of the performance spectrum, “not just on their best research, but by their worst.” In other words, he elaborated, it would be important to acknowledge when an investigator’s “worst” research is still of high quality. If there were some quantification, it would be understood that some research would likely fall at the bottom of the quality scale. However, a few high-impact studies would not balance an overall portfolio of “ignorable” work, he said.

Macleod said the EQIPD project is developing a quality management system that will allow laboratories to self-evaluate, implement measures to improve performance (e.g., designate a quality improvement champion, develop a strategy), and self-assert that their performance is in compliance with the requirements of the scheme. A laboratory will then have a badge as evidence of their performance level, which can be used when applying for grants, submitting manuscripts, or recruiting, for example. The system will be open source and will link supporting resources, including templates.
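To make the quantitative report that Carrasquillo proposed in this discussion more concrete, the tally could be sketched roughly as follows. This is only an illustration, not a method presented at the workshop: the record layout, the element identifiers, and the four example studies are all hypothetical, and the free-text reasons reflect Silberberg's caution that a raw percentage cannot capture why an element was legitimately omitted.

```python
from dataclasses import dataclass, field

# The four elements named in the proposal; these identifiers are illustrative.
ELEMENTS = (
    "blinding",
    "inclusion_exclusion_criteria",
    "randomization",
    "sample_size_calculation",
)


@dataclass
class Study:
    """One study in a laboratory's portfolio (hypothetical record layout)."""
    name: str
    included: set  # elements the study incorporated
    reasons: dict = field(default_factory=dict)  # element -> why it was omitted


def element_percentages(studies):
    """Return, for each element, the percentage of studies that included it."""
    if not studies:
        raise ValueError("at least one study is required")
    total = len(studies)
    return {
        element: 100.0 * sum(element in s.included for s in studies) / total
        for element in ELEMENTS
    }


def omissions(studies):
    """List (study, element, reason) for every element a study did not include."""
    return [
        (s.name, element, s.reasons.get(element, "no reason given"))
        for s in studies
        for element in ELEMENTS
        if element not in s.included
    ]


# Hypothetical portfolio, invented purely for illustration.
portfolio = [
    Study("study A", set(ELEMENTS)),
    Study(
        "study B",
        {"inclusion_exclusion_criteria", "randomization", "sample_size_calculation"},
        {"blinding": "treatment visible to the experimenter"},
    ),
    Study("study C", {"blinding", "randomization"}),
    Study(
        "study D",
        {"inclusion_exclusion_criteria"},
        {"randomization": "exploratory study"},
    ),
]

percentages = element_percentages(portfolio)
missing = omissions(portfolio)

for element, pct in percentages.items():
    print(f"{element}: {pct:.0f}% of studies")
for study_name, element, reason in missing:
    print(f"{study_name} omitted {element}: {reason}")
```

Keeping the reasons alongside the percentages is a deliberate choice in this sketch, so that an evaluator sees both the number and the context that, as Coller noted, does not fit into a cell on a spreadsheet.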
Training in Systematic Review

Goodman suggested that trainees and young investigators need to be empowered with the skills to conduct methodologic meta-research. Journal clubs, he said, can be an opportunity to identify methods that could be systematically reviewed in depth. This would give students an opportunity to potentially publish a paper, but it also allows them to contribute to improving methodology in their own field, he said. Students become “sensitized to the weaknesses” in the literature in their field and are empowered and motivated to contribute to change. Macleod said the doctoral program in neuroscience at the University of Edinburgh is doing this by having small groups of students conduct a systematic review of the models they will use in their laboratory research. He shared an example of a student who had then applied this in her work.

A participant from the NIH Division of Biomedical Research Workforce said that changes are forthcoming in 2020 that are designed to ensure that NIH training grants include resources for training in rigor, reproducibility, and data science.

Reporting Metadata

Anne Plant, a fellow at the National Institute of Standards and Technology, emphasized the importance of recording and reporting metadata to help deal with the uncertainty around the data. “A measurement consists of two things,” she said, “a value … and the uncertainty around that value.” There is uncertainty in each step of the research process, and while steps can be taken to reduce uncertainty, it is never eliminated. Reporting the metadata as well as the meta-analyses done, and actions taken to reduce the uncertainty in the data that are collected and reported, allows researchers to “know what is known,” and with what level of confidence. Coller pointed out that using an electronic notebook provides version control and audit trails and suggested that notebooks could be made available as supplementary material to a publication. Plant said researchers need a tool that would allow them to easily capture and collate all of the metadata around their protocol in real time, not after the fact.

Including Other Stakeholders

Coller prompted participants to consider other stakeholders that should be included in the discussions of transparency and reproducibility. He asked whether a checklist might help those who report on scientific advances to be “better informed about how they write about science,” or whether a checklist could help the general public better judge the quality and understand the uncertainties of the many studies in the news.

The Press

Macleod mentioned that the UK government inquiries into the practices of the British press included inquiries into press coverage of scientific issues. Reports about what causes or cures a disease one day often contradict what was reported the previous day. During the inquiry, Macleod said, it was found that the content of news articles was often taken directly from press releases issued by research institutions. While there are issues to be addressed regarding press coverage of scientific information, he said much of the responsibility for what is reported in the press lies directly with research institutions.

Biomedical Research Investors

Macleod suggested that another stakeholder group in need of quality information about biomedical research is investors. “Those that invest in our pharmaceutical industry are completely uninformed, unaware, and unconcerned about the quality of the biomedical research endeavor,” he said. Referring to his earlier example of NXY-059, which was effective in animal models but failed in a large clinical trial, he said that the manufacturer’s share price fell by 17 percent, a value of $9.6 billion, over the 2 days after publication of the study results, and it took 7 years to recover.

The Pharmaceutical Industry

Silberberg said another stakeholder is the pharmaceutical industry. He noted that the EQIPD project is a good example of how academia and industry can work together to share data, resources, and expertise to advance product development.

Public Health

Gardenier identified public health as another community with a stake in the quality of biomedical research. Caregivers, community hospital groups, nursing homes, and others in public health administer the benefits of biomedical research to the public.

Evaluating Quality Initiatives

Macleod stressed the importance of evaluating research quality initiatives to show that an intervention is achieving the intended outcome. Depending on the type of intervention, a manufacturing process control chart could be used to monitor change, or a randomized controlled trial might be needed to determine benefit. A challenge, he said, is that vocabulary and methodology do not yet exist for this type of “research on research.” He noted the need to proceed cautiously and in a scientific way when “demanding our colleagues and peers change their practice.” He suggested developing the science and methodology and collecting evidence of the impact of interventions to effect lasting change to research practice.
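As one possible illustration of the process control chart idea mentioned in this section, the sketch below sets three-sigma limits for a proportion (a "p-chart") from a baseline period and then checks later periods against them. The choice of a p-chart, the monitored quantity (the share of a journal's or institution's papers that report randomization), and all of the counts are assumptions made for the example; none of them come from the workshop.

```python
import math


def baseline_proportion(counts, sizes):
    """Pooled proportion of compliant papers across the baseline periods."""
    if len(counts) != len(sizes) or not sizes:
        raise ValueError("counts and sizes must be non-empty and of equal length")
    return sum(counts) / sum(sizes)


def control_limits(p_bar, n):
    """Three-sigma lower and upper limits for a proportion with sample size n."""
    sigma = math.sqrt(p_bar * (1.0 - p_bar) / n)
    return max(0.0, p_bar - 3.0 * sigma), min(1.0, p_bar + 3.0 * sigma)


def signals_change(count, n, p_bar):
    """True if the observed proportion falls outside the control limits."""
    lower, upper = control_limits(p_bar, n)
    proportion = count / n
    return proportion < lower or proportion > upper


# Hypothetical baseline: papers reporting randomization in four quarters,
# out of 50 papers sampled per quarter.
baseline_counts = [12, 15, 10, 13]
baseline_sizes = [50, 50, 50, 50]
p_bar = baseline_proportion(baseline_counts, baseline_sizes)

# Hypothetical quarters observed after an intervention such as a new checklist.
small_shift = signals_change(14, 50, p_bar)   # 28% of papers
large_shift = signals_change(30, 50, p_bar)   # 60% of papers

print(f"baseline proportion: {p_bar:.2f}")
print(f"limits for n=50: {control_limits(p_bar, 50)}")
print(f"small shift flagged: {small_shift}, large shift flagged: {large_shift}")
```

A point outside the limits only indicates that something changed relative to the baseline; attributing that change to the intervention would still call for the kind of controlled evaluation Macleod described.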


Sharing knowledge is what drives scientific progress - each new advance or innovation in biomedical research builds on previous observations. However, for experimental findings to be broadly accepted as credible by the scientific community, they must be verified by other researchers. An essential step is for researchers to report their findings in a manner that is understandable to others in the scientific community and provide sufficient information for others to validate the original results and build on them. In recent years, concern has been growing over a number of studies that have failed to replicate previous results and evidence from larger meta-analyses, which have pointed to the lack of reproducibility in biomedical research.

On September 25 and 26, 2019, the National Academies of Sciences, Engineering, and Medicine hosted a public workshop in Washington, DC, to discuss the current state of transparency in the reporting of preclinical biomedical research and to explore opportunities for harmonizing reporting guidelines across journals and funding agencies. Convened jointly by the Forum on Drug Discovery, Development, and Translation; the Forum on Neuroscience and Nervous System Disorders; the National Cancer Policy Forum; and the Roundtable on Genomics and Precision Health, the workshop primarily focused on transparent reporting in preclinical research, but also considered lessons learned and best practices from clinical research reporting. This publication summarizes the presentations and discussions of the workshop.
