Research misconduct, irreproducible results, retractions, conflicts of interest, and other issues surrounding the scientific process can erode public trust in science and scientists. To what extent do these violations of research integrity threaten the reputation of science? How can they be distinguished from the self-correcting nature of science, which tries to replicate and falsify existing findings in the pursuit of new knowledge?
Furthermore, what are the best ways to describe and explain these issues so that people better understand the context and norms of science?
The dominant narrative in the media’s coverage of science is the quest for discovery, observed Kathleen Hall Jamieson, Elizabeth Ware Packard Professor of Communication at the University of Pennsylvania’s Annenberg School for Communication. This narrative is not about problems, crises, or science being self-correcting. However, the self-correcting processes of science may be fueling a new narrative that science is broken, she added.
Analysis of science news articles from USA Today, The Wall Street Journal, The New York Times, and The Washington Post from May 2016 to April 2017 revealed that the articles tend to contain words like breakthrough, advance, path-breaking, and paradigm-shifting. Most included a discovery or finding, credited the involved scientists or institutions, and noted the significance of the finding. Few mentioned past failures or false starts. The result, said Jamieson, is a mistaken impression that science is a linear process that goes from intuition to discovery without intervening difficulties.
A potential problem with the dominant news narrative is that not covering false starts or tentative conclusions that later change may give people the impression that difficulties are an aberration in the scientific process. An October 18, 2013, article in The Economist typified this framing with the headline “Trouble at the lab: Scientists like to think of science as self-correcting. To an alarming degree it is not.” Readers of such an article can easily miss the fact that scientists found those problems, disclosed them, and were in the process of correcting them, said Jamieson.
In a study of the “science is broken” narrative, Jamieson and her colleagues conducted a search of news databases for the term “science” near any of the terms “crisis,” “broken,” “problem,” “self-correction,” “retraction,” “replication,” “peer review,” “scandal,” and “fraud/fake” from April 2012 to April 2017. They found 121 articles and opinion pieces, which, after duplicates and unrelated stories were eliminated, yielded 76 articles for analysis. Of these articles, 31 focused solely on a new scientific finding about a problem in science, 26 were authored by a scientist, 22 noted that science is self-correcting or made a comparable statement and recommended at least one solution, and 5 indicated that the problem is exaggerated or not real. The overall conclusion was that the articles identified scientists as the ones uncovering problems but largely failed to note that they are also the ones moving to solve them. In short, scientists are playing a role in perpetuating the “science is broken” narrative, Jamieson said.
The “science is broken” narrative hinges on the issue of trust, observed Susan Fiske, Eugene Higgins Professor of Psychology and Public Affairs at Princeton University, and trust is complicated to maintain. Trust is crucial and adaptive, but it is also relative and fragile. Trust improves joint outcomes, increases effectiveness and accuracy in judging risks, and helps society progress. Societies with higher baseline levels of trust “are more economically successful because you trust that a stranger is going to live up to the contract,” said Fiske.
For scientists, establishing trust is crucial to communicating credibility, said Fiske. This credibility rests on both expertise, which people believe that scientists have, and trustworthiness, which depends on perceptions of a person’s motivation to be truthful. These perceptions are more of a gut response, Fiske noted, as opposed to evaluating expertise.
To the extent that people see scientists as being on one side of an issue in a polarized climate, trustworthiness will be a consideration. In that case, people are going to ask, for example, Are you on my side or not? Are you a friend or a foe? “If you’re on my side and you share my goals, then you’re warm and trustworthy,” Fiske said. People make this decision rapidly and do so in similar ways over different cultures and across time.
Expertise (or competence) and warmth are orthogonal in that they vary independently and together form a two-dimensional space that makes intuitive sense (Fiske and Dupree, 2014). For example, nurses, teachers, and doctors are seen as being high in competence and warmth, whereas laborers, customer service agents, and fast food workers are seen as being low in competence and warmth. Scientists and engineers are seen as highly competent but only moderately warm. This moderate score on the warmth scale, Fiske suggested, is because people do not know whether scientists are trustworthy or not. If scientists are viewed as elites, they could be seen as exploitative. “The downside risk for scientists, if we don’t get this right, is that people resent elites and dehumanize elites as unfeeling machines,” Fiske said. “That would be a bad place for us to end up.”
For now, public confidence in science remains high and stable, according to polls. “But compared with whom?” Fiske asked. “We are better than lawyers and CEOs—and better than Congress—but worse than doctors, teachers, and nurses, whose motivation, people think, is to be in their professions to help other people. We’re about the same as accountants and the military.”
Fiske and her colleagues have taken a different approach to gauge the perceived trustworthiness of scientists. They asked people what attributes spontaneously came to mind when they thought about scientists. The most frequently cited attributes were smart, intelligent, educated, and curious, followed by such terms as thoughtful, organized, rational, nerd,
male, and disciplined (Nicholas and Fiske, in preparation). But almost none of the terms involved trustworthiness and warmth. “This suggests that we’re coming out in the middle of the trustworthiness-warmth scale because it’s absent,” Fiske pointed out.
Perceptions of trust are more easily lost than perceptions of competence, Fiske emphasized. “Think about a close relationship. If one person betrays the other person, it’s really hard to come back from that.” Furthermore, describing scientists as competent may implicitly pass judgment on their trustworthiness, in the same way that describing someone as “nice” may cast innuendo on that person’s competence.
“Our reputation is at stake,” Fiske concluded. Remaining trustworthy requires consistency over time in safeguarding perceptions of reliability and trustworthiness. Paraphrasing Jamieson, Fiske noted that “when there’s no narrative, you’d better find one, because otherwise somebody else might find it.”
Kevin Finneran, editor-in-chief of Issues in Science and Technology, agreed that “trustworthiness is something we have to be responsible for ourselves. We have to look at all the things we can do within the research process and the operation of science to make our work as reliable and believable as possible.”
In 1992 the National Academy of Sciences (NAS), the National Academy of Engineering, and the Institute of Medicine published the report Ensuring the Integrity of the Scientific Research Process, which focused on threats to the research process from fabrication, falsification, or plagiarism, collectively termed “scientific misconduct” by the report (NAS et al., 1992). A quarter century later, the National Academies of Sciences, Engineering, and Medicine published Fostering Integrity in Research, which revisited many of the issues covered by the previous report (NASEM, 2017c). Fostering Integrity in Research noted that the research enterprise has changed markedly in the quarter century since the earlier report. It is larger, more complex, more regulated, more dependent on information technology, more oriented toward commercialization, more international, and more relevant to policy decisions. All of these changes make maintaining the integrity of research more challenging, noted Finneran. Collaborators from other countries may have different standards of review and responsibility. Researchers from different disciplines may have different ways of listing authors or making data available. “There are a variety of things that we have to stay on top of and negotiate,” he said.
Both reports noted that reliable data about the extent of scientific misconduct are difficult to gather. The number of misconduct cases identified
by the National Science Foundation and the National Institutes of Health (NIH) has remained small. However, in a survey of research psychologists, between one-quarter and one-half admitted engaging in practices such as failing to report all of a study’s conditions, selectively reporting only studies that “worked,” or reporting an unexpected finding as having been predicted from the start (John et al., 2012). Another survey of researchers with funding from NIH found that 10 percent admitted practices such as inappropriately assigning authorship credit or withholding details of methodology or results in papers or proposals (Martinson et al., 2005). A meta-analysis of researcher surveys concluded that 2 percent admitted to falsifying or fabricating data at least once and 14 percent were aware of a colleague doing so (Fanelli, 2009).
Of the three transgressions identified as scientific misconduct, plagiarism seems to be declining as the use of software to detect copied text increases, Finneran noted. However, the growing number of author-pays open-access journals could increase the risk of plagiarism, because many of these journals probably are not running software to identify plagiarism.
The number of retractions has increased sharply in the past decade, and an analysis by Fang et al. (2012) of articles on PubMed found that two-thirds of retractions are due to misconduct. But retraction rates are far from a perfect measure of misconduct, Finneran cautioned. On one hand, retraction numbers are very low, and retraction has become a common practice only recently; on the other hand, many fraudulent papers are never retracted, and other proxies for misconduct have also been rising in recent years.
Irreproducibility, or the inability to reproduce research results, has also been receiving more attention. For example, Ioannidis (2005) found that systematic biases led to false-positive findings in more than half of published studies, and an effort to replicate 100 psychology studies found that the mean effect size of the replications was about half of what was reported in the original articles (OSC, 2015). However, not all research is reproducible, Finneran observed. Some conditions, such as natural phenomena, cannot be replicated. Clinical trials involve individual patients, and a new cohort will differ in unspecifiable ways. When a researcher fails to reproduce someone’s work, the problem could be with the reproducer rather than with the original research.
The bottom line, said Finneran, is that scientists still do not know how much misconduct is taking place. “But the important message of all of this is that we’re not as good as we would like to be. We need to work harder.”
Fostering Integrity in Research concluded that more attention needs to be focused on what it called detrimental research practices, or deviations from best practices that increase the likelihood of error and vulnerability to misconduct. These are what Finneran called “less-than-ideal, easy ways of doing science that’s not quite reliable enough,” including
- Detrimental authorship practices, such as honorary authorship, demanding authorship in return for access to previously collected data or materials, or denying authorship to those who deserve it;
- Not retaining or making available data, code, or other information/materials underlying research results as specified in institutional or sponsor policies, or standard practices in the field;
- Neglectful or exploitative supervision in research;
- Misleading statistical analysis that falls short of falsification;
- Inadequate institutional policies, procedures, or capacity to foster research integrity and address research misconduct allegations, and deficient implementation of policies and procedures; and
- Abusive or irresponsible publication practices by journal editors and peer reviewers.
The research community needs to be aware of these practices and address them, said Finneran. It also needs to communicate the fact that it is taking such steps to enhance its trustworthiness.
When Marcia McNutt, the 22nd president of the NAS, was editor-in-chief of Science, one of her crusades, she said, was to improve practices within scientific publishing to work against violations of the norms of research and to maintain trust in science. A workshop organized by the leaders of Science, Nature, and NIH brought together the editors of top journals, the heads of major universities, and researchers to discuss reproducibility in preclinical research (McNutt, 2014). Recommendations emerging from that meeting were codified in a checklist and have been adopted by more than 500 other journals. For example, the recommendations call for preexperimental plans for collecting and handling data, with sample size estimations to ensure appropriate signal-to-noise ratios, randomization in the treatment of samples, blind conduct of experiments, and transparency in reporting. “If one of the samples was removed and you decided, ‘I’m not going to analyze this sample’ and you have a good reason, you have to say so. ‘I removed this sample because we found out something about this sample that was a legitimate reason to take it out of the analysis.’”
Based on the success of that workshop, a second workshop supported by the Laura and John Arnold Foundation focused on transparency as a way to address the reproducibility issue in science. That workshop led to a paper proposing what came to be known as the Transparency and Openness Promotion (TOP) guidelines, which were designed to support norms of sharing data, facilitating review and replication, and reducing
bias (Nosek et al., 2015). More than 5,000 journals and organizations have agreed to the standards, making them “probably the most widely propagated standards on the planet for increasing transparency and increasing replication possibilities for science,” said McNutt. The standards cover eight issues:
- Citation standards,
- Data standards,
- Analytic methods (code) transparency,
- Research materials transparency,
- Design and analysis transparency,
- Preregistration of studies,
- Preregistration of analysis plans, and
- Replication.
For each standard, journals can sign on at different levels. With the data standard, for example, the first level may simply require a journal to disclose whether data are available or, if they are not, why not. The next level may require data to be available if an article is to be published. The next level may require that the journal verify that the data have been deposited in an appropriate repository and check to see if they can be accessed. In some cases, journals may be required to access the data and ensure that they support the conclusions in the paper.
Individual disciplines may have more detailed standards. For example, a workshop on reproducibility in computational methods established the following goals (Stodden et al., 2016):
- Share data, software, workflows, and details of the computational environment in open repositories;
- Publish persistent links with a permanent identifier for data, code, and digital artifacts upon which the results depend;
- Cite shared digital scholarly objects;
- Document digital scholarly artifacts; and
- Use open licensing when publishing digital scholarly objects.
In the same issue of Science that contained the TOP guidelines, Alberts et al. (2015) recommended increased transparency and openness in research, the provision of more incentives for reviewing, greater recognition of excellence in reviewing, the establishment of at least two classifications of retractions to distinguish honest mistakes from fraud, and improved language in conflict-of-interest declarations. Most of these recommendations are now in the process of being instituted, McNutt reported.
Finally, McNutt et al. (2018) called for journals to adopt common and transparent standards for authorship, outline responsibilities for corresponding authors, adopt the CRediT (Contributor Roles Taxonomy) methodology for attributing contributions, and encourage authors to use the digital persistent identifier ORCID. The paper also recommended that institutions have regular conversations about the criteria for earning authorship on a paper.
These recommendations support those in Fostering Integrity in Research, which recommended that authors disclose their roles and contributions and that institutions establish policies that will thwart detrimental research practices. In addition, Fostering Integrity in Research urged that a Research Integrity Advisory Board be created to serve as an organizational focus for best practices and standards.