The second workshop panel focused on the state of the science on trust from three perspectives: an individual’s level of trust with respect to people, with respect to data and sources, and with respect to the output of machines and models. Moderator Fran Moore, CENTRA Technology, Inc., commented on the importance of this discussion of trust, noting that her years of experience in the Intelligence Community (IC) have taught her that useful and actionable information from intelligence analysis requires both collaboration among analysts and access to reliable data and sources. Therefore, she said, building trust in every aspect of the intelligence analysis process is critical to generating the kind of information policy makers need in order to make decisions of national security significance.
David Dunning, University of Michigan, discussed the dynamics of trust, with emphasis on what is already known and what future research is needed. He defined trust as “making yourself vulnerable with the prospect of potentially getting [some] gain back,” and pointed out that trust is necessary for many kinds of relationships and interactions—personal, organizational, and societal.
According to Dunning, research has found that, despite repeated findings that people can be cynical about others, people actually do trust each other. He provided the example of a behavioral economics experiment to illustrate this point. In a game scenario, Player 1 is given $5. Player 1 can either keep the $5 or give it to Player 2, a stranger with whom Player 1 has been paired. If Player 1 gives the $5 to Player 2, the money will increase to $20, and Player 2 will have to decide whether to keep the entire $20 or return $10 to Player 1. Dunning presented a traditional economic hypothesis: Player 1 should not give the money because there is no assurance any money will be returned, and Player 2, in turn, should keep the $20 because there is no incentive to give money back. Yet Dunning reported that this game has been played nearly 4,000 times in various settings, and in every setting some percentage of both Player 1s and Player 2s gave up the money. He added that when players in some studies based on this experiment were asked what percentage of people they believed would return $10 to Player 1, the average answer was approximately 45 percent, whereas the experimental data showed this percentage to be 80 percent.1 More interesting, he said, a majority of Player 1s gave their $5 to Player 2 even though, on average, they believed they were more likely than not to get no money back.
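The payoff structure Dunning described can be sketched as a short simulation. The 45 and 80 percent figures come from the study as reported; the code itself is only an illustration of the game's incentives, not part of the original experiment:

```python
def trust_game(p1_gives: bool, p2_returns: bool) -> tuple[int, int]:
    """One round of the game: returns (Player 1 payoff, Player 2 payoff).

    Player 1 starts with $5. Passing it on turns it into $20 for
    Player 2, who may keep all of it or return $10.
    """
    if not p1_gives:
        return 5, 0    # Player 1 keeps the $5
    if p2_returns:
        return 10, 10  # the $20 is split: $10 comes back to Player 1
    return 0, 20       # Player 2 keeps everything

def expected_gain_from_trusting(return_rate: float) -> float:
    """Player 1's expected payoff from giving, for a given share of
    Player 2s who return the $10."""
    return 10 * return_rate

# Participants guessed ~45 percent of Player 2s would return the money;
# the observed rate was ~80 percent.
print(expected_gain_from_trusting(0.45))  # below the safe $5: trusting looks like a losing bet
print(expected_gain_from_trusting(0.80))  # above the safe $5: trusting pays off
```

On participants' own stated beliefs, giving was a losing bet, which is why Dunning treats the observation that a majority gave anyway as evidence for a component of trust not captured by expected payoffs.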
In another analysis by the same researchers, Dunning continued, risk tolerances were used to predict how many players would trust others. He pointed out that risk tolerance can be measured and can vary among people. The gamble presented in the game is hypothesized to be within the range of risk tolerance for some but not others. Research, he said, has shown that some people are unduly cynical about their peers, while at the other end of the spectrum are some people who are very optimistic about their peers. He explained that when measures of risk tolerance and optimism toward others are used to assess the propensity of people to trust, the findings predict that approximately 30 percent of people should trust other players in the game described above. Once again, however, the experimental data revealed higher percentages of people who actually do trust other players—generally between 50 and 70 percent.
Dunning drew three conclusions from the results of these experiments. First, trust does not follow intuition. Intuitive predictions about trust behavior made by either researchers or research subjects themselves often turn out to be incorrect. Second, behavior does not follow cognition. There are many research participants who do not expect to be rewarded for their trust (cognition), yet who trust anyway by giving away the $5 (behavior). Third, trust involves both a risky component and a less well understood "riskless" component, which cannot be predicted by the risk tolerance of participants.
Dunning introduced another key point about trust—that it is open to social pressures and influences that are incompletely understood. He used the example of the World Values Survey,2 which asks participants whether they consider most people to be trustworthy. According to Dunning, not only do people tend to trust others, but the level of trust seen across individual countries is closely correlated with a country's level of economic development. However, he continued, the definition of "most people" varies dramatically across countries. People in some countries, including Thailand, Morocco, and China, he elaborated, have a small radius of others whom they consider trustworthy, primarily family and neighbors, while those in other countries, including Switzerland, Italy, Australia, and Sweden, have a much larger trust radius, encompassing people outside their country, those of other religions, and complete strangers.3 Dunning noted that social and behavioral scientists are only beginning to understand the dynamics at play in these differences.
1 Fetchenhauer, D., and Dunning, D. (2009). Do people trust too much or too little? Journal of Economic Psychology, 30, 263–276.
Dunning then highlighted four research directions that could be followed to further understanding of the dynamics of trust:
- Trust across different social entities: Dunning stated that the dynamics of trust decisions are dependent on the types of entities involved (individuals, groups, or cultures), as well as on the relationship between the entities (friend, spouse, stranger, or enemy). He believes it would be interesting to explore how trust is produced or reduced in each of these situations.
- Trust across different issues: Dunning pointed out that most of the research on trust comes from economic models, where the issue of trust involves the exchange of material goods or money. He suggested that the dynamics of trust may be very different if the currency is something besides money—information a person provides, for example. He pointed to the many open questions surrounding the dynamics of trust in information, including whether an individual’s trust in the data provided by another is more dependent on the data provider’s ethics or his/her expertise.
- Trust across different communication modes: Dunning observed that methods of communicating remotely are a relatively recent societal development and have provided new ways for people to interact with each other. He pointed out that it is unknown whether people exhibit differences in trustworthy behavior or in their level of trust in others when remote communication methods are employed.
- Preventing too much trust (gullibility): Dunning argued that, given the amount of misinformation that exists, research should also be concerned with helping individuals be skeptical. In his words, "We need to know how to trust the right people, but also how to guard ourselves against people who might have ill will toward us, or maybe are those rational actors who are only concerned about themselves."
3 Delhey, J., Newton, K., and Welzel, C. (2011). How general is trust in "most people?" Solving the radius of trust problem. American Sociological Review, 76(5), 786–807.
Dunning concluded by citing a New York Times article that emphasized the importance of trust as a component of relationships.4 The article, said Dunning, explored the factors that impelled patients to follow the recommendations of their doctors, reporting that the best predictor of compliance was the patient’s trust in the doctor. In Dunning’s words, “Trust might be something that we underrate simply because it’s everywhere, and it’s so everywhere that it’s somewhat invisible, perhaps a little bit like oxygen, but just as necessary.”
Roger Mayer, North Carolina State University, discussed the components of trust and trustworthiness, focusing on what is currently known about interpersonal trust and the direction he believes future research in this area should take. He began by pointing out that the word “trust” appears frequently in the news, but despite its common use, people define it in many different ways. He argued that standardizing how the term is defined is essential to studying the concept scientifically.
To clarify his working definition of trust, Mayer described a model proposed in 1995,5 which identifies and defines some of the commonly used constructs surrounding the idea of trust:
- Propensity to trust others: Mayer noted that a general willingness to trust other people is a propensity believed to remain stable within a person over time.
- Trust: Mayer explained that this construct, defined as willingness to be vulnerable to or to take a risk with a specific person (the trustee), is a behavioral intention, not an action.
- Risk taking in relationships: Mayer noted that this construct is defined as the risk-taking action, or actual trusting behavior, that occurs as a result of behavioral intention.
- Trustworthiness: Mayer stated that this construct comprises the set of characteristics of a trustee that makes the trustor willing to be vulnerable to the trustee. He added that three replicable characteristics make up trustworthiness: (1) ability—the trustee is perceived to have the skills to accomplish what is needed; (2) benevolence—the trustee is perceived to care about the welfare of the trustor; and (3) integrity—the trustee is perceived to follow consistently a set of values that the trustor finds acceptable. On this last point, Mayer noted that both the values themselves and the consistency in following them are important.
- Risk: Mayer explained that this construct denotes the amount of harm or gain that might occur as a result of trusting a trustee, which is distinct from the construct of the trustworthiness of the trustee.
4 Khullar, D. (2018). Do you trust the medical profession? The New York Times, January 23. Available: https://www.nytimes.com/2018/01/23/upshot/do-you-trust-the-medical-profession.html [February 2018].
5 Mayer, R.C., Davis, J.H., and Schoorman, F.D. (1995). An integrative model of organizational trust. The Academy of Management Review, 20(3), 709–734.
Mayer then pointed to a large body of research showing that the three characteristics of trustworthiness outlined above—ability, benevolence, and integrity—contribute independently and significantly to trust in a specific party, along with the trustor’s intrinsic propensity to trust.6 Trust, in combination with the amount of risk present in a given situation, he explained, determines whether risk-taking behavior occurs. He added that, should the risk-taking behavior occur, it will inevitably result in outcomes that will contribute to a feedback loop as the trustor uses the outcomes to reevaluate the ability, benevolence, and integrity of the trustee. This loop, he said, along with the propensity of the trustor to trust others, particularly early on in the relationship, explains the evolution of trust over time. In explaining this model, Mayer made two additional points. First, constraining risk is different from increasing trust; and second, the entire trust process occurs within a context, which may also play a role in the development of trust.
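The structure of the model as described above (three trustworthiness factors plus propensity feeding a trust level, trust weighed against situational risk, and outcomes feeding back into perceptions) can be sketched in code. The equal weights and the simple update rule below are illustrative assumptions, not part of the published model:

```python
from dataclasses import dataclass

@dataclass
class Trustor:
    propensity: float   # stable general willingness to trust, 0..1
    ability: float      # perceived characteristics of the trustee, 0..1
    benevolence: float
    integrity: float

    def trust(self) -> float:
        """Each factor contributes independently to trust in the trustee
        (equal weights assumed here for illustration)."""
        return (self.propensity + self.ability
                + self.benevolence + self.integrity) / 4

    def takes_risk(self, situational_risk: float) -> bool:
        """Risk-taking behavior occurs when trust outweighs the amount
        of risk present in the situation."""
        return self.trust() > situational_risk

    def update(self, good_outcome: bool, rate: float = 0.1) -> None:
        """Feedback loop: outcomes of risk taking revise the trustor's
        perceptions of the trustee's ability, benevolence, and integrity."""
        delta = rate if good_outcome else -rate
        self.ability = min(1.0, max(0.0, self.ability + delta))
        self.benevolence = min(1.0, max(0.0, self.benevolence + delta))
        self.integrity = min(1.0, max(0.0, self.integrity + delta))
```

Note that propensity stays fixed while the three perceptions move with outcomes, which matches Mayer's point that propensity matters most early in a relationship, before the feedback loop has produced much evidence.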
Mayer outlined three important conclusions that emerged from research on trust using this model. First, trust affects the work performance of employees, including their propensity to help each other and their ability to focus attention on the work that needs to be accomplished. Second, trust between groups evolves as each group observes the trustworthiness of the other and reacts in kind. Third, trust in an organization’s leader affects the organization’s performance, including its financial performance. According to Mayer, the model of trust used to obtain these results was designed to work not only at the individual level but also at the group level. Evidence suggests, he added, that this model also applies to trust in the federal government, which, he suggested, is “pretty remarkable, given how amorphous the government is.”
6 Colquitt, J.A., Scott, B.A., and LePine, J.A. (2007). Trust, trustworthiness, and trust propensity: A meta-analytic test of their unique relationships with risk taking and job performance. Journal of Applied Psychology, 92(4), 909–927.
Mayer then argued that, since trust is only one possible human response to risk, future research would be most effective if focused on situations characterized by high levels of risk or on volatile situations in which the level of risk is increasing. He described several specific areas that he believes should be addressed by future research on trust:
- Repairing and rebuilding damaged trust: Mayer noted that research suggests that the process of rebuilding broken trust may be different from that of establishing trust in a new relationship.7
- Trusting humans versus trusting computers: Mayer stressed the importance of understanding the cognitive and behavioral aspects of the choice to trust a machine or a human, especially when the two present conflicting information.
- Trust in and between governments: Inductive research has shown, Mayer stated, that perceptions of a government’s ability, benevolence, and integrity contribute to trust in that government.
- Trust in medical doctors and the medical system: Mayer emphasized the importance of this issue because “for normal citizens, this is one of the most high-risk, day-to-day decisions that they make—what doctors to go see and how much to trust those doctors—because that can literally be a life-and-death decision.”8
- Effects of culture on trust: Societies differ, noted Mayer, in the ways in which trust is gained and destroyed.
- Police–community trust: Mayer suggested that research should address not just how much a community trusts its police force but also how much the police force trusts the community. This line of research is important, he argued, because when police have less trust in the public, they perform their jobs less well, and the public is less safe.
- Interplay of religion, culture, and trust: Mayer stated that how trust in different religions plays out across cultures is a daunting, worldwide issue that needs to be addressed by research.
In closing, Mayer emphasized that future research on trust must be interdisciplinary and that the structures of universities should allow for such collaboration between disciplines.
7 Kim, P.H., Dirks, K.T., and Cooper, C.D. (2009). The repair of trust: A dynamic bilateral perspective and multilevel conceptualization. Academy of Management Review, 34(3), 401–422.
8 Damodaran, A., Shulruf, B., and Jones, P. (2017). Trust and risk: A model for medical education. Medical Education, 51(9), 892–902. doi: 10.1111/medu.13339.
Adam Waytz, Northwestern University, focused on two fundamental issues: understanding why humans distrust machines, and identifying ways to optimize human–machine partnerships to maximize trust. He began by pointing out that the news is filled with stories about an “automated future” and that people are ambivalent about whether these new technologies can be trusted. He noted that many of these news stories are not based on empirical data and have been written primarily by researchers in the fields of technology and economics. Waytz argued that much more emphasis should be placed on understanding the psychological implications of an automated future.
Waytz then discussed several reasons why humans distrust machines. He first described a study using a well-known philosophical scenario called "the trolley problem."9 In this scenario, it is known with 100 percent certainty that there is a runaway trolley coming down the tracks, and that it will kill five people on the tracks unless one person standing on a bridge over the tracks pushes another person off the bridge. Pushing that person off the bridge will stop the trolley, and although that person will die, the act will save the lives of the five people on the tracks. Waytz reported that one-quarter of people in the study said they would be willing to push a person off the bridge. But, he continued, the more interesting finding was that other people in the study did not trust those who were willing to commit this act. Although this was a study of human beings, he asserted, it also provides insight into why people do not trust technology: "Because technology really does a cost-benefit calculation, it compares one life versus five lives, it says five is greater than one, and it pushes." Waytz believes that one reason people do not trust machines is that technology is too cost-benefit oriented, and incapable of implementing such moral rules as "do no harm." Another reason humans distrust technology, according to Waytz, is simply because they do not understand it. He provided the example of a study based on an algorithm that sorts through all the jokes in the world and selects the funniest ones for the listener.10 People in this study were asked whether they preferred to hear jokes selected by the algorithm or by a human, and the majority chose the human, presumably because they did not trust the algorithm to select funny jokes. However, Waytz continued, people in a separate study were presented with jokes chosen by both the algorithm and a human, and it was found that the algorithm was much better at selecting funny jokes. Waytz concluded that the participants did not trust the algorithm simply because they did not understand how it worked.
9 Everett, J.A., Pizarro, D.A., and Crockett, M.J. (2016). Inference of trustworthiness from intuitive moral judgments. Journal of Experimental Psychology: General, 145(6), 772–787.
10 Yeomans, M., Shah, A.K., Mullainathan, S., and Kleinberg, J. (in review). Making sense of recommendations. Management Science. Available: https://scholar.harvard.edu/sendhil/publications/making-sense-recommendations [April 2018].
Waytz then expanded on human distrust of algorithms. He cited research demonstrating that people will choose to place bets on human forecasters over algorithmic forecasters even when they see the algorithmic forecasters outperforming humans.11 He explained that people do so partly because of their mistaken belief that algorithms are not capable of learning. He pointed out that algorithms are, in fact, capable of learning—possibly more capable than humans—but people have difficulty believing that.
Waytz also described some of his own work indicating that people are particularly unlikely to trust machines to perform work that is emotional or social. He used the example of a robotic surgeon, explaining, “people are okay with the robotic surgeon, but not if we say the job of the surgeon is to be compassionate, as well.”
With regard to mitigating people’s distrust of machines, Waytz had four suggestions. First, related to the above-cited work of Yeomans and colleagues, he noted that if people understand how machine learning works, they will have more trust in algorithms. Second, extrapolating from the work of Dietvorst and colleagues referenced above, he postulated that if people are allowed to modify an algorithm, their trust in the ability of the algorithm to make good forecasts will increase. Third, he suggested, since research has shown that people have more distrust in an algorithm when they have expertise in the particular area addressed by that algorithm, reducing overconfidence may increase trust in algorithms. Finally, Waytz noted that the work by his own group suggests that designing machines with such emotional features as a name, an emotional voice, a gender, or human-like facial features could increase people’s willingness to trust machines to perform emotional tasks.
Waytz also identified three ways to consider optimal human–machine partnerships: (1) humans implement moral rules about fairness and equality; (2) humans identify the outliers as machines compute; and (3) machines help reduce the emotional labor. To illustrate the first of these ways, he gave two examples of how humans intervened to reduce bias introduced by algorithms. His first example involved Google Translate. A few years ago, the Turkish phrase "o bir doktor" translated to "he is a doctor." Google engineers recognized this gender bias and changed the translation to "a doctor." Waytz's second example concerned Amazon Prime, which used an algorithm to select the best zip codes in which to offer Prime memberships. The algorithm resulted in Prime memberships not being offered in neighborhoods that were predominantly African American. Amazon discovered this bias and changed the algorithm so that it no longer discriminated on the basis of race.
11 Dietvorst, B.J., Simmons, J.P., and Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114–126.
To illustrate the potential for humans’ identification of outliers to improve computations, Waytz described research from the Massachusetts Institute of Technology on a platform used to detect cyberattacks.12 With this platform, he said, the human analysts provided feedback to the computer regarding false positives, helping the machine learn and thereby decreasing the rate of false positives in the next round.
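The feedback loop Waytz described can be sketched as a toy human-in-the-loop detector. The event scores, labels, and threshold rule below are invented for illustration; the actual AI2 platform retrains a machine learning model on analyst feedback rather than nudging a single threshold:

```python
def detect(events, threshold):
    """Flag events whose anomaly score exceeds the threshold."""
    return [e for e in events if e["score"] > threshold]

def analyst_feedback(flagged):
    """A human analyst labels each flagged event as a real attack or a
    false positive (ground truth stands in for analyst judgment here)."""
    return [(e, e["is_attack"]) for e in flagged]

def retrain(threshold, feedback, step=0.2):
    """Raise the threshold when most of the analyst's labels say the
    flags were false positives."""
    if not feedback:
        return threshold
    false_pos = sum(1 for _, is_attack in feedback if not is_attack)
    if false_pos / len(feedback) > 0.5:
        return threshold + step
    return threshold

events = [
    {"score": 0.9, "is_attack": True},
    {"score": 0.6, "is_attack": False},
    {"score": 0.65, "is_attack": False},
]
threshold = 0.5
flagged = detect(events, threshold)  # 3 flagged, 2 of them false positives
threshold = retrain(threshold, analyst_feedback(flagged))
print(detect(events, threshold))     # only the real attack remains flagged
```

Each round of analyst labeling tightens the detector, which is the sense in which the human feedback "helps the machine learn" and lowers the false-positive rate in the next round.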
To explain machines’ reduction of emotional labor, Waytz provided the example of customer service representatives, who bear a high emotional burden from constantly dealing with frustrated customers. He described how several organizations have implemented voice-authentication biometrics to route calls properly, reducing customers’ frustration and the associated emotional burden on representatives.
In closing, Waytz suggested three questions to be addressed by future research into human–machine interactions. First, how does automation affect trust in other humans? For example, preliminary data indicate that when people are concerned about automation, they become more negative toward immigrants, relating automation and immigration with respect to their perceived effect on jobs. Second, does use of technology increase or decrease empathy? And finally, can robots be effective corporate whistle-blowers?
Victoria Stodden, University of Illinois at Urbana-Champaign, addressed trust in the computational aspects of research findings. She began by pointing out that for the last several years, an important trust-related discussion about the reproducibility of research results has been taking place in the academic community.
Stodden remarked that, along with the issue of reproducibility, several other factors can engender distrust of scientific findings and the scientific process, including excessive use of scientific jargon, the impression that discoveries do not hold over time, and the specific narratives used by the scientific community to relay research findings to the public. She stressed that all of these factors demonstrate the importance of communication in establishing trust in research findings.
12 Veeramachaneni, K., Arnaldo, I., Cuesta-Infante, A., Korrapati, V., Bassias, C., and Li, K. (2016). AI2: Training a big data machine to defend. 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS), 49–54. Available: https://people.csail.mit.edu/kalyan/AI2_Paper.pdf [April 2018].
Focusing on reproducibility, Stodden explained that it has three components that are quite distinct from each other: empirical, statistical, and computational. Empirical reproducibility, she said, refers to the context of traditional noncomputational experiments. She described the efforts of two collaborating research groups to ensure that the cell output signatures from their labs were indistinguishable. It took the groups 2 years to discover the cause of the different signatures of the output from the biological pipeline in each lab.13 Stodden defined statistical reproducibility as the proper application of statistical methods and models to data. She noted that a misapplication of statistics undermines the reliability of the inferences drawn. Finally, Stodden explained that computational reproducibility refers to transparency in computations, that is, transparency in the data, code, workflows, and other information that describe or implement the computational steps used in the discovery pipeline.
Stodden then elaborated on computational reproducibility, explaining that extraordinary complexity can be introduced when computational methods are used to draw inferences from data. To trust the scientific findings derived with computational methods, she said, people need to understand the methods, including their computational implementation. She suggested that heavily computational techniques may call for new branches of the scientific method,14,15 and she stated that computational reproducibility and the resulting trust in computational data will not be firmly established “until we have standards for dissemination, transparency, and verifiability for the computational results, as we do for deductive and empirical research [the first two branches of the scientific method].”
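As a concrete (and hypothetical) illustration of what such standards for transparency and verifiability might require in practice, a published result could ship with a small manifest that fingerprints its data and code and records the computational environment. The manifest format and file names below are assumptions for the sketch:

```python
import hashlib
import platform
import sys
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Content hash: a stable fingerprint for a digital artifact."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(artifacts: list[Path]) -> dict:
    """Record artifact hashes plus the computational environment, so
    the inputs behind a result can be verified later."""
    return {
        "artifacts": {str(p): sha256_of(p) for p in artifacts},
        "python": sys.version,
        "platform": platform.platform(),
    }

# Hypothetical pipeline inputs; in practice these would be the data,
# code, and workflow files behind a published figure or table, e.g.:
# manifest = build_manifest([Path("data.csv"), Path("analysis.py")])
```

Anyone re-running the pipeline can rebuild the manifest and compare hashes; a mismatch means they are not working from the artifacts behind the published claim. Stored in an open repository under a permanent identifier, such a record is one small piece of the verifiable "inference pipeline" Stodden calls for.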
Stodden then outlined several goals for research on issues of trust, verifiability, and transparency surrounding the computational aspects of research findings. She clarified that these ideas do not apply just to research using large amounts of data from social media but also to any research with a computational element. Pertinent information should be readily available to other researchers, she argued, "so that the whole inference pipeline is actually made clear." Research data, software, workflows, and computational details can be stored and shared through open trusted repositories, she elaborated. She also asserted that identifiers for data, code, and digital artifacts in published articles should be permanent so the information that supports scientific claims will continue to be discoverable over time. She suggested further that if citations of original data sources became standard practice, researchers' data collection efforts could be rewarded, and a culture could develop "where there are greater incentives for sharing artifacts that are associated with trust and verifiability." Additionally, she argued that digital scholarly artifacts should be adequately documented, although she acknowledged that this poses a challenge, particularly in terms of establishing and disseminating clear documentation rules. Finally, she suggested that journals should conduct reproducibility checks as part of the publication process, stating that since journals are the "gateway to a scholarly record," they can play an important role in verifying digital artifacts.16
13 Hines, W.C., Su, Y., Kuhn, I., Polyak, K., and Bissell, M.J. (2014). Sorting out the FACS: A devil in the details. Cell Reports, 6(5), 779–781.
14 The traditional two branches of the scientific method, Stodden explained, are deductive (mathematics and formal logic) and empirical (statistical analysis of controlled experiments). The potential new branches she cited are computational, involving large-scale simulations and data-driven computational science.
15 Donoho, D.L., Maleki, A., and Montanari, A. (2009). Message-passing algorithms for compressed sensing. Proceedings of the National Academy of Sciences of the United States of America, 106(45), 18914–18919.
Stodden closed by emphasizing that implementation of these ideas for enhancing reproducibility represents a difficult and ongoing task, and she argued that funding agencies should support research in these areas to facilitate implementation.
Following the presentations summarized above, panelists participated in a discussion and responded to audience questions. One topic of the discussion was the public availability of research datasets. Jeremy Wolfe, Harvard Medical School, asked Stodden whether journals should actually try to reproduce computational results. Stodden replied that some journals already do so, and she suggested the need to develop infrastructure that would enable the verification process to occur automatically, drawing on research data and code located in public repositories. Sallie Keller, Virginia Polytechnic Institute and State University, noted that such infrastructure could also be useful for the IC. Mayer raised a major concern with respect to the public availability of research data: that others could easily access and publish data that other groups had put tremendous effort into collecting, effectively disincentivizing researchers from conducting large, laborious data collections. Stodden acknowledged the validity of Mayer’s point, stating her belief that this issue will persist until the expectation to cite data is part of the reward structure “because we do have to have recognition of work embedded in what we are doing.”
16 For more information, see Stodden, V., McNutt, M., Bailey, D.H., Deelman, E., Gil, Y., Hanson, B., Heroux, M.A., Ioannidis, J.P.A., and Taufer, M. (2016). Enhancing reproducibility for computational methods. Science, 354(6317), 1240–1241. Available: science.sciencemag.org/content/354/6317/1240 [April 2018].
Another topic of discussion was trust, with emphasis on its complexity. Peter Pirolli, Institute for Human & Machine Cognition, asked about the important perceptual and cognitive mechanisms involved in trust. The panelists agreed that many mechanisms are indeed involved, and the way in which these various mechanisms interact to determine whether a person trusts someone or something is an area of active research. Mayer stressed that the three components of trustworthiness he had discussed (ability, benevolence, and integrity) contribute in an important way to trust, and that perceptual and cognitive mechanisms would be involved in assessing another party's trustworthiness.
Continuing on the topic of trust, Thomas Fingar, Stanford University, asked how various elements of trust (trust between individuals due to membership in the same organization; trust due to reputation; trust due to professional ethos; and trust in the quality, transparency, and objectivity of data) could be integrated, prioritized, and sequenced to improve trust in the IC and the quality of intelligence support. The panelists voiced a number of ideas in response, including the importance of first relying on the default propensity to trust, leaving room for gut reactions and judgment calls, and educating people on the benefits of trusting others.
Wolfe asked for the panelists’ thoughts on the types of work on trust that may turn out to be transformational in a decade. Ideas suggested in response included research on enhancing trust across intergroup boundaries; enhancing public trust in such institutions as banks, the media, the government, and the IC; and increasing trust in such experts as scientists, doctors, and lawyers. Dunning argued that future research should address gullibility, or the issue of distinguishing valid from invalid expertise. Stodden emphasized the need for greater transparency in computational results, suggesting the establishment of a computable scholarly record with which users could independently verify research conclusions. Mayer mentioned the importance of understanding the basis for how people come to trust new technology, such as self-driving cars.
To close the discussion period, Keller asked the panel to clarify the research process that could be used to enhance or to rebuild trust in organizations. Dunning stressed the need to communicate using terminology appropriate to the audience. Waytz suggested studying the way people and organizations recover from breaches of trust and restore their reputations. And Mayer highlighted the methodological approach of first determining how people make attributions and how those attributions affect the malleability of their trust in others—starting with building theory, then conducting laboratory studies, and finally carrying out field studies.