
8

Identification and Mitigation of Bias in Human-AI Teams

HUMAN BIASES

Decision bias, in the current context, is a preference toward certain information or options that is considered to be “irrational.” Bias is created through systematic error introduced by selecting or encouraging one outcome or answer over others. In the decision sciences, the concept of bias is related to the concept of rationality, understood as the principle of maximization under subjective expected utility theory (Einhorn and Hogarth, 1981). The value of subjective expected utility theory as a principle of human decision making has been criticized from its inception. Notably, an early critique by Simon (1955, 1957) described humans as approximately rational (i.e., boundedly rational) rather than rational. A substantial body of research on human heuristics and biases followed, which has gained significant popularity over the past several decades (Kahneman, Slovic, and Tversky, 1982; Tversky and Kahneman, 1974). This research has resulted in an increased understanding of several well-known human decision biases, including anchoring, confirmation bias, framing effects, and availability. Since then, the list of documented human biases has grown. Most commonly, these biases describe gaps between human decisions and rational decisions, rather than explaining how humans actually make decisions (Gonzalez, 2017; Klein, 1993; Lipshitz, 1987). It is often assumed that the introduction of AI will reduce or eliminate human decision bias; however, this has not yet been shown to be the case in complex real-world settings. While there are important evolutionary reasons for many of these human biases, most notably their ability to reduce cognitive load and allow rapid decision making, these benefits do not necessarily carry over to AI systems, which do not suffer from the same attention or processing limitations as humans.

AI BIASES

AI also suffers from biases, which occur when a computer algorithm makes prejudiced decisions based on limited training data (West, Whittaker, and Crawford, 2019). AI bias can also result from certain features of the algorithm itself. The most common form of AI bias arises when the data used to train an AI algorithm carry systematic deviations from a norm (e.g., fairness), whether from the inherent frequencies of examples in AI training sets or from a lack of representativeness of the data. For example, algorithms trained on flawed data can lead to serious discrimination in the selection of job candidates, or in police actions based on race (Daugherty and Wilson, 2018). In the committee’s judgment, these biases may often be hidden.


Humans can introduce multiple sources of subjectivity and bias into the design of human-AI teams (Cummings and Li, 2021b), which include (1) bias from inappropriate data curation; (2) bias in the design of one or more algorithms; and (3) bias in the interpretation of the resulting algorithms. Regarding data curation, it is well established that bias can be inadvertently introduced into an AI system through underlying data sample selection bias (Gianfrancesco et al., 2018; Samimi, Mohammadian, and Kawamura, 2010). However, there is substantially less research on how the actual curation of a dataset affects outcomes, and it is still not well understood how problems in data labeling affect algorithm brittleness. For example, inherent subjectivity in emotion labeling can make any resulting models suspect (Cowie et al., 2011).
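As a minimal illustration of one such curation check (not a method from the cited studies), the following sketch compares the composition of a curated training sample against an assumed reference population to surface selection bias before any modeling; the group names, proportions, and flag threshold are illustrative assumptions.

```python
# Sketch: a basic curation audit comparing the composition of a curated
# training sample against a reference (deployment) population, to surface
# selection bias before modeling. Group names, the reference proportions,
# and the 5-point flag threshold are illustrative assumptions.
from collections import Counter

reference = {"group_a": 0.50, "group_b": 0.30, "group_c": 0.20}   # assumed deployment mix
train_groups = ["group_a"] * 700 + ["group_b"] * 280 + ["group_c"] * 20  # curated sample

counts = Counter(train_groups)
n = len(train_groups)
for group, expected in reference.items():
    observed = counts.get(group, 0) / n
    gap = observed - expected
    flag = "  <-- under-represented" if gap < -0.05 else ""
    print(f"{group}: {observed:.1%} of training data vs. {expected:.1%} expected{flag}")
```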

Errors made in actual data labeling, either by humans or machine-based labeling systems, are even more problematic. One study, looking at 10 commonly used computer vision, natural language, and audio datasets, found a 3.4 percent average error rate across all datasets (Northcutt, Athalye, and Mueller, 2021). Data-labeling errors affect overall classification outcomes and can be pervasive in commercial language models and computer vision systems, which can form elements of systems used by the DOD. Ongoing research seeks to identify and correct bias in machine-learning (ML) datasets (Lee, Resnick, and Barton, 2019), but significantly more work is needed in this area.
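As a simplified sketch of the general idea behind confidence-based label-error detection (not the specific method used in the study cited above), the code below flags training examples whose given label receives low out-of-fold predicted probability; the synthetic data, injected noise rate, model, and threshold are illustrative assumptions.

```python
# Sketch: flag potentially mislabeled examples by comparing each example's
# given label against out-of-fold predicted probabilities, the general idea
# behind confidence-based label-error detection. The synthetic data, the
# injected ~3 percent label noise, and the 0.2 threshold are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
y_noisy = y.copy()
flip = np.random.default_rng(0).choice(len(y), size=30, replace=False)
y_noisy[flip] = 1 - y_noisy[flip]            # inject ~3 percent label errors

# Out-of-fold probabilities: each example is scored by a model that never saw
# it during training, so its own (possibly wrong) label cannot explain the score.
proba = cross_val_predict(LogisticRegression(max_iter=1000), X, y_noisy,
                          cv=5, method="predict_proba")

p_given_label = proba[np.arange(len(y_noisy)), y_noisy]
suspects = np.where(p_given_label < 0.2)[0]   # low-confidence labels
print(f"{len(suspects)} examples flagged for manual label review")
```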

In addition to data curation, significant bias can be introduced into an AI system when the designer subjectively selects an AI algorithm and the associated parameters for an application. One recent study illustrated that there were at least 10 significant subjective decisions made by designers of ML algorithms that could impact the overall quality of the algorithms (Cummings and Li, 2021b). The committee finds that there are currently no standards or accepted practices for how such points of bias and subjectivity could or should be evaluated or mitigated.
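As a hedged illustration of how a single, seemingly minor designer choice can shift outcomes, the sketch below evaluates one trained model under two defensible decision thresholds; the data, model, class imbalance, and thresholds are illustrative assumptions rather than the specific decisions catalogued in the cited study.

```python
# Sketch: one trained model evaluated under two defensible decision
# thresholds chosen by a designer. Both choices are "reasonable," yet they
# trade false negatives against false positives very differently. The data,
# model, class imbalance, and thresholds are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, weights=[0.8, 0.2],
                           random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=3)
proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

for threshold in (0.5, 0.2):                  # two subjective design choices
    pred = (proba >= threshold).astype(int)
    fn_rate = np.sum((pred == 0) & (y_te == 1)) / np.sum(y_te == 1)
    fp_rate = np.sum((pred == 1) & (y_te == 0)) / np.sum(y_te == 0)
    print(f"threshold {threshold}: false-negative rate {fn_rate:.3f}, "
          f"false-positive rate {fp_rate:.3f}")
```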

The third major source of bias arises because interpreting the results of ML-based probabilistic models requires reasoning about complex statistics, a known and well-documented point of weakness, even for experts (Tversky and Kahneman, 1974). Research efforts have recently attempted to make outputs more explainable (Chandler, 2020) or interpretable through sensitivity analyses such as counterfactual explanations (Fernández-Loría, Provost, and Han, 2020). However, most of these efforts attempt to explain or improve interpretability for experts and developers of these algorithms, and significantly less effort is aimed at helping users of AI systems understand their results (see Chapter 5).
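The sketch below gives a minimal, hedged example of a counterfactual explanation in this spirit: for one input, it searches for the smallest single-feature change that flips the model's recommendation. The data, model, and search grid are illustrative assumptions, not the cited authors' algorithm.

```python
# Sketch: a minimal counterfactual explanation for a single prediction.
# For one input, find the smallest change to any one feature that flips the
# model's decision, answering "what would have to change for the
# recommendation to differ?" Data, model, and grid are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=5, n_informative=5,
                           n_redundant=0, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X, y)

x0 = X[0].copy()
original_class = model.predict(x0.reshape(1, -1))[0]

best = None                                    # (|delta|, feature, delta)
for feature in range(X.shape[1]):
    for delta in sorted(np.linspace(-3.0, 3.0, 121), key=abs):  # smallest first
        x_cf = x0.copy()
        x_cf[feature] += delta
        if model.predict(x_cf.reshape(1, -1))[0] != original_class:
            if best is None or abs(delta) < best[0]:
                best = (abs(delta), feature, delta)
            break                              # smallest flip for this feature found

if best:
    print(f"Decision flips if feature {best[1]} changes by {best[2]:+.2f}")
else:
    print("No single-feature counterfactual found within the search range")
```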

Further, bias may result when the training data is not representative of the situations in which the AI system will be applied. For example, if an AI system is trained on situations found in one environment, it will be biased in its recommendations when applied to a different type of environment. An AI system trained on the military tactics of one adversary would do poorly when directed at a different adversary because it is biased toward its training data. In this sense, bias can be thought of as resulting from over-generalization of an AI system beyond what was represented in its training.
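One way to guard against such over-generalization is to flag inputs that fall outside the envelope of the training data before a recommendation is issued. The sketch below is a minimal illustration under strong assumptions (a simple per-feature z-score envelope and synthetic data); fielded systems would need stronger out-of-distribution detection.

```python
# Sketch: a simple guard against applying a model outside the conditions it
# was trained on. Inputs whose features fall far from the training
# distribution are flagged instead of being silently scored. The synthetic
# "environments" and the 3-sigma threshold are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(1000, 4))   # training environment
X_new = rng.normal(loc=3.0, scale=1.0, size=(5, 4))        # a different environment

mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)

def in_training_envelope(x, z_max=3.0):
    """True if every feature lies within z_max standard deviations of the
    training mean; otherwise the input is treated as out of scope."""
    return bool(np.all(np.abs((x - mu) / sigma) < z_max))

for i, x in enumerate(X_new):
    status = "ok" if in_training_envelope(x) else \
        "outside training envelope; withhold recommendation"
    print(f"input {i}: {status}")
```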

The committee finds that the importance and impact of AI bias cannot be overstated, especially for users of time-pressured systems, a hallmark of military systems. Users may be completely unaware of potentially flawed assumptions and biases that could call into question the results presented by AI systems, and they typically have no way to understand the practical confidence intervals of AI-based recommendations. This challenge is also noteworthy because it affects certification efforts. For example, if external system evaluators (non-creators) cannot understand how systems develop solutions and execute operations, or what their possible failure modes are, those evaluators cannot develop appropriate confidence that the AI systems can meet the specified requirements.
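As one minimal sketch of what a practical uncertainty statement could look like, the code below bootstraps a confidence interval around a model's held-out accuracy; the data and model are illustrative assumptions, and an operational system would require far richer uncertainty reporting than a single interval.

```python
# Sketch: a bootstrap confidence interval on a model's held-out accuracy,
# one simple way to convey to users how much (or how little) to read into
# an AI system's headline performance number. Data and model are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1500, n_features=12, random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, random_state=5)
correct = (LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict(X_te) == y_te)

rng = np.random.default_rng(5)
boot = [rng.choice(correct, size=len(correct), replace=True).mean()
        for _ in range(2000)]
low, high = np.percentile(boot, [2.5, 97.5])
print(f"accuracy {correct.mean():.3f}, 95% bootstrap interval [{low:.3f}, {high:.3f}]")
```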

Given the increased use of AI in many societal applications, including policing, legal decision making, social benefits, and hiring, among others, the committee finds that the interdependencies between human and AI bias are a major concern. In particular, in multi-domain operations (MDO), the impact of AI biases may be substantial, given the variety of novel situations that may be encountered.


HUMAN-AI TEAM BIAS

Although it is often assumed that humans can oversee an AI system and correct its errors, providing independent checks on the system, this has been shown to be untrue; human decision making can be directly affected by the accuracy of the AI system, creating a human-AI team bias. Kibbe and McDowell (1995) found that when image analysts were provided with recommendations from an automated target recognition system, the combination rarely resulted in improved performance over either the human or the automated system working alone. Similarly, Metzger and Parasuraman (2005) found that air traffic controllers performed better on their own than with an imperfect conflict-detection system. Further, when AI systems are wrong, their human partners are much more likely (30 to 60 percent) to make errors than when they receive no advice from the AI system (Layton, Smith, and McCoy, 1994; Olson and Sarter, 1999; Sarter and Schroeder, 2001). This has also been called concept drift (Widmer and Kubat, 1996). Similarly, when automation is used to cue important information in a visual scene, users are more likely to choose a cued target, even if that target is incorrect, and to miss uncued targets (Yeh and Wickens, 2001; Yeh, Wickens, and Seagull, 1999). Selcon (1990) also showed that, when the confidence levels associated with multiple options considered by an AI system are similar, human decision making is significantly slowed.

This body of research shows that people will often anchor on the recommendation of the AI system, and then gather information to agree or disagree with it. The AI system therefore provides direct input into human decision making, increasing the risk of human error when the system is wrong, akin to confirmation bias. The time taken for the human to make that assessment can, in some cases, be significant. Rather than operating in a parallel fashion with AI systems, as independent decision makers (which would increase system reliability), humans actually operate in a serial manner with AI systems, taking their inputs into account along with data gathered independently, reducing reliability and overall human-AI team performance (Endsley and Jones, 2012).
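A toy simulation can make the reliability argument concrete. The sketch below contrasts an idealized independent (parallel) human cross-check with an anchored (serial) one; the accuracies, the anchoring rate, and the assumption that disagreements in the parallel case are always resolved correctly are all illustrative assumptions chosen only to show the qualitative effect described above.

```python
# Sketch: a toy Monte Carlo contrasting an idealized independent (parallel)
# human cross-check with an "anchored" (serial) human who often adopts the
# AI recommendation. Accuracies and anchoring rate are assumptions; the
# parallel case idealistically assumes disagreements are resolved correctly.
import numpy as np

rng = np.random.default_rng(42)
N = 100_000
p_ai, p_human, p_anchor = 0.85, 0.80, 0.70    # assumed accuracies / anchoring rate

ai_correct = rng.random(N) < p_ai
human_correct = rng.random(N) < p_human       # the human judging independently
anchored = rng.random(N) < p_anchor           # trials where the human adopts the AI answer

# Parallel (independent) check: the team errs only if BOTH are wrong.
parallel_team = ai_correct | human_correct

# Serial (anchored) check: when anchored, the team inherits the AI's answer;
# otherwise the human's independent judgment decides.
serial_team = np.where(anchored, ai_correct, human_correct)

print(f"AI alone:              {ai_correct.mean():.3f}")
print(f"Parallel, independent: {parallel_team.mean():.3f}")
print(f"Serial, anchored:      {serial_team.mean():.3f}")
```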

Further, the impact of AI on human performance can depend on the way an AI system’s recommendations are presented. For example, Endsley and Kiris (1994) examined methods of presenting an AI system’s confidence in recommendations, including digital percentages, analog bars, rankings, and categories such as high, medium, and low. They found that performance was not significantly improved by AI advice, even for novices, and decision time increased for most methods of presentation. Decision times were slightly faster when categorical presentation was used by the AI system compared to when no information was provided by the system. More recently, Friesen et al. (2021) compared alternative advisory system displays for safe-path planning in a helicopter flight route planning application. They found that, when the advisory system generated a specific flight path, pilots tended to follow it even when there were better trajectories available that would save fuel and time. In contrast, when the advisory system used constraint-based displays showing multiple path options, pilots were more likely to select an optimal route. Framing effects have also been noted (Banbury et al., 1998). While these examples show the importance of system transparency (see Chapter 5), they also demonstrate the subtleties involved in combining human and AI system decision processes. The number of decision options generated, the agent (human or AI) generating the decision options (Endsley and Kaber, 1999), the order of exchange of decisions (human first or AI first) (Layton, Smith, and McCoy, 1994), and the format or framing of decisions have all been found to have a significant effect on decision quality. These effects can be quite insidious, as they may not be apparent to either the decision maker or the system designer.

Thus, human biases present in the selection and development of AI training datasets, or in the development of AI algorithms, can create AI biases. AI biases can in turn lead to human decision-making biases when the AI system is incorrect or uncertain, and those decision-making biases can negatively affect human performance. The committee finds that the interactive effects of bias in the human-AI team may often be subtle, occurring below conscious awareness, but can lead to poor decision outcomes with potential ill effects, such as increased collateral damage, fratricide, or damage from adversarial attacks. Further, human-AI teams may be subject to common team-based biases, such as information pooling or groupthink, that could negatively affect performance. While it is logical that people need to gather information to check the output of an AI system, the lack of independence between human and AI decision processes means that people may be inadequate at performing this important cross-check function, demonstrating the interdependent effects that biases can create.


KEY CHALLENGES AND RESEARCH GAPS

The committee finds that five key research gaps exist with respect to the potential for both human and AI biases to negatively affect performance. More information is needed in the following areas:

  • Improved understanding of the interdependencies between human and AI biases;
  • Examination of the potential for adversarial attacks on human and AI biases, and detection and mitigation of these effects;
  • Determination of human-AI biases that emerge from AI learning based on small and sparse datasets;
  • Development of adaptive and personalized AI models that can predict human biases and respond appropriately; and
  • Preventative detection of emergent human and AI biases within the context of online, continuously evolving learning systems.

RESEARCH NEEDS

The committee recommends addressing five related research objectives to reduce bias in human-AI teams.

Research Objective 8-1: Human-AI Partnerships in Continuous-Learning Environments.

AI and human biases can feed into each other. This interconnectedness of heterogeneous and autonomous AI systems with humans who continuously learn and adapt their behaviors can generate emergent behaviors that are difficult to predict and may result in catastrophic effects (Ramchurn, Stein, and Jennings, 2021). Research is needed to determine the effects of AI biases on human biases, and of human biases on AI biases, to ensure that human-AI interdependencies are understood and their outcomes are ethical, appropriate, and safe. Research is also needed to determine how human-AI interdependencies will evolve with continuous interaction, so that biases can be prevented and situation awareness of teammates’ biases can be maintained (see Chapter 4). It would be advantageous for research to determine appropriate regulation of, and accountability for, these interdependencies in human-AI partnerships.

The interdependence of biases between humans and AI systems needs to be studied in cooperative as well as adversarial settings. Open AI systems, and explanations that help humans identify AI anomalies, need to be investigated. Very little work exists that addresses conflicts within human-AI relationships, particularly in team settings (Lin and Kraus, 2010). For example, how much control should be given to a human to mediate the detection of AI biases when the human can also be biased?

Research Objective 8-2: Adversarial Effects on Human-AI Team Biases.

In multi-domain operations, human-AI team biases can develop within adversarial situations. In the context of cybersecurity teams (Buchler et al., 2018), for example, many human biases have been identified that make it difficult for humans to detect the intentions of an attacker (Cranford et al., 2021; Gutzwiller et al., 2018). Cyber criminals often exploit human biases to conduct phishing attacks and obtain credentials for accessing an organization’s systems (Rajivan and Gonzalez, 2018; Singh et al., 2019). Furthermore, cyber criminals may also attack by exploiting AI biases; adversarial machine learning research has identified many weaknesses of AI algorithms that can be easily exploited by an adversary (Harding et al., 2018). New research needs to investigate potential biases in multi-domain operations and the weaknesses these biases represent for defense. Research is greatly needed to prevent enemies from gaining advantage via human-AI biases and to determine how defenders can exploit such biases for cyber defense (Gonzalez et al., 2020). Machine-learning and AI-bias research is needed to prevent attacks on AI systems that take advantage of AI biases. Multi-domain operations research would be well served by adopting an adaptive approach to overcoming biases, as in recent advancements in adaptive cyber-defense methods (Gonzalez et al., 2020; Marriott et al., 2021).

Research Objective 8-3: Biases from Small Datasets and Sparse Data.

In many human-AI teams, important decision making often resides with the human, while information gathering and analysis is the job of the automation (Blaha et al., 2019; Tambe, 2011; see Gonzalez et al. (2014) for a discussion of decision making in cybersecurity teams). However, automation (specifically AI and ML) and similar data-driven technologies can be significantly affected by the quantity and quality of the data the systems are trained on (Ramchurn, Stein, and Jennings, 2021). Appropriately representative datasets may be limited in many multi-domain operations applications.

Work on adversarial ML demonstrates a major weakness of ML algorithms: they are vulnerable to simple visual perturbations introduced by an adversary (Goodfellow, Shlens, and Szegedy, 2015; Papernot et al., 2016). While ML algorithms are created to shield humans from overwhelming amounts of data that they could not successfully process on their own, humans can often overcome this security weakness simply by visual inspection, correctly identifying inputs that fool the ML algorithm (Harding et al., 2018). This creates an ironic situation in human-AI teams: the AI system that is created to strengthen security may actually weaken it. Because ML relies on data, the performance of ML systems depends on the realism and correctness of the data and on how the systems are maintained. In the committee’s judgment, much research exists on methods for dealing with small datasets and sparse data, but the bias problem emerging from these systems has not been addressed. Research is required to address the resultant human-AI biases that emerge from AI learning based on small and sparse datasets.
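As a self-contained sketch of the kind of perturbation at issue, the code below applies the fast gradient sign method of Goodfellow, Shlens, and Szegedy (2015) to a simple logistic-regression classifier; the weights, input, and perturbation budget are illustrative assumptions rather than a model from any cited study.

```python
# Sketch: the fast gradient sign method applied to a simple logistic-
# regression classifier, showing how a small perturbation (relative to the
# scale of the input features) can flip a model's decision. The weights,
# input, and perturbation budget (epsilon) are illustrative assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
w = rng.normal(size=64)          # stand-in for trained weights
b = 0.0
x = 0.03 * w                     # an input the model classifies as class 1
y = 1.0                          # its true label

# Gradient of the logistic loss with respect to the INPUT (not the weights):
# d/dx [-log p(y|x)] = (sigmoid(w.x + b) - y) * w
grad_x = (sigmoid(w @ x + b) - y) * w

epsilon = 0.2                    # perturbation budget (assumption)
x_adv = x + epsilon * np.sign(grad_x)   # move each feature slightly in the
                                        # direction that increases the loss

print("clean prediction:      ", int(sigmoid(w @ x + b) > 0.5))
print("adversarial prediction:", int(sigmoid(w @ x_adv + b) > 0.5))
```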

Research Objective 8-4: Inductive and Emerging Human Biases.

A number of consistent deviations from rational behavior have been identified using laboratory experiments with simple prospects described in terms of probabilities and outcomes (Kahneman and Tversky, 1979). However, currently, the large collection of human cognitive biases cannot all be explained by one comprehensive theory and, most importantly, it is unknown how biases develop over time or how they initially emerge (Gonzalez, 2017). As a result, little is known about how to overcome human biases.

Recently, a substantial amount of work has been dedicated to the development of models and approaches for predicting human decisions (Erev et al., 2010; Gonzalez and Dutt, 2011). These models have been extended to team and group work (Gonzalez et al., 2015), but it is unclear how they would generalize to the particularities of human-AI team interdependencies. To effectively capture human-AI biases in multi-domain operations, AI algorithms must be aware of the human’s preferences and constraints (Ramchurn, Stein, and Jennings, 2021). Furthermore, it would be useful for such models to be able to trace human preferences and biases dynamically and to customize and personalize AI responses according to the predicted levels of bias. Such an adaptive and personalized approach is being investigated in the context of cybersecurity (Cranford et al., 2020; Ferguson-Walter, Fugate, and Wang, 2020; Gonzalez et al., 2020; Gutzwiller et al., 2018). The investigation of adaptive and personalized models that can predict human biases needs to be extended to other aspects of multi-domain operations.
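As a heavily simplified, hedged sketch of an experience-based choice predictor in the spirit of the models cited above (not the published instance-based learning model), the code below estimates each option's value as a recency-weighted average of its observed outcomes and predicts the option with the higher value; the decay rate and payoff probabilities are illustrative assumptions.

```python
# Sketch: a drastically simplified experience-based choice predictor.
# Each option's value is a recency-weighted average of the outcomes
# observed for it so far; the model predicts the option with the higher
# value. Decay rate and payoff probabilities are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(11)
DECAY = 0.8                      # recency weighting (assumption)
history = {"option_a": [], "option_b": []}

def value(outcomes):
    """Recency-weighted average: recent outcomes count more than old ones."""
    if not outcomes:
        return 0.0
    weights = np.array([DECAY ** k for k in range(len(outcomes))][::-1])
    return float(np.dot(weights, outcomes) / weights.sum())

def predict_choice():
    return max(history, key=lambda opt: value(history[opt]))

# Simulate a few rounds of experience: option_a pays off more often.
for _ in range(20):
    choice = predict_choice() if rng.random() > 0.2 else rng.choice(list(history))
    payoff = rng.binomial(1, 0.7 if choice == "option_a" else 0.4)
    history[choice].append(payoff)

print("predicted next choice:", predict_choice())
```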

Research Objective 8-5: Preventative Detection and Mitigation of Human-AI Team Biases in Learning Systems.

Preventative detection of emergent human and AI biases needs to be studied within the context of continuously evolving learning systems. Identification and detection of AI biases are often difficult before a system is deployed (Slack et al., 2020). Current techniques are limited to explaining biases after they have emerged, rather than preventing AI biases from emerging in the first place (Gilpin et al., 2018; Ramchurn, Stein, and Jennings, 2021). More research is needed to detect, prevent, and/or mitigate potential AI biases before an AI system is deployed, and research is also needed to test AI systems against attempted adversarial exploitation. Further, methods are needed to discover, measure, and test bias in human-AI teams. It is unknown how human decision biases affect data curation, how this can be evaluated, and what can be done to mitigate such biases. It is also unclear how to overcome implicit human biases, given the limited research on how such biases emerge. Cognitive models of learning can help identify and prevent human biases (Cranford et al., 2020, 2021), but more research targeting the identification and prevention of human biases is required. It is also important to build on research on antifragility in teams to determine how individual biases influence team biases (Taleb, 2012).
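As a narrow, hedged example of a pre-deployment bias check, the sketch below compares a model's recommendation rate and error rate across two groups on held-out data and withholds deployment if the gap exceeds a threshold; the synthetic data, group labels, and threshold are illustrative assumptions, and a real audit would examine many more metrics and conditions.

```python
# Sketch: one narrow pre-deployment check, comparing a model's positive-
# recommendation rate and error rate across two groups on held-out data.
# Group labels, data, and the acceptable-gap threshold are illustrative
# assumptions; a real audit would cover many more metrics and conditions.
import numpy as np

rng = np.random.default_rng(7)
n = 2000
group = rng.integers(0, 2, size=n)                 # 0 / 1: a protected attribute
y_true = rng.integers(0, 2, size=n)                # ground-truth outcomes
# A deliberately skewed "model" that favors group 1 when uncertain:
y_pred = np.where(rng.random(n) < 0.8, y_true, group)

for g in (0, 1):
    mask = group == g
    sel_rate = y_pred[mask].mean()
    err_rate = (y_pred[mask] != y_true[mask]).mean()
    print(f"group {g}: selection rate {sel_rate:.2f}, error rate {err_rate:.2f}")

gap = abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())
if gap > 0.05:                                     # threshold is an assumption
    print(f"demographic parity gap {gap:.2f} exceeds threshold; hold deployment")
```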

SUMMARY

Humans are subject to several well-known biases that can negatively affect their decision making. AI systems, far from being perfect, are also subject to a number of biases that may be hidden from the people who interact with them, and which can negatively affect an AI system’s performance and the performance of the combined human-AI team. Research is needed to better understand the interdependencies between human and AI biases, and to detect and prevent biases that impede effective performance in human-AI teams in multi-domain operations, particularly in the face of adversarial actions that may try to exploit them.
