Enhancing Team Performance
The enhancement of one person's performance can be viewed in a variety of ways, as is done in the other chapters of this book. In many situations, however (in sports, the military, and the workplace, to name but a few), people do not perform their tasks alone; rather, they do so in conjunction with other people performing parallel, similar, or complementary tasks. The performance of such an aggregate of individual performers is not merely the sum of the individual efforts but some more complex combination, and such aggregates are systems that must be studied in their own right. This chapter presents what is known about those aggregates, usually called groups, teams, or crews, and how to affect their performance. (Throughout this chapter, we use the terms group, team, and crew interchangeably.)
RESEARCH ON GROUPS: IN THE LABORATORY AND ON THE JOB
How teams and groups perform and accomplish their goals and tasks has been studied by a variety of investigators from a number of disciplines. Two of the more prominent traditions of study have been those of experimental social psychology and of human factors.
Social psychology has dealt with groups primarily as theoretical entities. It has sought to elucidate rather general propositions about how groups function from the abstract manipulation of variables, often in experimental laboratory settings. In recent years social psychologists have focused on decision making, and the prototypical question has been how groups arrive at a consensus. The major model has been a jury, which is an unorganized group whose members possess similar information and who must arrive at a consensus. The variables most often investigated are group size, rules of procedure, and other structural features.
The field of human factors has dealt primarily with teams or crews, that is, the types of groups found in the workplace. The studies are often addressed to the task of optimizing performance in a particular setting or set of circumstances. The crews and teams under investigation are often studied on the job, and quite commonly, the teams are stratified and functionally differentiated. Such studies often focus on the division of labor among people working in a particular equipment design environment, as, for example, the layout of instruments and controls in an airplane cockpit occupied by several people.
These two traditions tend to parallel an important distinction in group research: the difference between laboratory and field research. Studies in the laboratory, favored by social psychologists, permit researchers to gain a greater amount of control over the factors under investigation. They allow random assignment of people to different experimental conditions and also permit the standardization, minimization, or the measurement of factors not currently under investigation but capable of spuriously affecting outcomes. But because laboratory studies examine “hot-house” variables under usually idealized circumstances, it is often difficult to apply them in any particular case.
Field studies, in contrast, examine groups in their natural environments. While some control of the factors affecting a crew is possible, field studies often must contend with many complex factors. Because randomization of team members is not usually an option, many of the factors are covaried (i.e., related to each other) and often not even identified. Field studies often give good answers about specific questions concerning the composition and work of particular teams, but because of the specific circumstances under which they are conducted, they are difficult to cumulate into general principles for group performance.
It seems clear that the perspectives of both traditions, experimental social psychology and human factors, are necessary for addressing the problems faced by those who must design efficient and comprehensive training programs for activities involving group tasks. General principles and applicable specifics are both needed. This chapter attempts such an integration by examining two aspects of the missions of groups or teams: decision making and performance. Both of these aspects can be studied in the laboratory and in the field, but they pose different sets of problems for researchers.
Although the two processes of decision making and performance are intertwined in the functioning of a group, they are often at least conceptually separable. The decision mechanism of a team may be fixed, as in a military unit, but its performance mechanism still needs to be implemented. For example, a football team may have a mechanism for deciding which play to call. The coach or the quarterback decides: if the coach decides, he signals the quarterback, who informs the team in the huddle. The performance, how well the team runs the play, is still problematic. Alternatively, a team or group may be able to coordinate its efforts, but not have a mechanism for selecting and deciding on its strategy and tactics. It is not known what the best training regimen is for a group that is faced with both decisional and performance requirements; there is a strong possibility that the training procedures that are optimal for teaching groups to make decisions are not the same procedures that are optimal for producing enhanced performance.
With these distinctions between decision making and performance in mind, we now turn to an examination of what is known of the differences between group and individual performances. The object is to ascertain whether there are recognized general procedures that can improve the task-oriented performance of work groups. The chapter is divided into three major sections. Following a brief overview of issues of team performance, we review selected studies designed to evaluate the performance of teams in laboratory settings. A third section highlights some problems involved in moving from the experimental laboratory to field situations in which teams perform under pressure: three case studies are presented, together with the questions for training and research that each of them suggests.
INDIVIDUALS AND TEAMS
The primary functions of many organizations are performed not just by individuals acting alone, but by small teams of two to perhaps a dozen individuals. Members of such task-oriented teams are sometimes highly coordinated (e.g., tank crews, unit staffs) and sometimes largely unorganized (e.g., trouble-shooting groups, research teams). The skill requirements of a group can range from simple perceptual-motor skills to abstract reasoning, and various combinations occur. In short, teams are as diverse as the tasks that confront them and the environments in which they operate. Problems of team performance and the means of solving those problems thus vary considerably, and they are specific to the task and work environment. Nonetheless, it is useful to note some very general issues and problems associated with team performance and team-oriented research, many of which have no precise counterparts in individual performance. The general focus is on interaction processes, organizational structures, and operating procedures that foster and sustain optimal task performance (or fail to do so) and the basis for training designed to maximize such performance.
In principle, a team appears to offer greater resources and processing power than an individual, but teams also involve interpersonal coordination and management problems that do not exist with individuals. Evaluating the tradeoff between these two resembles a cost-benefit analysis, although it is easy to overextend the analogy.
Individual performance can be evaluated according to a variety of standards, depending on the task. For example, performance can be axiomatically correct or optimal (an answer to an arithmetic problem is correct or a mile is run faster than ever before), empirically correct or workable (a piece of a jigsaw puzzle fits or an automobile runs), and so on. At the other end of a spectrum, intuition or the consensus of experts may be the only available standards by which a performance can be judged (the literary worth of a book).
Team performances can be evaluated by similar standards, but in addition, questions of interpersonal management costs and coordination efficiency have to be considered. In other words, the baseline against which to judge team performance must necessarily take into account the number of people involved, the time devoted to team effort, and the consumption of other resources relative to the level of performance achieved.
A highly compelling intuition from conventional wisdom, which also arises repeatedly in the research literature, is that small, task-oriented groups are, in general, “good.” Such intuitions probably arise from the simple observation that some tasks can only be performed by groups (e.g., lifting a large log), whereas others can be performed either by an individual or by a group (e.g., solving an arithmetic problem or recalling text). Moreover, when it is feasible to compare groups and individuals directly (e.g., solving a word puzzle with a unique solution), groups are on the average rarely inferior (e.g., see Davis, 1969; McGrath, 1984; Brown, 1988).
Other perspectives are possible, however, and lead to different conclusions. For example, team performance has often been observed to fall short of a reasonable theoretical baseline, relative to the task demands and resources invested (e.g., Steiner, 1972). An example of a baseline is individual performance, which assumes that a group will do at least as well as individuals performing the same task. Indeed, such comparisons of teams against a baseline are a major tool for evaluating team performance.
Although a shortfall may not be truly universal (i.e., not characteristic of all work environments of interest), team performance decrements have been observed in such a wide variety of task domains and performance environments that it seems prudent to regard suboptimal team performance as the norm. The critical ingredient in detection has been the development of a theoretical baseline relevant to the particular task environment. To some extent, the recognition of various team performance decrements resembles the discovery of decision-making biases in individuals, apparently due to faulty cognitive heuristics (e.g., Tversky and Kahneman, 1974, 1981; Kahneman and Tversky, 1982); these decision biases (e.g., the tendency to overestimate the probability of occurrence of familiar events) have largely been detected as individual performance departures from a reasonable theoretical baseline.
The available research, summarized in detail elsewhere, suggests that team performance is generally suboptimal (see Brown, 1988; Davis, 1969; Hastie, 1986; McGrath, 1984; Steiner, 1972). Before considering the remedies to be sought through the development of better operating procedures, interpersonal process management, and training, we consider briefly some examples of research from various task and performance environments.
Individual input can vary as a function of a variety of intra-individual variables, but here we are more concerned with extra-individual factors, especially those originating with other members of a team. The interpersonal processing of such input ultimately produces the team action: the aggregation, concatenation, combination, assembly, or other treatment of information (or other kinds of individual contributions) underlies the team output. We consider first individual input issues.
Audiences and coactors have been observed to influence strongly and systematically the quality and quantity of individual performances before them (see summaries by Geen, 1980, 1989; Zajonc, 1980; Borden, 1980). Even passive audiences can be responsible for arousal in a performer, which in turn generally facilitates performance of well-learned, routine behaviors and inhibits performance of poorly learned responses, including tasks that require processing of substantial information (Zajonc, 1965; Cohen and Davis, 1973). The latter finding does not encourage optimism for performance environments that require problem solving or the processing of information generally—especially those that might be necessary for coordinated team performance in emergencies and high-stress environments typical of the military, law enforcement, and many commercial settings.
If audiences actively respond, or can be interpreted as evaluative in character, the audience-coaction effects just noted are generally exacerbated. Obviously, the nature of the task and setting, as well as such background factors as the sex and culture of performers, are important in determining performance facilitation or inhibition. In somewhat more complex situations, the presence of others can be comforting, and interestingly enough, can even reduce anxiety and other reactions to stress, especially for subjects who are first-born and only children (e.g., Schachter, 1959; Wrightsman, 1959). Research on intergroup bargaining shows that the presence of constituents during bargaining leads female representatives to become increasingly cooperative through the course of the session (Druckman et al., 1972). Constituents' presence can either impede or facilitate negotiations depending on the culture of the bargaining representatives (Druckman et al., 1976). For example, negotiations over the distribution of resources among American representatives were prolonged, while the bargaining among Argentinean representatives was hastened, in the presence of an audience.
The literature cited above documents how the mere presence of others can influence individual member performance where task requirements range from simple perceptual-motor skills to abstract reasoning (although studies of the former are more common). Such effects run a time course and may abate with training and experience. Unfortunately, relatively little research addresses these latter issues, although anecdotes from performers and others support the importance of adaptation through experience.
When several persons act simultaneously, a set of people must in some sense act together. For a wide range of simple tasks, various studies have demonstrated that members do not contribute proportionally to the team effort. First labeled “social loafing” by Latané et al. (1977), the per-member reduction in effort has been observed when team members' efforts are pooled, and the overall magnitude or quantity of output is what is apparent: rope pulling, evaluating proposals, pumping air, shouting, clapping, and the like (see Harkins and Szymanski, 1989, for a concise summary). It seems that more than the lack of member identifiability with input may be involved (e.g., Harkins and Jackson, 1985); social loafing generally appears to be some sort of motivational loss. Increasing member identifiability or providing a standard for the group to evaluate its own performance can erode the social loafing effect; see Kerr (1983), for a discussion of other group motivation loss effects.
Although increases in identifiability and enhancement of evaluation standards would apparently improve upon these motivation losses, exactly how particular techniques would work over a protracted period of time remains unstudied.
Information Processing and Coordination
Most of the tasks in the audience and coaction environments that have been studied were fairly simple, and the appropriate theoretical performance baseline for evaluating team performance was fairly straightforward: team output was essentially a summation or similar aggregation of members' inputs. We now consider more complex interactive tasks that require or especially benefit from interpersonal exchanges of information (solution proposals, decision preferences, critiques, etc.). Such group tasks appear to be increasingly common in industrial settings, public institutions, and military organizations.
Early studies of group performance were generally referred to as group problem solving, whether or not the task was a problem to be solved or some other intellectual task facing the group (Davis, 1969). Most of these studies were concerned with individuals working “together” or “apart,” a distinction of some practical importance in that early German educators were interested in the optimal allocation of homework between collective effort and solitary study (see Murphy and Murphy, 1931, for summary remarks). The early general finding was that, in comparison with individuals working alone, groups tended to solve more problems (word puzzles, arithmetic problems, and other tasks that emphasized abstract reasoning) and at a faster rate (e.g., Shaw, 1932). By the mid-1950s, however, a number of simple empirical demonstrations made it clear that group efforts were not routinely superior to individual efforts: given the resources committed per unit time, they were in fact demonstrably inefficient in terms of an appropriate baseline (e.g., Taylor, 1954; Lorge and Solomon, 1955; Marquart, 1955). For example, imagine a population composed of solvers and nonsolvers for a problem that has a correct answer (e.g., a word puzzle). Each member of a randomly composed team is a solver with probability p and a nonsolver with probability (1 − p). Suppose that interaction is neither helpful nor deleterious (i.e., interaction does nothing), that members behave independently of each other, and that a team solves the puzzle if it contains at least one solver. The probability of a group solving it is then $1 - (1 - p)^r$, where r is group size. It is this value (and analogous “predictions” for other tasks and environments) that ad hoc freely interacting teams do not exceed and generally fall well below (see summaries by Hastie, 1986; Davis, 1969). Some data suggest that even experienced groups organized for task performance fall below this best-member baseline (e.g., Davis et al., 1971), but so little research has addressed groups organized and trained to the task that little can be said about actually engineering group efficiency in the various contexts in which teams must perform.
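The best-member baseline described above can be computed directly. The following is a minimal sketch under the stated assumptions (independent members, no effect of interaction, and a group that succeeds whenever it contains at least one solver); the function name and the illustrative value p = 0.3 are hypothetical and do not come from the studies cited.

```python
def best_member_baseline(p: float, r: int) -> float:
    """Probability that a group of r independent members contains at least one solver.

    p is the probability that any single member is a solver; r is group size.
    This is the baseline 1 - (1 - p)^r discussed in the text.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if r < 1:
        raise ValueError("r must be at least 1")
    return 1.0 - (1.0 - p) ** r


if __name__ == "__main__":
    # Illustrative (hypothetical) individual solution rate of 0.3.
    for size in (1, 2, 4, 8):
        print(size, round(best_member_baseline(0.3, size), 4))
```

An observed group solution rate below the value returned for the same p and r would indicate a shortfall relative to this baseline; a rate above an individual's p alone does not show that interaction helped.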
Not all suboptimal performance can be attributed to lowered input at the level of the individual member, and Steiner (1972), among others, has outlined the nature of the losses due to faulty interpersonal processes required for pooling or otherwise combining information and responses into a group product (see recent summaries by McGrath, 1984; Brown, 1988). Efforts to engineer improvements in group performance have typically emphasized procedural mechanisms that both promote discussion and efficient structuring of effort and enhance personal productivity (see summary by Hackman and Morris, 1975).
Brainstorming (Osborn, 1957) has been among the most popular of the widely publicized devices to further group productivity, especially for teams engaged in tasks requiring “creative” problem solving. Brainstorming, which is essentially a set of guidelines for managing discussion, has proved highly popular with organizations as a procedural corrective and as a technique for stimulating solutions and ideas, despite critical research evaluations from shortly after its introduction. Early empirical evaluations found that, relative to a “best-member” baseline, brainstorming did not successfully enhance performance; rather, isolated individuals were relatively more productive in the quality and quantity of ideas, solutions, and the like (Taylor et al., 1958; Dunnette et al., 1963). Nevertheless, brainstorming has survived, and organizations of various kinds continue to use it in attempts to enhance the performance of their task-oriented teams. Recent empirical studies have confirmed earlier research showing that brainstorming techniques, despite their continuing popularity, do not alter the suboptimal performance of problem-solving groups (Diehl and Stroebe, 1987; 1990; see also meta-analyses by Mullen and Johnson, 1991). After exploring a variety of procedural manipulations, Diehl and Stroebe attribute the productivity loss to “blocking,” the inability of a member to produce ideas while others are talking, which is a kind of distraction effect.
It is easy to understand why organizations pursue techniques for the enhancement of suboptimal performance. It is less easy to understand the continued popularity of brainstorming, which has been shown to have no demonstrable value.
Consensus Decision Making: Choice Shifts
Many teams exist essentially to reach a consensus on a choice among alternatives (events, things, ideas) or a judgment about an amount of something. The team output is either a recommendation to another decision-making authority (e.g., a staff analysis or a personnel committee recommendation) or its consensus choice is itself decisive (e.g., a village council vote or a court verdict). The emphasis may be less on the interpersonal processing of information than on achieving agreement about the correct or optimal decision.
Conclusions about suboptimal performance (discussed above) are also applicable in team decision making, but it is worth briefly considering the implications of another research literature: team decisions under risk or uncertainty.
Conventional wisdom traditionally regarded decision-making groups as prudent, or at least cautious, in the sense that a typical team decision was not likely to be extreme. Considerable surprise met the discovery that a set of group decisions tended to be more “risky” on the average than the same or comparable individual decisions (Stoner, 1961; Wallach et al., 1962). Subsequent research found that team decisions could shift in either direction, producing either a “risky shift” or a “cautious shift” (see summaries by Dion et al., 1970; Davis et al., 1991).1 The ensuing efforts to identify the basis of the counterintuitive “choice shift” phenomena focused on the nature of risk and social values, rather than the interpersonal consensus process. However, such shifts have been observed in tasks devoid of social content, let alone containing much social value. It can be shown that simple and widespread consensus rules, e.g., plurality or majority wins, are sufficient to produce group-level shifts in decision distributions, given skewness at the individual, input level (see Davis et al., 1991).
Choice shifts can be a logical consequence of aggregating according to common team decision rules; that is, any tendency in the frequency distribution of individual preferences will be exaggerated in the distribution for groups. While there is no necessary remedy, it is clear that many problems come from adopting a single decision or recommendation for the record. The minority views, which may later be valuable, are lost, and this reduction in information could later prove to be serious.
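The way a majority rule exaggerates an individual-level tendency can be illustrated with a small calculation. The sketch below assumes a deliberately simplified case that is not taken from the cited studies: two alternatives, members who choose independently, and a strict majority rule in an odd-sized group. The function name and the example probabilities are hypothetical.

```python
from math import comb


def majority_choice_probability(p: float, r: int) -> float:
    """Probability that a strict majority of r independent members favors an option.

    p is the probability that any one member favors the option. Group size r
    must be odd so that ties cannot occur under this simplified rule.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if r < 1 or r % 2 == 0:
        raise ValueError("r must be a positive odd number")
    needed = r // 2 + 1  # smallest number of votes that forms a majority
    return sum(comb(r, k) * p**k * (1 - p) ** (r - k) for k in range(needed, r + 1))


if __name__ == "__main__":
    # A modest individual tendency (0.6) becomes a stronger group tendency.
    for size in (1, 3, 5, 9):
        print(size, round(majority_choice_probability(0.6, size), 4))
```

Under these assumptions a 60 percent individual preference yields a group preference above 60 percent, and a 40 percent preference yields one below 40 percent, so the group distribution is more extreme than the individual distribution in whichever direction individuals already lean.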
The more general nature of choice shifts (increased extremity of decision in either direction) only increases their practical importance; teamwork does not protect against “extremism” and may even increase it. In particular, the tendency of ex-group members also to be more extreme (group polarization) in the direction of their team's decision, although less extreme than the group of which they had been a part, also holds implications for practice (see review by Myers and Lamm, 1976). However, as noted above, almost all research on this phenomenon addressed ad hoc groups whose members lacked mutual experience, operating procedures, or training that might serve as a corrective.
Major concerns of team consensus decision making, especially in organizational contexts, have been the balancing of biases and expertise as well as the interests of various constituencies, concerns that have no counterpart in individual performance. A good example of enhancement techniques applied to team decision making comes from group judgmental forecasting, although the following discussion could in principle apply to other team tasks as well (see Davis et al., 1991, for a more general summary). The forecasting of future events—weather, market share, economic conditions, enemy troop deployments, population change, and the like—can sometimes be accomplished moderately well with statistical models, given sufficient current knowledge. However, in the face of substantial ignorance but serious need, experts sometimes must make intuitive forecast judgments, often with a minimal empirical basis, and teams of forecasters may also extrapolate from current to future conditions. It is desirable not only for individuals to be as accurate as possible, but also for the team members to combine their preferences in such a way that the consensus forecast is as accurate as possible.
Among the several methods that have been proposed for engineering team performance, we consider two that have enjoyed some enduring popularity: the Delphi technique and the nominal group technique. We do not consider strictly mathematical or statistical means for aggregating forecasts (or other decisions), although such conceptual notions produce the theoretical baselines essential for evaluating actual behavioral or quasi-behavioral forecasts. Although much of this research has focused on ad hoc groups under laboratory conditions, Janis (1972, 1982) has analyzed a number of actual cases in which group decisions were sometimes successful and sometimes unsuccessful according to subsequent events. These case studies agree with laboratory experiments in particular respects, namely that faulty interpersonal processes (e.g., misdirected conformity pressures, cohesiveness) are largely responsible for suboptimal team decisions, despite the clear possibility of optimal actions.
In light of this well-known effect, both the Delphi and nominal group techniques constrain interaction with the aim of eliminating or moderating faulty interaction processes. Delphi exercises (Dalkey, 1969, 1970) elicit participants' judgments or opinions privately, summarize these results while maintaining the anonymity of their source, and circulate the results to all team participants. Iterations of this procedure continue until individual positions stabilize (see Linstone and Turoff, 1975, for variations). This technique has enjoyed considerable popularity, especially in judgmental forecasting contexts. The nominal group technique also begins with the private elicitation of individual opinion, but members then meet for face-to-face exchanges of information; contributions are recorded and summarized by an outsider; members revise their own opinions in private; and contributions are combined mathematically to yield a team decision (see, e.g., Delbecq et al., 1975).
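The iterative structure of a Delphi exercise can be sketched in a few lines. This is only an illustration of the procedure's control flow, not a model of how real participants behave: the revision rule (each participant moves a fixed fraction, pull, toward the anonymous group median), the stopping tolerance, and all names are assumptions introduced here for the example.

```python
from statistics import median


def delphi_rounds(estimates, pull=0.5, tol=0.01, max_rounds=50):
    """Iterate private estimate, anonymous summary, and private revision.

    estimates: initial private numerical judgments, one per participant.
    pull: hypothetical fraction of the gap to the median closed each round.
    tol: positions count as stable when no one changes by more than this.
    Returns the final list of positions and the number of rounds run.
    """
    current = list(estimates)
    rounds = 0
    while rounds < max_rounds:
        summary = median(current)  # anonymous feedback circulated to everyone
        revised = [x + pull * (summary - x) for x in current]  # private revision
        rounds += 1
        largest_change = max(abs(new - old) for new, old in zip(revised, current))
        current = revised
        if largest_change < tol:
            break  # individual positions have stabilized
    return current, rounds


if __name__ == "__main__":
    final_positions, n_rounds = delphi_rounds([10.0, 20.0, 60.0])
    print(n_rounds, [round(x, 3) for x in final_positions])
```

With this assumed revision rule the positions necessarily converge on the initial median, which shows that stabilization by itself says nothing about accuracy; the quality of the final judgment depends entirely on the quality of the individual inputs.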
Despite the popularity of such techniques for enhancing team performance involving consensus decisions, surprisingly little evaluative research has been carried out to determine their effectiveness. However, the evidence to date on team performance accuracy enhancement is not encouraging (e.g., Fischer, 1981; Rohrbaugh, 1979; Gough, 1975). McGrath (1984:75) concludes that “neither one does any better than the freely interacting groups, whose ‘deficient' processes these techniques were designed to improve.”
In summary, the attempts to enhance team performance by imposing procedural remedies for nonoptimal interpersonal processes on the path to consensus have met with negligible success. However, long-term training of teams has rarely been studied, and even ad hoc laboratory groups have not received the research and development attention that serious attempts to engineer enhancement techniques should receive. Although performance on many different tasks that commonly confront teams consistently falls below reasonable theoretical baselines, in principle performance increments could be established through better interpersonal procedures, including training over protracted periods of time.
Teams, of necessity, will continue to perform a variety of tasks within organizations of all kinds. Unfortunately, the research and development programs committed to team performance enhancement are infrequent. An intuitively attractive explanation for the modest investment in the engineering of optimal procedures and interpersonal processes, as well as the training of individual team members, is that conventional wisdom and common sense, on the face of it, seem to provide such reasonable guidance. Unfortunately, however, the research evidence to date contradicts much of that apparent wisdom.
TEAM PERFORMANCE: FROM LABORATORY TO FIELD
The review of what we know about group performance is more striking for what is missing than for what is known. Although the issues can be discussed in the abstract, a brief description of three actual training and performance regimens can help to illustrate some unaddressed issues and point to ways in which the knowledge base could be expanded to productively address questions that are central to improving the performances of real groups in their work situations. The first part of this section presents three examples, two from military settings and one from an industrial setting: the Special Forces, the Army's Ranger training, and control rooms in nuclear power plants. The next parts consider the training and research questions raised by these examples.
Three Field Examples
The Special Forces branch of the Army uses a squad as its basic operating unit; the typical squad consists of 12 soldiers. Within each squad, two members are trained in each of six specialties, for example, medic, weapons, combat engineering. The squads are trained to work together as a unit, either functioning independently or in collaboration with other squads as part of a larger operation. As an example of the first case, a squad with minimal or no support facilities might be assigned as military advisers to the armed forces of an allied country. Special Forces squads differ in their particular missions: they may have a squad-wide special mission or capability, such as underwater specialists or airborne units. There are further gradations within mission specialties: some of the airborne squads are HALO (high altitude low opening) units.
HALO is a specific type of parachuting operation that differs from the mass low-altitude operations of the regular infantry airborne units. A HALO squad is expected to jump from a high-flying and, hence, relatively quiet plane, free fall for some distance, and then open its parachutes at a low altitude. Practice is needed not only in the individual skills, but also in coordinating the squad members' ability to land close to each other at the target.
A HALO squad has two distinct training needs: each of its members must be trained in his subspecialty, such as weapons, and the squad as a whole must practice HALO team operations. In theory, these two training aspects, one splitting the squad into at least six subgroups and the other assembling the squad for joint training, could be scheduled so that the squad rotated between periods when all members were together and periods when all members were at their specialty training. In practice, however, this cannot be done because the individual specialties have different lengths of training, have different needs for updating and re-certification, and are taught at different sites. The Special Forces training needs thus raise several important issues.
First, what is the optimal balance between individual training and squad training? When does a medic forgo more medical training in order to practice HALO skills? Conversely, when is a HALO squad coordinated enough to allow its members to disperse for specialty training? How can the plan be adjusted for the different training needs of the different specialties?
Second, what is to be done when a squad is not complete? For instance, a squad may have to be deployed when some of its members are absent for individual training. What are the best solutions to having incomplete squads: secondary training in other subspecialties for each member to serve as back-up for absent members or the assignment of specialists who are new to the squad to fill in for the missing soldiers?
Third, should squads be kept together throughout their term of service to induce better team coordination and performance or should component members be rotated through several squads to facilitate interchangeability and minimize the disruption of substitute members?
The Army has a special program of Ranger training for selected soldiers. Troops from all of the Army's units are eligible for the Ranger program on the basis of superior performance in their basic and general training and nomination from their supervising officers. Nominees are subjected to a rigorous series of selection procedures. If accepted into the Ranger program, the soldiers undergo several weeks of extensive training in a variety of habitats, such as mountains, swamps, deserts, and the like. The training procedures are physically gruelling and are characterized by extreme deprivation of sleep and food and extraordinary physical demands. Upon successful completion of the Ranger course, a soldier is given a Ranger patch and other identifying insignia. Although some Rangers are assigned to serve in all-Ranger battalions, most are rotated back to their original units in which they serve as ordinary soldiers, but are identified as role models by their Ranger accomplishments.
There is probably no question that the Rangers are a superior group of soldiers. But there is a major issue raised by the training and assignment procedures, that of formal and informal leadership within the unit. A squad with a noncommissioned officer (NCO) in charge and a Ranger among its members has two leaders, one explicit and one implicit. It is not known to what extent there is a possibility for conflict or discrepant messages from these two sources. What happens if the NCO in charge changes? Does the Ranger assume a bigger leadership role? Does the presence of a Ranger facilitate or hinder the ability of the new NCO to assume leadership?
Nuclear Reactor Control Rooms
Nuclear reactors, even within the same plant, differ from each other in fundamental design, in changes made because of different dates of manufacture, and in the solutions to basic engineering questions adopted by different manufacturers. Because of this variety, at most plants there is a simulated control room for each reactor. The simulated control rooms duplicate the instrumentation of their particular reactors, and, through computer models, interact with the instrumentation to produce the effects on the gauges and information displays that would be shown in the actual control room of the modeled reactor.
The simulation control rooms are used for initial training of operators as well as for retraining, certification, and general assessment of operational skills. They are extremely complex: some of the later model reactors may have more than 10,000 individual displays and a similar number of input controls. Personnel differ for each reactor, but a typical control room is staffed by a group of four people: two operators, a foreman, and a shift supervisor.
At a training session a reactor crew will be placed in the simulation control room with a reactor simulation in progress. Trainers situated behind one-way glass simulate a malfunction of the reactor through appropriate manipulation of the simulation computer. The control room crew are monitored on their ability to recognize that a malfunction is occurring, to work through the requisite diagnostics, and to take the necessary corrective actions, shutting down the reactor if needed. Many of the procedures are algorithmic in nature: when a certain dial shows an abnormal reading, for example, the shift supervisor may locate a manual that has step-by-step rules for isolating the underlying fault. The foreman then reads each step as the two operators carry out the foreman's commands, reporting back information and outcomes as requested. The procedures are not rigid: the team must make a series of operational decisions as well as come to agreement on the nature of the problem, its seriousness, and the steps needed to rectify the situation. Each member of the team has a different function, but the members are interdependent with respect to the larger problem. The shift supervisor's choice of diagnostic algorithm depends on information from the operators; the foreman's instructions to the operators depend on the shift supervisor's choice of diagnostic procedure manuals; the operators' choice of dial readings to report and of switches to set depends on the foreman's instructions. The control of a reactor involves periods of quiescence punctuated with anomalous events that may or may not represent serious malfunctions or emergencies. Successful operation of a reactor depends on the ability of the control room crew to work its way to a correct set of decisions about what is wrong, if anything, and what needs to be done.
Observation of a control room simulation reveals some issues that are generic to team training, decision making, and performance. First, not all group decision making involves the type of consensus setting usually studied in free discussion groups such as juries. In a control room, the group's decision is arrived at only after a set of sequential decisions is made by individuals, so that a consensual agreement at the end of the process, based on the “facts” then in hand, may be flawed by errors or biases introduced along the way. How does one train crews to monitor and provide quality control along the way?
Second, because the four members of the control room staff are interdependent, the result is a “weak-link model,” that is, the team performance will not be able to rise above that of the weakest member. Thus, two questions are clear: Is the crew aware of the performance weaknesses of its members? If aware, can the members compensate, monitor, or double-check that performance? In at least one simulation that was observed by the committee, the substandard performance of a shift supervisor, though detrimental to the crew's decisions, was not recognized by the crew because they had never worked with any other shift supervisor; they had no basis for judging what good performance by a shift supervisor looks like.
Third, the trainers in control of the computers that simulate the reactor are able to manipulate representations of each electromechanical component of the reactor. Simulations of problems invariably involve programming a malfunction of a piece of hardware and observing the crew's reaction. Studies of actual control room mistakes, however, usually reveal that, in large part, it is the human reaction to electromechanical events or malfunctions that is at the heart of an “accident.” Yet human mistakes are rarely built into the simulations, so training is rarely designed to overcome them. What would happen if a foreman asked an operator to read dial A305, for instance, and the operator reported a reading from dial A306? How would this affect the foreman's application of the diagnostic algorithm? What is the sensitivity of the group decision process to such human error? Such issues are not limited to reactor control rooms but arise in most command-and-control structures.
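The weak-link and error-sensitivity questions raised above can be made concrete with a small numerical sketch. It is an illustration only, not a model used by the committee: it assumes that each stage of the sequential procedure succeeds independently, that an uncaught error at any stage spoils the final decision, and that the accuracy figures and the names `chain_accuracy` and `with_cross_check` are hypothetical.

```python
from math import prod


def chain_accuracy(stage_accuracy):
    """Probability that a strictly sequential decision chain ends correctly.

    Assumption (not from the text): stages are independent and any
    uncaught error at one stage makes the final decision wrong.
    """
    return prod(stage_accuracy.values())


def with_cross_check(stage_accuracy, stage, catch_rate):
    """Return a copy of the crew in which a teammate catches and corrects
    errors at one stage with probability catch_rate."""
    p = stage_accuracy[stage]
    improved = dict(stage_accuracy)
    improved[stage] = p + (1 - p) * catch_rate
    return improved


# Hypothetical per-stage accuracies for one diagnostic sequence.
crew = {
    "operator_reading": 0.98,      # e.g., reads dial A305 rather than A306
    "foreman_procedure": 0.99,     # reads and applies the correct step
    "supervisor_diagnosis": 0.90,  # chooses the correct diagnostic manual
}

baseline = chain_accuracy(crew)
checked = chain_accuracy(with_cross_check(crew, "supervisor_diagnosis", 0.5))

print(f"no cross-checking:       {baseline:.3f}")
print(f"supervisor double-check: {checked:.3f}")
print(f"weakest member:          {min(crew.values()):.3f}")
```

With these made-up numbers the unchecked chain reaches a correct decision roughly 87 percent of the time, which is below the accuracy of its weakest member, and catching half of the shift supervisor's errors raises the figure to about 92 percent. Under the stated assumptions, the sketch suggests why crew awareness of a weak member, and training in double-checking, could matter more than further improvement of an already strong member.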
Challenges to Training
These illustrations suggest a series of questions of immediate concern both to the organizations involved (the Special Forces, the Rangers, and nuclear power plant control rooms) and to many other military and industrial settings.
Should work groups be trained together as a unit or should members be trained separately and assembled into units? The advantages of maintaining groups through both training and performance are coordination and the elimination of the duplicative effort of retraining. However, separate training with subsequent assembly of a group minimizes disruption when a team member needs to be replaced and gives practice in adapting to changed groups.
In groups in which members have different tasks, skills, or functions, what is the optimal division of training for the members in terms of primary and secondary tasks? If a team member spends all of the training time acquiring greater facility on his or her primary task, the member does not get trained in a second specialty and cannot take on the task of a team member who is unavailable when the team must perform. Yet too much time spent acquiring secondary skills may distract from learning or maintaining the primary ones.
How should the time of team members be allocated between their training in individual specialties and team training when both types of training are necessary? The length of time for the specialty requiring the longest training defines the minimum amount of time for the entire team to train or practice together.
How should those responsible for making time and training allocation decisions be trained? At what operational level should staffing decisions be made? If a group loses the services of a member, someone must decide whether the group can still function, which of the members can fill in the missing specialty, when replacement members are required, and many other staffing questions. It is yet to be determined what training is required for the circumstances to be accurately evaluated and for well-reasoned decisions to be made.
How should groups whose members have different tasks and specialties combine information from the diverse members? Equally important, how are members to be trained to evaluate the performances of other specialists, that is, to recognize deficiencies in other parts of the team? When a group has some divergent opinions or strategies for accomplishing its mission, how are the divergent viewpoints and minority information to be retained and utilized? Finally, when group tasks are for situations that cannot be created in training, such as warfare or nuclear disaster, how can the use of simulation in training contribute to real-life effectiveness?
These questions are not a complete set of what would be developed from systematic observation of the Special Forces, Rangers, or nuclear power plant control rooms, and not all of these questions are new or unresearched. It would be difficult, however, to give specific answers to questions in the contexts of particular situations on the basis of existing research on groups. It should also be noted that the research answers to the questions depend on the products desired. Some of the answers to questions posed by Ranger training depend on specifying the priorities of the different goals of the training: for example, is Ranger training foremost a specialty skill or is it primarily a vehicle for building self-sufficiency and self-esteem? If research is designed to answer Ranger training questions, it may serve the useful function of forcing a further specification of the desired outcome and purpose of the training.
Challenges to Research
Most of the questions from the three work examples are not covered by existing research. In the social psychological work, the experimental groups are most often not comparable to the ones that exist in the military and in industry. First, most working groups are faced with problems that are mixed in nature. They are neither pure decision making nor pure performance problems. The group's task is usually a variety of activities, involving some decision or consensus, some performance, some information gathering, and moving back and forth between these activities in the performance of their mission. The group's task could be broken down into smaller, typologically purer, subunits, in an effort to make research applicable to each component of the complex task, but then linking the subunits back together would immediately reinstate the problems the disassembly was designed to eliminate. Few, if any, social psychological studies use the complex problems that are present in any practical situation. It is difficult to predict how simple tasks combine in a group's attack on complex problems without research directed to that very issue.
Second, groups in social psychological research tend to be homogeneous. Although the early literature on group dynamics (cf. Festinger and Thibaut, 1951) examined some aspects of stratification, differences in rank, authority, prestige, and knowledge are generally missing variables in the research groups. For example, in an early line of work, Torrence (1954) studied B-26 combat crews in decision making situations. He compared existing crews consisting of an officer pilot, an officer navigator, and an enlisted gunner with crews consisting of the same personnel but who had never worked together. Torrence was able to show that rank, status, and previous experience influenced how the crews came to agreement. For example, when given problems with objective solutions, the gunners had difficulty getting their correct answers accepted by the officers, and the pilots were able to impose their incorrect answers on the other crew members. The distortions from optimal performance were greater in the permanent than in the temporary crews. Important as these considerations are for real-life team and group activities, investigation of factors such as these is now virtually nonexistent.
Third, most group studies are single-session experiments, or, at most, several-session experiments. These temporary groups, together for a short period of time, are hardly comparable to the extended, long-term interactions of actual groups in the workplace. The reasons for this lack are understandable: the financial and logistic difficulties of conducting longitudinal research with experimental groups are so formidable as to preclude it from the consideration of most investigators.
If the community studying the social psychology of groups is not doing the research needed to answer the questions raised by the observations illustrated above, does the field of applied psychology and human factors do so? Our answer is a qualified “no.” Foushee's (1984) examination of small groups in the cockpits of airplanes is an excellent example of solutions to particular problems. Foushee's method was to intensively review hours of videotapes of groups in cockpits; it will undoubtedly provide improvements in procedures and benefit both crews and airlines using that cockpit configuration. Yet it is difficult to imagine how to generalize these findings to other tasks for which the specific items of job performance are very different and the size and nature of the group do not resemble those in the cockpits.
The tradeoffs between abstract, basic group research, generalizable in principle but of little help in specific situations, and applied, mission-oriented research, of use in a specified situation but not generalizable to other situations, have long been recognized (see, e.g., Mahoney and Druckman, 1975). One of the early explications of group research in social psychology, Festinger's (1953) description of laboratory research in social psychology, articulated the issues clearly. Unfortunately, little progress has been made in the almost four decades since these problems were presented. The problem is that neither stream of research by itself gives satisfactory solutions to the problems raised by some of the complex training needs of the military or of industry. If both approaches could focus on the same problem and combine their results, much could be gained.
There are several ways such an amalgam could be obtained. One way would be an attempt to generalize from a series of applied studies in particular settings. There is useful information in such reviews, but the usual problems of interpreting results when so much is going on at the same time make application of the generalizations difficult. Typically, such results work fairly well when applied to situations similar to those from which the data were gathered; they are not as applicable to somewhat different or to novel situations.
A better way would be to conduct parallel studies in field and laboratory settings. One could examine the abstract proposition in laboratory studies and simultaneously explore the ecological validity of the propositions by testing them in naturalistic, field settings. Examples of this type of coordinated research are rare. One example is the study of some relationships between employee satisfaction and job performance during periods of change in a work task. A field study at an appliance plant, conducted by Schachter et al. (1961), was paralleled by a laboratory test of the same hypotheses by Latané and Arrowood (1963). Another example is the research conducted by Hopmann and his colleagues on negotiation processes; results from analyses of simulated negotiation groups were compared with results obtained from content analyses of the transcripts of the actual negotiations (see Druckman and Hopmann, 1989, for a review). These studies provide a useful model for guiding future research on group processes.
Clearly, studies in the field, but with experimental controls—true field experiments—could answer many of the questions posed by the training needs of, for example, the Special Forces. If the studies could be done longitudinally, so much the better. Most of the current funding practices and incentive structures work against the chances that studies of this type will be undertaken. Human factors investigators must concentrate on the particular group being studied, at the expense of designing studies that speak to general issues. Social psychologists explore theoretical propositions in studies of temporally limited, ad hoc groups. There are few organizations or funding agencies that have supported the study of experimentally manipulated groups, over time, in natural settings, despite the general utility of such investigations.
If formal research does not seem to be of immediate help in solving the problems raised by the training and performances of real groups, are there informal research procedures that might be of help? Quite often, in the military or in industry, training procedures or task requirements are changed. Such modifications made in the training and adaptations of on-going groups contain information that might be useful for constructing generalizations about what works in group training. Lacking control groups, these natural quasi-experiments are not a substitute for rigorous research on the same issues, but they could provide important information about the performance effects of changes in training regimens. Unfortunately, even this information is lost: few procedural changes are ever made with a rigorous evaluation component built in that would allow the actual effects of the changes on performance to be determined.
Many problems and issues in team performance are potentially answerable by research, but research on group structure and function, once a mainstay in social psychology, is now not a major focus of the work. Moreover, most of these studies are of unstratified groups, making them less relevant for real-world problems involving stratified groups and command structures.
Research on group problem solving indicates that teams perform at suboptimal levels. While a number of techniques (e.g., the Delphi technique, nominal group technique) have been used to try to improve the group process and outcomes, only a few of these interventions have been evaluated systematically, and the available evidence does not provide clear support for the techniques.
The research on group processes has focused in recent years on decision making with an emphasis on jury-like studies. Some of this is directly relevant to group training, but it leaves unaddressed many of the group performance questions.
Case studies of actual groups have contributed a number of questions about optimum training procedures as well as hypotheses to be explored by systematic research, but the findings have proven difficult to generalize.
Some of the difficulties of experimentation on group performance are logistic. It is difficult to find a suitable number of comparable groups that remain stable over time to compare the effects of experimental differences on their training and performance. Large numbers of comparable groups do exist in the Army and other military services, and they provide the possible experimental subjects and conditions for effective study of team training, decision making, and performance. Since the military recruits young people who are comparable to the entering U.S. labor force, the results of such group studies may also have applicability to industrial and commercial settings.
1. The difference between a “risky” and “cautious” shift refers to the choice made by experimental subjects on the lowest acceptable probability for the risky play in question to be attempted. The probabilities range from “1 in 10 that the play would succeed” (risky) to “9 in 10 that it would succeed” (cautious).
Borden, R.J. 1980 Audience influence. In P.B. Paulus, ed., Psychology of Group Influence. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Brown, R. 1988 Group Processes: Dynamics Within and Between Groups. Oxford, England: Basil Blackwell.
Cohen, J.L., and J.H. Davis 1973 Effects of audience status, evaluation and time of action on performance with hidden word problems. Journal of Personality and Social Psychology 37:822-832.
Dalkey, N.C. 1969 Analyses from a group opinion study. Futures 1:541-551.
1970 Use of self-ratings to improve group estimates. Technological Forecasting and Social Change 1(3):283-291.
Davis, J.H. 1969 Individual-group problem solving, subject preferences, and problem type. Journal of Personality and Social Psychology 13:362-374.
Davis, J.H., P.A. Bates, and S.M. Nealey 1971 Long-term groups and complex problem solving. Organizational Behavior and Human Performance 6:28-35.
Davis, J.H., T. Kameda, and M. Stasson 1991 Group risk taking: selected topics. In J.F. Yates, ed., Risk Taking Behavior. New York: John Wiley & Sons.
Delbecq, A.L., A.H. Van de Ven, and D.H. Gustafson 1975 Group Techniques for Program Planning: A Guide to Nominal Group and Delphi Processes. Glenview, Ill.: Scott, Foresman.
Diehl, M., and W. Stroebe 1987 Productivity loss in brainstorming groups: toward the solution of a riddle. Journal of Personality and Social Psychology 53:497-509.
1990 Productivity Loss in Idea-Generating Groups: Tracking Down the Blocking-Effect. Paper presented at the biennial meeting of the European Association of Experimental Social Psychology, Budapest, Hungary.
Dion, D.L., R.S. Baron, and N. Miller 1970 Why do groups make riskier decisions than individuals? In L. Berkowitz, ed., Advances in Experimental Social Psychology. New York: Academic Press.
Druckman, D., and P.T. Hopmann 1989 Behavioral aspects of negotiations on mutual security. In P.E. Tetlock et al., eds., Behavior, Society, and Nuclear War. New York: Oxford University Press.
Druckman, D., D. Solomon, and K. Zechmeister 1972 Effects of representational role obligations on the process of children's distribution of resources. Sociometry 35:387-410.
Druckman, D., A.A. Benton, F. Ali, and J.S. Bagur 1976 Cultural differences in bargaining behavior: India, Argentina, and the United States. Journal of Conflict Resolution 20:413-452.
Dunnette, M.D., J. Campbell, and K. Jaastad 1963 The effect of group participation on brainstorming effectiveness for two industrial samples. Journal of Applied Psychology 47:30-37.
Festinger, L. 1953 Laboratory experiments. In L. Festinger and D. Katz, eds., Research Methods in the Behavioral Sciences. New York: The Dryden Press.
Festinger, L., and J. Thibaut 1951 Interpersonal communication in small groups. Journal of Abnormal and Social Psychology 46:92-99.
Fischer, G. 1981 When oracles fail: a comparison of four procedures for aggregating subjective probability forecasts. Organizational Behavior and Human Performance 28:96-110, 133-145.
Foushee, H.C. 1984 Dyads and triads at 35,000 feet: factors affecting group process and aircrew performance. American Psychologist 39:885-893.
Geen, R.G. 1980 The effects of being observed on performance. In P.B. Paulus, ed., Psychology of Group Influence. Hillsdale, N.J.: Lawrence Erlbaum Associates.
1989 Alternative conceptions of social facilitation. In P.B. Paulus, ed., Group Influence, 2nd ed. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Gough, R. 1975 The effect of group format on aggregate subjective probability distributions. In D. Wendt and C.J. Vlek, eds., Utility, Probability, and Human Decision-Making. Dordrecht, The Netherlands: Reidel.
Hackman, J.R., and C.G. Morris 1975 Group tasks, group interaction process, and group performance effectiveness: a review and proposed integration. In L. Berkowitz, ed., Advances in Experimental Social Psychology, Vol. 8. New York: Academic Press.
Harkins, S.G., and J.M. Jackson 1985 The role of evaluation in eliminating social loafing. Personality and Social Psychology Bulletin 11:457-465.
Harkins, S.G., and K. Szymanski 1989 Social loafing and group evaluation. Journal of Personality and Social Psychology 56:934-941.
Hastie, R. 1986 Review essay: experimental evidence on group accuracy. In B. Grofman and G. Guillermo, eds., Information Pooling and Group Decision-Making. Greenwich, Conn.: JAI Press.
Janis, I.L. 1972 Victims of Groupthink. Boston, Mass.: Houghton Mifflin.
1982 Groupthink: Psychological Studies of Policy Decisions and Fiascos. Boston, Mass.: Houghton Mifflin.
Kahneman, D., P. Slovic, and A. Tversky, eds. 1982 Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Kerr, N. 1983 Motivation losses in small groups: a social dilemma analysis. Journal of Personality and Social Psychology 45:819-828.
Latané, B., and A.J. Arrowood 1963 Emotional arousal and task performance. Journal of Applied Psychology 47:324-327.
Latané, B., K. Williams, and S. Harkins 1977 Many hands make light the work: the causes and consequences of social loafing. Journal of Personality and Social Psychology 37:822-832.
Linstone, H.A., and M. Turoff 1975 The Delphi Method: Techniques and Applications. Reading, Mass.: Addison-Wesley.
Lorge, I., and H. Solomon 1955 Two models of group behavior in the solution of Eureka-type problems. Psychometrika 20:139-148.
Mahoney, R., and D. Druckman 1975 Simulation, experimentation, and context: dimensions of design and inference. Simulation and Games 6:235-270.
Marquart, D.I. 1955 Group problem-solving. Journal of Social Psychology 41:103-113.
McGrath, J.E. 1984 Groups: Interaction and Performance. Englewood Cliffs, N.J.: Prentice-Hall.
Mullen, B., and C. Johnson 1991 Productivity loss in brainstorming groups: a meta-analytic integration. Basic and Applied Social Psychology. In press.
Murphy, G., and L. Murphy 1931 Experimental Social Psychology. New York: Harper.
Myers, D.B., and H. Lamm 1976 The group polarization phenomenon. Psychological Bulletin 83:602-627.
Osborn, A.F. 1957 Applied Imagination. New York: Scribners.
Rohrbaugh, J. 1979 Improving the quality of group judgment: social judgment analysis and the Delphi technique. Organizational Behavior and Human Performance 24:73-92.
Schachter, S. 1959 The Psychology of Affiliation. Stanford, Calif.: Stanford University Press.
Schachter, S., B. Willerman, L. Festinger, and R. Hyman 1961 Emotional disruption and industrial productivity. Journal of Applied Psychology 45:201-213.
Shaw, M.E. 1932 Comparison of individuals and small groups in the rational solution of complex problems. American Journal of Psychology 44:491-504.
Steiner, I.D. 1972 Group Process and Productivity. New York: Academic Press.
Stoner, J.A.F. 1961 A comparison of individuals and group decisions involving risk taking. Journal of Abnormal and Social Psychology 65:77-86.
Taylor, D.W. 1954 Problem solving by groups. In Proceedings XIV, International Congress of Psychology. Amsterdam, The Netherlands: North Holland Publishing.
Taylor, D.W., P.C. Berry, and C.H. Block 1958 Does group participation when using brain storming facilitate or inhibit creative thinking. Administrative Science Quarterly 3:23-47.
Torrence, E.P. 1954 Some consequences of power differences on decision-making in permanent and temporary three-man groups. Research Studies of the State College of Washington 22:130-140.
Tversky, A., and D. Kahneman 1974 Judgment under uncertainty: heuristics and biases. Science 185:1124-1131.
1981 The framing of decisions and the psychology of choice. Science 211:453-458.
Wallach, M.A., N. Kogan, and D.J. Bem 1962 Group influence on individual risk taking. Journal of Abnormal and Social Psychology 65:75-86.
Wrightsman, L.S. 1959 The Effects of Small-Group Membership on Level of Concern. Unpublished doctoral dissertation, University of Minnesota.
Zajonc, R.B. 1965 Social facilitation. Science 149:269-274.
1980 Compresence. In P.B. Paulus, ed., Psychology of Group Influence. Hillsdale, N.J.: Lawrence Erlbaum Associates.