The Panel on Human Factors Sciences at the Army Research Laboratory (ARL) conducted its review of selected research and development (R&D) projects of the ARL Human Sciences Campaign at the Aberdeen Proving Ground, Maryland, on June 27-29, 2017. The human sciences project areas reviewed were as follows:
- Real-world behavior. The objectives of the R&D in this area are to enable the collection, analysis, and interpretation of human behavioral data within dynamic, complex, natural environments. ARL conducts R&D in the following two areas: (1) real-world complexity in human science experimentation, and (2) assessing human behavior in the real world. A key focus of this work is the development of novel technology and methodologies to collect and analyze these data in real-world conditions.
- Human variability. The goals of this R&D are to enable high-resolution, moment-to-moment predictions of an individual soldier’s internal and external behavior and performance and the ways in which soldiers interact dynamically in mixed-agent team and social settings in both training and operational environments. Human variability R&D is conducted in the following two areas: (1) multifaceted soldier characterization to develop a comprehensive understanding of the factors influencing human variability, and (2) brain structure-function coupling to create a multiscale understanding of the relationship between the brain’s physical structure, its dynamic neurophysiological functioning, and human behavior.
- Humans in multiagent systems. These efforts aim to achieve critical technological breakthroughs needed for future Army multiagent, mixed-agent teams to effectively merge human and agent capabilities for collaborative decision making and enhanced team performance in dynamic, complex environments. The challenges for human sciences R&D in this area are soldier workload, situation awareness, trust, influence, and cultural cognition.
- Human cyber performance. This program aims to advance a foundational science of cybersecurity that addresses the human dynamics of attacker, defender, and user interactions in Army networks to support training effectiveness and transition of agent-based technology to improve the operational efficiency and effectiveness of cyber-warfighters.
The Real-World Behavior (RWB) Initiative is an ambitious program that extends research from the laboratory to the field with the aim of understanding and predicting human behavior in naturalistic environments. It examines potentially unpredictable operational and stimulus environments, open behavioral responses, and potentially highly sensor-rich and pervasive monitoring. For this reason, one important focus is to develop novel technology and methods for sensed measures from neural and physiological data in the individual and the interactive behavior of individuals and groups in teams.
Accomplishments and Advancements
The progress of this group, including the choice of initial projects, is impressive. They have developed state-of-the-art enabling technology in the areas of electroencephalogram (EEG) and vehicle instrumentation, and other projects investigate potential approaches to analysis of complex data sets. The use of technical and quantitative methods and approaches to experimental investigations is at a high standard; however, important theoretical discoveries and approaches remain in the early stage of development. The portfolio has a good balance between work in the laboratory and in the field, sophisticated analytic approaches, and the preliminary development of theory. The group has significantly advanced its capabilities and agenda since the 2015 review,1 while demonstrating strong productivity and leadership roles in collaborative work and field consortia.
A goal of the RWB initiative is to approach this new science as an iterative process that first takes theories and findings from the existing literature and laboratory experimentation and embeds them into carefully instrumented and designed tests in real-world contexts. After harvesting new discoveries and theories from the necessarily less controlled—but more ecologically valid—real-world research, the second step would test these new ideas in carefully controlled detail in a new generation of experimental studies.
The existing projects, some reviewed here, pursue enabling technology, test models for analysis of RWB data sets, and test real-world inspired challenges in the laboratory setting. The group has made significant progress in enabling technology since the 2015 review. It is ready to collect data in several projects designed for the field with complex instrumentation for multisensor data. Several experimental laboratory projects are either inspired by the gaps between standard experimental testing environments and the real environment or designed to pretest and validate methods of analysis of data from ongoing neural and other sensor data in open behavior scenarios.
One of the unique features, shared among several initiatives, is the focus on developing technologies to enable real-world deployment of research. Several of the initiatives take advantage of the strength in materials sciences, chemistry, or computing sciences within the ARL—a combination of expertise that can be found in some academic centers but would be difficult to duplicate and coordinate within research environments in most industrial settings.
1 National Academies of Sciences, Engineering, and Medicine, 2016, 2015-2016 Assessment of the Army Research Laboratory: Interim Report, The National Academies Press, Washington, D.C.
Enabling technologies have been developed for EEG, a neural signal useful for indexing human states with potentially feasible field deployment. The group now has state-of-the-art expertise in EEG systems and source localization; it has developed in-house EEG technology and compared it with commercial EEG systems. Of particular note is a head phantom for EEG, a device approximating the human skull conduction used to re-create electrical signals on the scalp that will enable the modeling and exclusion of noise sources, with the goal of identifying measurable neural signals recorded in complex environments. The group has developed a cutting-edge facility for integrating EEG and other related neural and physiological sensor data.
The target for future work is to further enable technology. The number-one problem of recording EEG in the field will be powering the measurement and analysis equipment indefinitely. An important project created a novel integrated approach to amplification and digitization resolution in an online, adaptive manner that will enable low-power operation for applications such as EEG that are capable of being powered from locally harvested sources (e.g., thermocouple or solar power). This type of device is unique to the real-world recording environment and is currently not available commercially.
RWB data collected over extended time periods during complex human or human-agent behavior make critical demands on data management and sophisticated data interpretation. The data may include multiple sensor readings, patterns of communications, behaviors, and behavioral outcomes. Important efforts are being made to create an approach for analysis and interpretation.
A data storage and integration system centered on EEG has been developed, including methods for synchronizing multimodal measurements and a standardized format for data. The system is extensible to pulse, galvanic skin response, respiration, accelerometer, position, velocity, vehicle state, text, natural language, and in some cases eye tracking (ET). It provides an approach to the aggregation and cross-referencing of big synchronized data streams. ARL staff have participated in and in some cases led the development of these standards through the Big EEG consortium.
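As an illustration of what synchronizing multimodal measurements can involve, the following minimal sketch aligns a slower sensor stream to EEG sample times by nearest timestamp. It is not ARL's system: the function name align_nearest, the max_gap tolerance, and the millisecond timestamps are assumptions made for this example only.

```python
from bisect import bisect_left

def align_nearest(ref_times, other_times, other_values, max_gap):
    """For each reference timestamp, take the value from the other stream
    whose timestamp is closest. Returns None where the closest sample is
    more than max_gap away. other_times must be sorted ascending."""
    aligned = []
    for t in ref_times:
        i = bisect_left(other_times, t)
        # Candidates are the samples just before and just after t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_times)]
        if not candidates:
            aligned.append(None)
            continue
        best = min(candidates, key=lambda j: abs(other_times[j] - t))
        gap = abs(other_times[best] - t)
        aligned.append(other_values[best] if gap <= max_gap else None)
    return aligned

# Hypothetical data: integer timestamps in milliseconds.
eeg_times_ms = [0, 4, 8, 12]
pulse_times_ms = [1, 11]
pulse_values = [71, 72]
aligned = align_nearest(eeg_times_ms, pulse_times_ms, pulse_values, max_gap=2)
```

Leaving unmatched samples as None rather than interpolating keeps gaps in a slower stream visible to later analysis; whether that is the right choice depends on the sensor.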
Measuring neural responses during natural eye movements is one target of study. Real-world recording of EEG imposes several requirements beyond those of the typical laboratory environment. Until quite recently, saccades and fixation-related potentials have been either ignored or constrained using fixation points. The cost and complexity of ET equipment have been the primary limitations on merging these technologies. ET equipment has improved dramatically in the last 10 years. The number of manufacturers of ET equipment has increased, providing more accurate, faster, and more portable equipment. Many ET equipment manufacturers now provide interfaces to a variety of EEG equipment. Likewise, EEG equipment manufacturers are starting to provide support for a variety of ET systems. However, these systems are aimed primarily at the larger marketing and product development community and are not ideally suited to the RWB goals of this team. Further technology for ambulatory ET and EEG remains to be developed.
Related laboratory projects examine eye-fixation synchronized EEG as a potential approach to understanding EEG with free viewing. Extrapolating from the classic but simplified gaze-constrained experimental measurements, neural signatures of target detection, attention, and human state are indexed to the onset of an eye fixation during task performance. These studies are beginning to yield important basic findings about the neural signatures of goal-directed behavior. Fixation-related neural potential is an emerging area of research, with very few published articles in the area, and this research can form the baseline study for further analysis. The research study relating the neural response to the saccadic distance traversed by the eye and the spatial frequency of the stimulus is a unique and valuable contribution to this nascent field. To the board’s knowledge, there are no previous studies on these critical components for real-world recordings.
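Indexing neural signatures to the onset of an eye fixation can be sketched as cutting a window of EEG around each fixation onset and averaging across fixations. The sketch below is a generic illustration, not the team's pipeline; the function name, the window parameters pre and post, and the sample values are assumptions.

```python
def fixation_locked_average(signal, fixation_onsets, pre, post):
    """Average EEG samples over windows [onset - pre, onset + post)
    around each fixation onset (given as a sample index). Windows that
    run past either end of the recording are skipped."""
    epochs = []
    for onset in fixation_onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal):
            continue
        epochs.append(signal[start:stop])
    if not epochs:
        return []
    n = len(epochs)
    # zip(*epochs) groups the samples that share the same latency.
    return [sum(column) / n for column in zip(*epochs)]

# Synthetic single-channel signal with a deflection after each fixation.
signal = [0, 0, 1, 5, 1, 0, 0, 1, 7, 1, 0, 0]
onsets = [3, 8, 11]  # the last window runs off the end and is skipped
frp = fixation_locked_average(signal, onsets, pre=1, post=2)
```

In free viewing, fixations follow one another closely, so neighboring windows can overlap; a simple average like this one does not correct for that overlap.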
Another pair of projects motivated by the gap between typical laboratory measurements and the typical conditions in the field study performance under high dynamic range (HDR) luminance conditions. These two projects, together, are an example of how this group is taking methods and results that have been studied in the laboratory, and testing how these methods and results generalize to much more realistic contexts. Human and machine vision has mostly been studied with stimuli and images that have limited dynamic range (computer display screens). However, human and machine vision systems could deal with much higher dynamic range (e.g., brightly lit scenes) in the real world. Visual target detection and visual search have been thoroughly characterized for low dynamic range stimuli using well-established visual psychophysics experimental protocols and computational neuroscience theories. The team is in the early stages of extending these empirical and theoretical methods to HDR stimuli. Only recently has this been made possible, because of the availability of HDR displays. The approach is technically strong, the topic is important, and the research is likely to have scientific impact.
Another cluster of projects focuses on shooting performance, a domain with obvious applications. The study of shooting performance as a function of weapon-ammunition configuration is an outstanding example of the group’s applied research. The objective was to measure performance (accuracy) degradation for single shot, controlled pair, and automatic burst sequences, across weapon systems with different recoil energy. The experimental design and implementation and the data analysis methods were technically sound. The results were clear and readily interpretable, offering a new metric for evaluating trade-offs between small arms systems. The project studying the role of human fatigue in the speed, accuracy, and commission of friendly fire tested marksmanship while distinguishing between friendly and nonfriendly targets. This project showed that, while the accuracy of the firing was little affected, mental fatigue induced by previously completing a complex cognitive task significantly increased the number of subsequent friendly fire events: failures either to identify a friendly target as such or to withhold action in this rapid-fire context. This study is a dramatic demonstration of the importance of human state in these situations. Another field demonstration also showed the strong potential role of physical sensor systems that can support and guide marksmanship training.
The group has instrumented an automobile used to study authentic brain and behavior during driver-passenger interactions on a local highway (I-95). The goal of this research is to enable quantitative measurement of human behavioral performance and brain activity while an individual is driving in traffic on the highway. A passenger car was instrumented to record synchronized continuously streaming data about the state of the automobile (e.g., speed, lateral position in lane, and distance behind leading vehicle in the same lane), and the state of the driver (e.g., pulse, respiration, and EEG). Doing so is a technical tour de force and an example of the team’s technical capability for measuring human performance in real-world scenarios. The study is designed to have different types of controlled experimental manipulations (verbal communication between passenger and driver, amplitude-modulated podcast auditory stimulation). These are good examples of the team’s creative approach for introducing experimental manipulations in real-world scenarios. The audio podcast, for example, will enable measuring steady-state auditory evoked potentials, a rigorous and robust approach for analyzing EEG measurements. The data acquired with this platform will offer a test-bed for assessment and for further developing the team’s technical infrastructure for handling big data (database, workflow) and data analysis methods. The approach is technically strong, the interdisciplinary topics being addressed are important, and the research is likely to yield results that will have scientific impact.
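To illustrate why an amplitude-modulated stimulus supports a robust analysis, the sketch below estimates the amplitude of a recorded signal at one known frequency by projecting it onto a sine and cosine at that frequency (a single-bin discrete Fourier transform). The 40 Hz modulation rate, the 1000 Hz sampling rate, and the synthetic signal are assumptions for this example; the source does not state the values used in the study.

```python
import math

def amplitude_at(signal, sample_rate, freq):
    """Estimate the amplitude of the sinusoidal component at freq (Hz)
    by projecting the signal onto cosine and sine at that frequency.
    Exact when the record spans a whole number of cycles of freq."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * k / sample_rate)
             for k, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * k / sample_rate)
             for k, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n

# Synthetic one-second record: a 40 Hz response of amplitude 2 buried
# under a larger 10 Hz rhythm of amplitude 5 (both values hypothetical).
sample_rate = 1000
record = [2 * math.sin(2 * math.pi * 40 * k / sample_rate)
          + 5 * math.sin(2 * math.pi * 10 * k / sample_rate)
          for k in range(sample_rate)]
a40 = amplitude_at(record, sample_rate, 40)
a25 = amplitude_at(record, sample_rate, 25)
```

Because the response is expected at a frequency fixed by the stimulus, the analysis can look at that one frequency and ignore larger activity elsewhere in the spectrum.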
One study, related to others studying the integration of humans in systems, examined situation awareness and preservation of communication ties in disrupted environments. This study investigated how social networks (here team networks) measured by communication acts morph as a result of disruptive events. Collected in the context of a joint (U.S.-U.K.) forces coalition training exercise, the project observed changes in the patterns of e-mail communications in response to certain identified disruptive events, based on 20,000 e-mails from 87 individuals in specific roles. Communication ties reflect the numbers of outgoing and incoming e-mails between individuals when the e-mails may be more costly to generate than to receive. The study found that the active networks responded differently to different kinds of disruptive events. This study will be further developed to codify types of disruptive events and the corresponding predicted changes in communication network dynamics. This information may inform training activities or may even drive on-the-fly allocation of communication bandwidth in the field.
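One simple way to picture how communication ties respond to a disruptive event is to count directed sender-to-recipient e-mails in equal windows before and after the event and compare the counts. The sketch below assumes a hypothetical log of (time, sender, recipient) tuples and invented role names; it is not the study's method or data.

```python
from collections import Counter

def tie_counts(emails, start, end):
    """Count directed sender -> recipient e-mails with start <= time < end."""
    return Counter((sender, recipient)
                   for time, sender, recipient in emails
                   if start <= time < end)

def tie_changes(emails, event_time, window):
    """Change in each directed tie's e-mail count from the window before
    the event to the window of the same length after it."""
    before = tie_counts(emails, event_time - window, event_time)
    after = tie_counts(emails, event_time, event_time + window)
    # A Counter returns 0 for ties absent from one of the windows.
    return {pair: after[pair] - before[pair]
            for pair in set(before) | set(after)}

# Hypothetical log: (time in hours, sender role, recipient role).
emails = [
    (1, "ops", "intel"), (2, "ops", "intel"), (3, "intel", "ops"),
    (6, "ops", "logistics"), (7, "ops", "logistics"), (8, "ops", "intel"),
]
changes = tie_changes(emails, event_time=5, window=5)
```

Counting each direction separately preserves the distinction between outgoing and incoming e-mails noted above.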
Some preliminary studies focus on understanding descriptive verbal communication, with the presumed goal of designing agents that can understand human descriptions and commands.
Challenges and Opportunities
This area poses many challenges in collection, analysis, and interpretation of complex data. It provides many opportunities for important and timely developments in both basic research and field research.
There is a need to enhance the work of this early-stage initiative, but if it matures properly, this could be a program of high potential for a leadership role in the scientific study of real-world behavior in natural environments, especially if leveraged with other laboratory and extramural resources. The ARL researchers have opportunities for leveraging research from broad technological partnerships and unique access to a target population that experiences ranges of human brain states involving fatigue, stress, and task complexity, both nationally and internationally. The RWB projects also have synergistic interactions with issues of human variability and the complex demands of multiagent teaming that are also the target of the Human Sciences Campaign research.
The group has done an excellent job of identifying important technological challenges and opportunities; almost all are high demand and high reward. In most cases, the challenges are also opportunities, if approached well, to take leadership positions. These include the following:
- To continue work on field-inspired issues such as high-luminance range inputs to vision and devices, EEG correlates of visual search in complex environments, and others.
- To further enhance enabling technologies for instrumentation in the field for real-time monitoring, including, for example, size, power consumption, onboard computing, and transmission and storage of data.
- To address the challenge of interpreting the dense and complex data generated by highly instrumented continuous collection. Solid field-normed approaches have been identified based on EEG standards, yet data management, indexing, and interpretation of these data sets pose challenges.
- To further develop theoretically grounded methods for using real-world events to mark event-related or reverse correlation analysis of neural or physiological data. A number of these were already identified—for example, the possible use of intersubject correlation to identify common states, event analysis by expert annotation, or training markers by machine learning in the laboratory and extending the classifiers to data acquired in real-world settings.
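Of the methods named in the last item above, intersubject correlation is the easiest to sketch: if several people exposed to the same real-world events show similar signal time courses, the similarity itself can mark a common state without an explicit event marker. The version below uses the mean pairwise Pearson correlation, which is one common formulation and an assumption here; the source does not say which variant the group uses.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length sequences.
    Raises ZeroDivisionError if either sequence is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def intersubject_correlation(recordings):
    """Mean Pearson correlation over all pairs of subjects' time courses."""
    pairs = list(combinations(recordings, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

# Synthetic time courses: three subjects who covary, and a second group
# in which the third subject moves in the opposite direction.
isc_shared = intersubject_correlation([[1, 2, 3, 4], [2, 4, 6, 8], [1, 2, 3, 4]])
isc_mixed = intersubject_correlation([[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]])
```

In practice the statistic would be computed in sliding windows over time-aligned recordings, so that periods of high shared response can be located in time.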
One major challenge and opportunity will be the development of technologies and algorithms that allow real-time data analysis of the relevant neural or physiological measures to support online state analysis. Another major challenge and opportunity is the identification of efficient or minimal sets, with appropriate redundancy, in the multimodal neural and physiological collection.
Many of the efforts within the projects and topic areas aligned with the objectives of two ARL-defined essential research areas: human agent teaming and accelerated learning for a ready and responsive force. The human variability portion of the Human Sciences Campaign is divided into two main projects: Individual Soldier State Dynamics, which aims to develop a more comprehensive understanding of how the interaction of both trait and state factors accounts for human variability; and Contextual Influences on Soldier Performance, which aims to quantify the influence of social and environmental components on human performance, characterizing how they modulate a soldier’s state dynamics.
Accomplishments and Advancements
Overall, there is a wide variety of research being conducted by a multidisciplinary team of well-trained researchers.
The number of publications and presentations at scientific meetings has increased over the past several years. The quality of the journals and scientific meetings at which the material is being presented is also very good. The efforts that have been made to decrease time for internal review appear to be working well. The new ARL online operations security and public release process begun in June 2017 will hopefully further streamline approval of publications and presentations to lower the barriers to publishing nonclassified findings in the scientific literature.
Since the 2015 review, there has been progress in developing equipment and techniques that will move the research forward. Two such examples are the optical targeting system, which has been developed and used in a study of the impact of fatigue on shooting accuracy, and the EEG technology, including the soft sensors and the EEG phantom.
Sleep is an important contributor to human variability. Individual differences in sleep duration, sleep need, and response to sleep loss have been characterized in numerous research studies, and incorporating such knowledge into ongoing studies of human variability as well as real-world behavior, monitoring sleep and fatigue of participants in all studies, and integrating sleep into the theoretical framework for understanding variability in human performance and behavior at ARL is well-justified and overdue.
There are good collaborations with the broader scientific community under ARL’s Open Campus Initiative. One such example is the connectomics collaboration with the University of Pennsylvania and the Institute for Collaborative Biotechnologies/University of California, Santa Barbara; such studies would not be possible at ARL without outside collaborators.
Challenges and Opportunities
Human variability is a critical area of investigation, because there are differences between individuals in how they respond to a particular challenge depending on a variety of factors, including (but not limited to) environmental factors such as temperature, social factors such as role within a unit, behavioral factors such as relative level of fatigue, and physiological factors such as stress level. While group-based approaches provide a useful starting point, if systems are designed for average or below-average human performance, this leaves a lot of untapped human potential.
Adaptive systems that can be trained to assess the individual’s traits and monitor the individual’s current state and then respond in real time to dynamic changes in state will allow future Army systems to capitalize on human uniqueness and mitigate against both intra- and interindividual variability. Such adaptive approaches can provide capabilities to improve physical, cognitive, and social performance; to decrease time-to-train; and to improve human-network interactions by providing robust predictions of soldier state and intent to integrate within teams and tactical networks. These adaptive and predictive technologies will be critical to the individualization of equipment and training, maximizing and sustaining both soldier and unit peak performance during mission critical tasks.
As noted earlier, the number of well-trained early-career investigators is impressive. However, there is a notable lack of senior working scientists. To draw an analogy with a typical university structure, it is as if there were many postdoctoral researchers, lecturers, and deans, but few or no associate and full professors. This suggests that within the group studying human variability, oversight, coordination, and a view of the “big picture” may be lacking, and the researchers may have limited opportunities to advance their careers within the ARL structure; if they leave to advance their careers elsewhere, their talent and experience may be lost to ARL.
There is a need to connect the science of each project to an overall theoretical foundation. There is supposed to be a balance between field studies, laboratory studies, and the underlying theoretical basis of each. However, the presentations overall did not reflect the theoretical basis for the individual experiments. For example, one of the expressed goals was to develop a laboratory setting where high-resolution, long-term continuous monitoring of individuals can be carried out. While building such a facility and equipping it are interesting and challenging activities, how such a facility and the resulting data fit into the overall goals of the human variability program was not well articulated.
There was discussion of monitoring parameters of ARL personnel working in its laboratory setting as an approach to studying human variability. Related to this large-scale experiment, a plan was described to do similar monitoring of ARL or other personnel at other sites, including potentially at sites of academic partners and other Army research facilities. However, how to extrapolate from civilian scientists as test subjects working 9-to-5 weekday jobs to soldiers making life-or-death decisions in extreme environments (or if it is even possible) could be considered before, not only after, this activity is undertaken. Additionally, ARL lacks experience in carrying out multisite trials, and scientists with such expertise need to be part of any planning and execution of such an ambitious multisite study.
In the current experiments and in the experiments planned for the large-scale study, one challenge will be to identify the type, amount, and duration of data needed to predict the outcomes of interest. It will be easy to collect a lot of data, but if those data are not useful in predicting outcomes of interest, they will add burden to the future soldiers without providing any benefit. A plan needs to be implemented for how key features of data will be selected from the mass amounts of information, with such a plan undergoing regular updates as new equipment, analyses, and information are developed.
It is commendable that ARL recognizes that individual differences in sleep duration, sleep need, and response to sleep loss are important contributors to human variability and represent an opportunity to account for missing sources of individual differences. However, while recognizing that sleep needs to be accounted for, the current approach of ARL and the current group of ARL scientists do not have the expertise to move this forward as quickly as it could be done. It is challenging for scientists to enter a new area of research, and key to doing so in the most effective way is to seek expert advice. However, selecting the experts to provide advice is challenging when one has insufficient knowledge of the area in which one is choosing an advisor. While ARL has a current sleep advisor, that researcher’s focus is on how naps improve memory. This expertise is far too narrow for the broad program of human variability that ARL is currently working on, and ARL requires a wider range of sleep expertise to advise its human variability program.
Related to integrating information about sleep into the ongoing studies to understand how it contributes to variability, biological time-of-day (circadian rhythmicity) needs to be accounted for in all human studies. In particular, the researchers need to seek to understand how the particular types of performance or the physiologic measures in their experiments vary with time of day. Human biology is designed for rest and inactivity at night, yet soldiers need to perform their duties at all times. This has huge impacts on all aspects of performance, yet it is not considered at all in the current ARL experiments. The speed of processing information, visual perception, balance, selective attention, short-term memory, as well as most biological processes, all vary with time of day. Recognizing this is a first step in understanding how it impacts performance, and that in turn can be used to explore how time-of-day impacts can be overcome when operational demands require all-day/all-week activities. It is therefore critically important to recognize that time of day plays a very strong role in both intra- and interindividual differences in performance, and to incorporate this into ongoing studies. This means that it is not sufficient to carry out experiments only during the daytime, and it is critical to understand the individual biological timing of participants in all studies so that time-of-day factors can be accounted for. Outside collaborators who are expert in studying circadian rhythms in performance and behavior could be consulted or brought on as partners to ensure that this source of variability is considered in experimental design and accounted for in analyses.
ARL is carrying out numerous studies of human performance, including, for example, shooting ability, driving, brain function (studied via EEG and magnetic resonance imaging), and decision making. It would be very useful if there were certain sets of information collected from every participant in every study, so that common data collected in different studies can be combined for larger-scale analyses, and so that future data mining activities could be carried out using existing data sets. Baseline questionnaires consisting of key information could be given to all participants in all studies; individual studies may want to add additional questions to meet their particular study goals. The baseline questionnaires could be reviewed from time to time to determine whether additional questions could be added as new information is learned on factors that contribute significantly to variability.
ARL urgently needs to incorporate cutting-edge biomedical research in genetics, epigenetics, the various “-omics” fields (e.g., metabolomics, proteomics), microbiome analysis, and biomarkers into all ongoing experiments. These sciences will very likely provide critically important information about sources of intra- and interhuman variability. The ideal way to pursue this is through a series of collaborations, whereby experts in the various areas provide advice about samples and data collection, ARL scientists collect the data, the collaborator analyzes the samples, and together ARL and the collaborator interpret the findings. It will be critical to select the right collaborators to ensure that this effort is carried out effectively and across the various studies. A scientific advisory board could be established to support this important effort.
Collaborating with academic researchers can extend the range of techniques available to ARL, and expert collaborators can allow ARL researchers to rapidly expand their work into areas in which they are not expert. That said, it is critical when moving into new scientific areas that the right collaborators are selected.
The Humans in Multiagent Systems program focuses on problems related to understanding and supporting distributed soldier collaborations in sociotechnical networks exemplified in three broad areas of inquiry: human-agent teaming, cyber and networked systems, and understanding sociocultural influences.
Accomplishments and Advancements
Throughout the description of the research in this area, there were many examples of good interdisciplinary collaboration to support broad-ranging questions among computer scientists, cognitive scientists, and human factors psychologists and engineers, such as in the work on trust in robotic transportation systems. Another example is in the work on mission command in the age of network-enabled operations, which represents an excellent effort that brings together innovative, multimodal methods and experimentation in the context of a real application domain for understanding situation awareness. This is a good example for other potential large-scale R&D studies that integrate empirical, technological, and theoretical advances. The collection of data to support further investigation into Army-specific communication patterns (e.g., based on rank) shows good foresight.
The Humans in Multiagent Systems team leverages the foundational (6.1) research of its colleagues in the Human Variability and Real-World Behavior programs to inform its applied (6.2) work as well as collaboration with operational warfighters, at Fort Bragg, for example. This serves as a good example of how cooperation across areas can greatly facilitate progress from basic questions to final products.
The identification of the Cyber Human Integrated Modeling and Experimentation Range Army (CHIMERA) lab, jointly developed by ARL’s Human Research and Engineering Directorate and Computational and Information Sciences Directorate (CISD) for technical cyber research, as a target of opportunity for research into the human aspects of cybersecurity represents foresight into outreach and collaboration with other organizations, and it leverages investments made elsewhere.
The researchers demonstrated a clear understanding of the importance of measures of performance and measures of effectiveness, that achievement of the former does not always translate into achievement of the latter, and that this disconnect needs to be addressed in their research. They seek to become leaders in the development of teaming metrics to more precisely gauge the impact of the tools they are introducing into human systems and to differentiate when performance improvements are occurring as a result of teaming changes or other factors (such as the environment or the contributions of exceptional individuals).
Challenges and Opportunities
The Humans in Multiagent Systems program is very broad and ambitious. It can become a mechanism for bringing together a diverse and multidisciplinary group to look at an important set of Army-relevant problems. On the other hand, the topics of human-agent teaming, cybersecurity teams, and sociocultural influences in civil affairs teams are only very loosely connected, and some areas of expertise are lacking. It is not clear whether ARL is gaining significant advantages by the decision to join these areas within one program.
While the program has done a good job of presenting an overall map connecting identified needs with research questions, it remains unclear what conceptual and theoretical map guides researchers’ specific choice of research projects and questions. For example, in looking at intelligent agent-assisted route planning, why were mental models and trust selected as variables of interest? Or in investigating reasoning in civil-military affairs, what is the connection to the existing work on natural language processing, visualization, and intelligent decision making, and how have the researchers decided that their particular variables are appropriate? Given the ultimate goal of improving the lives of soldiers, it is important to pay attention to what controls the most variance in outcomes and to select research questions accordingly.
The ARL has an opportunity to become a leader in the area of human-agent teaming. There are some very strong examples of approaches for evaluating consensus among humans and between humans and autonomous agents, such as the work on mental models in agent-assisted route planning and the use of eye tracking in evaluating input from an autonomous vehicle. Autonomous systems researchers expressed interest in adapting autonomous system reasoning to mission criteria, potentially resulting in more effective autonomous systems. However, guidance on variable selection and determining where variance in human-agent teaming really resides are important considerations that seem to be lacking. For example, there seems to be a large focus on trust, but no sense of why this plays such a large role across studies to the exclusion of other variables.
In addition, the broader area of artificial intelligence differentiates between technologies for production and those for coordination; ARL researchers do not seem to be considering this distinction in thinking about what technologies to explore and the implications of this distinction for this work. ARL could also try to think even bigger and more creatively about ways that autonomous agents might fundamentally transform the way the Army and civilians do their work in the future. It is unclear how much the research work is guided by these bigger, forward-looking ideas versus driven by current concerns of soldiers and making incremental improvement to current approaches to carrying out missions.
The inclusion of sociocultural influences and civil affairs as a research area in the general ARL portfolio represents a good opportunity for creating an end-to-end program. Creating tools to support the gathering, processing, and visualization of data collected by Army civil affairs specialists (and facilitating stakeholder decision making) appears an important and relevant application domain that can benefit from new advances in data sciences and machine learning to create scale-up and impact. There is a great opportunity for creating an end-to-end R&D program that can leverage the multidisciplinary expertise and approaches while positively impacting the interface between civil affairs and operational commands. This effort can benefit by laying out some of the high-level open gaps and needs. Finding a way to build on what is known already about civil affairs challenges to advance knowledge seems to be in a fairly elementary phase at this point. Similarly, the qualitative research being conducted seems to be fairly rudimentary compared to standards of similar research conducted in academia, and if this methodology will be central to the efforts in this area, additional depth may be required.
A significant challenge for all of the research is adapting to the specific demands of operating in the Army environment. The methods for investigating behaviors and tools similar to those used in civilian contexts, such as e-mail and electronic chat, need to take into account that the relationships between individuals vary considerably, constrained by rank, unit, and role. These relationships almost certainly impact communication patterns and networks to make them behave differently from those reported in extant literature on social networks. The data collected on network-enabled operations relating Internet protocol (IP) address to individuals offers the opportunity for significant research into Army-specific patterns and networks. There is also an opportunity to explore how soldiers work outside the constraints when those constraints impede critical mission effectiveness. In areas where ARL research is similar to commercial research, it is critical that the ARL work leverage existing work as much as possible, and that it then focuses ARL research to address Army-specific situations and needs. For example, robotic transportation and autonomous driving is an area that is robustly funded and researched commercially. ARL would benefit from focus on areas that are not of commercial interest—autonomous driving over rugged terrain, for example.
A related challenge is translating technology into the military environment. Soldiers are typically not early adopters of new technologies; they tend to reject new technologies that do not immediately demonstrate significant improvement over existing capabilities. Converting autonomous systems from tools to teammates is challenged by potential soldier expectations that autonomous systems will adapt to the individual soldier rather than the soldier adapting to the system. It is not clear what assumptions guide the development of technologies at ARL, and whether thinking about technology adapting to soldiers might enhance the design of the research to raise the chances that the ultimate products created will be accepted by the target population.
There is a great opportunity for researchers at ARL to establish themselves as thought leaders and to create impact by sharing unique data and tools, along with papers. Creating a repository online to make papers and tools publicly available would facilitate this process.
Over the past year, the small Human Cyber Performance group has established a bold mission: advance a foundational science of cybersecurity that addresses the human dynamics of attacker, defender, and user interactions in Army networks to support training effectiveness and transition of agent-based technology to improve the operational efficiency and effectiveness of cyber warfighters. The group’s primary focus is on the cyber analyst, and it wants to become a leader in the cybersecurity community by advancing scientific understanding and improving human performance of human cyber analysts in their critical role in defending real-world Army networks and systems.
Accomplishments and Advancements
To that end, the group has built the CHIMERA lab. This lab is in effect a human cyber range, and its primary purpose is to enable the conduct of extensive studies on the cyber analyst in a simulated cyber environment that will allow the researchers to understand cyber analyst performance in a range of simulated situations. The lab consists of multiple chambers with displays that are attached to a simulated network environment that allows the researchers to configure and reconfigure networks of their choosing. Additionally, they are building a capability to provide a wide range of information to the analysts regarding the real-time status of the network that would resemble the sorts of output they would receive from standard network monitoring tools. The goal of the lab is to be able to assess the performance of cyber analysts in a way that has previously not been available to the cybersecurity community. At the time of the review, the lab was still under construction and not yet operational, so there were no results to report from it.
Two recent empirical studies were performed prior to the creation of the CHIMERA lab. The first was a study designed to understand the extent to which past incidents, reported by cyber analysts, had the ability to predict future cyber incidents. Using a Bayesian forecasting model, they showed that analyst-reported incidents from week n had the ability to predict with high confidence the rate of incidents in week n + 1. The model was not as effective at other time intervals, and it was not intended to predict type of incidents, only their rate of occurrence. The second study looked at the performance of cyber teams (rather than a single cyber defender) as the team undertook the task of defending a simulated network. The researchers worked with teams of student cyber defenders who were participating in the Mid-Atlantic Collegiate Cyber Defense Competition. Using a combination of questionnaires and wearable social sensors, the teams were scored on three measures: services availability, incident response, and scenario injects. The researchers showed that team leadership and team interaction were predictors of performance on the scored measures, although performance varied depending on the measure that was scored, suggesting that team leadership and composition can produce better or worse performance depending on the task that is to be performed.
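The first study's model is not specified beyond being a Bayesian forecasting model, so the following is only an illustrative sketch of how a one-week-ahead forecast of an incident rate could work. It assumes a conjugate Gamma-Poisson model with a forgetting factor, which is a stand-in and not the ARL researchers' method. The class name, the prior values, and the discount parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GammaPoissonForecaster:
    """Illustrative Gamma-Poisson model of a weekly incident rate.

    Assumption: weekly counts are Poisson with an unknown rate that has a
    Gamma(shape, rate) prior. This is NOT the model used in the ARL study.
    """
    shape: float = 1.0     # hypothetical prior shape
    rate: float = 1.0      # hypothetical prior rate
    discount: float = 0.5  # forgetting factor in (0, 1]; 1.0 keeps all history

    def update(self, count: int) -> None:
        # Down-weight older evidence, then apply the conjugate update
        # for one new week of observed incidents.
        self.shape = self.discount * self.shape + count
        self.rate = self.discount * self.rate + 1.0

    def predict_mean(self) -> float:
        # Posterior mean of the rate, i.e. the expected count next week.
        return self.shape / self.rate

    def predict_variance(self) -> float:
        # Variance of the negative-binomial predictive distribution.
        return self.predict_mean() * (1.0 + 1.0 / self.rate)


if __name__ == "__main__":
    forecaster = GammaPoissonForecaster()
    for weekly_count in [4, 6, 5, 9]:  # made-up counts for illustration
        forecaster.update(weekly_count)
    print(forecaster.predict_mean(), forecaster.predict_variance())
```

With a discount below 1.0, the most recent week dominates the forecast, which is one way a model might predict week n + 1 well from week n while doing less well at longer intervals. Like the study's model, this sketch forecasts only the rate of incidents, not their type.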
Challenges and Opportunities
External to ARL, there has been growing recognition of the importance of human factors and usability research in the security area. Interdisciplinary “usable security” researchers work on a wide range of problems, including understanding the security behaviors of end users, improving the usability of authentication and access control tools, helping lay people keep their home computers secure, and improving the usability of security tools for system administrators and cybersecurity analysts. There is now a well-established annual conference in usable security (Symposium on Usable Privacy and Security) and numerous workshops in the area. In addition, the top conferences in security and in human-computer interaction regularly accept strong usable security research papers. A number of academic departments around the world are offering courses in this area, and there are now university and corporate research laboratories dedicated to usable security.
The nascent ARL cybersecurity research effort in the Human Sciences Campaign shows great promise, but it suffers from not having a critical mass of researchers dedicated to this effort. The group has done a lot with the resources it has and has developed successful collaborations with cybersecurity researchers in other parts of ARL and externally, but its current staffing level limits both the breadth and depth of its research.
The ARL cybersecurity effort in the Human Sciences Campaign focuses primarily on studying and improving the performance of teams of cybersecurity analysts. With the recent construction of the CHIMERA lab, ARL now has an opportunity to go deeper into this space. The CHIMERA lab is a unique research facility, and ARL has more ready access to trained cyber analysts than typical academic laboratories would have. The currently articulated research goals for the CHIMERA lab focus on developing measurement techniques for assessing the performance of cyber analyst teams and the technologies they use. The current research is largely observational and exploratory. However, plans are in place for more quantitative experimental studies. Longer term goals include using data collected from these studies to inform the design of decision support tools for analysts.
A more ambitious research agenda might be framed around using cyber analysts to improve network protection. This might include the examination of the performance of individual cyber analysts and teams of cyber analysts to determine what behaviors lead to the best network protection outcomes and what human tasks could benefit from automation (using a computer to complete rote tasks without the need for human intervention) and from decision support (using a computer to assist human tasks and decision making). This research may begin as exploratory observational work that leads to hypothesis formation. The CHIMERA lab could then be used to stage controlled experiments to test these hypotheses. Additional research might examine various approaches to individual and team cyber analyst training to determine what training approaches lead to the best outcomes. As new tools for cyber analysts are developed and the Army considers adopting commercial tools, research could investigate the effectiveness of these tools in improving network protection outcomes.
There are also significant opportunities to conduct cybersecurity research that goes beyond the study of cybersecurity analysts and could establish ARL as the preeminent research lab for behavioral cybersecurity. Current staffing levels limit ARL’s ability to broaden efforts in this area, but additional staff would allow for a broader research agenda. There are many open research questions in this area that may be significant to the Army. For example, there are open research questions around how to securely authenticate personnel and grant access to physical and electronic resources in a variety of real-world field situations while minimizing disruption of the current task. In addition, a systematic analysis of security breaches impacting the Army could reveal underlying human-related security problems that would benefit from a concerted research effort.
Meeting these opportunities requires increased staffing levels and an interdisciplinary team that includes cognitive psychologists, human-computer interaction researchers, and security researchers. In addition, the team would benefit from expertise in machine learning to aid in development of human behavior prediction models and decision support tools. The team would benefit from interacting with external researchers at the top usable security, security, and human-computer interaction (HCI) conferences (e.g., SOUPS, CHI, CSCW, CCS, IEEE S&P, and USENIX Security) and from aiming to publish its research at these conferences. These conferences are useful to attend even when the team does not have papers to present. The team would also benefit by interacting with other government agencies doing research in this area, including the National Security Agency (NSA) Science of Security effort.
This initiative is an ambitious and promising entry into the challenging field of the measurement and analysis of real-world behavior. Overall, the technical quality of the work is high. In particular, the group has worked to identify technical and theoretical gaps and to align resources to solve specific needs. The technical quality of enabling technology and instrumentation was especially high. In general, the group uses strong experimental techniques and appropriate modeling approaches. There has been a continued improvement in research products, including published papers, chapters, technical reports, and conference papers. Still, as new research areas are broached, the work would benefit by consultation with appropriate experts.
The analytical abilities and techniques of the Human Variability program in general are strong. The EEG-related technical expertise is excellent. The source localization methods being developed are interesting and a good approach to go beyond simple subtractive methods of analysis.
The Human Variability projects have made progress since the last review by continuing to publish findings in the scientific literature and present findings at conferences.
Humans in Multiagent Systems
Overall, the technical quality of the work is good, and methodologies that are used to explore the research questions are appropriate.
One area that lacks depth is qualitative research. The work on sociocultural influences in particular seems heavily dependent upon the use of qualitative methods, but the methods used for conceptualizing research questions and analyzing and presenting data need some additional expertise to be brought up to the standards of similar academic research. If this will be an important part of sociocultural work on civil affairs, then more depth will be needed here.
Another area where more depth is needed is teams research. In some areas, particularly the work on teams in cybersecurity, the lack of deep expertise on current literature is leading to slow progress in the research. The opinion expressed by some ARL researchers that nothing is known about cybersecurity teams reveals a lack of familiarity with other teamwork research. Abstracting the issues of cybersecurity teams to think in terms of complex, fast-paced decision making in the face of adversarial pressure would reveal some relevant and useful literature to build upon. The research team appears to be starting from scratch in developing its own observations of how teams work together on problems, and then developing its own frameworks to make sense of observations.
A related area that could benefit from more expertise is in multilevel theory and analysis. Ultimately, to translate the findings across campaigns into actionable conclusions will require integrating findings from individual-level research into teams and higher levels of analysis. Much of the research presented was at the individual level of analysis; the few examples of teams research made no use of information on individual differences, which would undoubtedly affect how teams operate. Simply consulting with researchers with experience in integrating across levels of analysis is not adequate, because this perspective is likely to change how the individual level research is designed. Better integration across levels of analysis will be needed.
Human Cyber Performance
Because this line of research is in its infancy, it is very difficult to assess the technical quality of the work at this time. With that said, it was clear that the researchers are beginning to come up to speed in cybersecurity as they continue to collaborate with their technical counterparts who have a deeper understanding of the technical components of the cyber mission. This collaboration will be essential as the team advances its vision to develop a human science of cybersecurity.
General Conclusions and Recommendations
There is a need across the Human Sciences Campaign to link the science of each project to an overall theoretical foundation. There needs to be a balance and an identifiable interaction between field studies, laboratory studies, and the underlying theoretical basis of each. This balance and interaction would be facilitated by clarifying linkages within projects in a given project area, across projects in the campaign, and with projects in other ARL campaigns.
Recommendation: The Human Sciences Campaign should develop a clearly articulated plan to achieve a balance and an identifiable interaction among field studies, laboratory studies, and the underlying theoretical basis of its experimental and field studies. The plan should include a strategy for linking the theoretical underpinnings of projects in a given project area, across projects in the campaign, and with projects in other ARL campaigns, and should identify the mechanisms for monitoring and guiding formulation and implementation of such linkages, including, where appropriate, the participation of external advisors.
In a number of important areas across the programs, the in-house expertise is weak. One promising approach to achieving the necessary levels of expertise and to achieving an effective balance and interaction between experimentation, field studies, and their theoretical underpinnings would be the establishment of external advisory committees for each project area, as needed, and, perhaps, across the Human Sciences Campaign. Other means could include attendance at relevant workshops and conferences, mentoring by and exchange visits with outside experts, participation in consortia, and recruiting senior scientists.
Recommendation: The Human Sciences Campaign should consider establishing external advisory committees for each project area, as needed, and, perhaps, across the Human Sciences Campaign. Other means should include attendance at relevant workshops and conferences, mentoring by and exchange visits with outside experts, participation in consortia, and recruiting senior scientists.
There is not a formal program of mentorship and career development within the Human Sciences Campaign at ARL, and this represents a missed area of opportunity. In particular, new and junior researchers at ARL need regular career guidance and mentorship from more senior and experienced ARL scientists and administrators through a formalized program. A program for identifying mentor(s) and establishing, reviewing, and updating a career development plan would help to develop the research workforce and ensure that staff members reach their full potential. A mentoring program will not only benefit the individuals in question, but it will also benefit ARL.
Recommendation: The Human Sciences Campaign should establish a mentorship program for early-career staff that includes guidance about how to establish a research niche within ARL, how to choose internal and external collaborators, how to navigate the ARL bureaucracy, when and how to seek additional training, and other career development activities.
ARL represents a unique setting for researchers. It is well funded, offers exceptional facilities, carries out interesting and challenging projects, and has numerous partners in academia, in industry, and within the military research arena. These features can make ARL an attractive place for researchers to work, especially those who are starting their careers. This could make recruiting new expertise easy, but only if the wider research community is aware of ARL and the opportunities to work there.
Recommendation: In addition to its laudable Open Campus Initiative, ARL should consider other outreach efforts—especially those targeted at scientists in new areas that ARL seeks to enter—that may include activities such as booths in exhibit halls at scientific meetings and circulating specific job or grant opportunities to universities and industry with large programs in the areas of interest.
The work in the Real-World Behavior program seems narrowly focused, currently, on the behaviors inherent in driving and marksmanship.
Recommendation: To enhance the generality and richness of its research approaches, the group should perform an analysis of the range of Army-relevant real-world behaviors and expand its focus to include study of behaviors additional to those associated with driving and marksmanship.
Given the centrality of the role of sleep, fatigue, and stress as identified issues in the target real-world applications and in the military, it would be helpful to find mechanisms such as workshops, scientific mentoring by outside experts, or exchange visits with outside experts and their laboratories to provide training opportunities and consulting in these areas, where in-house expertise is not strong. Alternatively, recruiting one or more senior scientists in these areas could be considered if the centrality of the topic warrants.
Recommendation: The Real-World Behavior group should consider improving its expertise in the areas where in-house expertise is not strong—such as sleep, fatigue, and stress—by such means as attendance at relevant workshops and conferences, mentoring by and exchange visits with outside experts, participation in consortia, and recruiting senior scientists in these areas.
Recommendation: The Real-World Behavior group should consider developing an advisory board of scientific experts to provide advice on topics relevant to the major initiatives.
To be successful, the Real-World Behavior large experiment projects will require resources and expertise in the areas of big data, data management, workflow, machine learning, and human-computer interaction.
Recommendation: The Real-World Behavior group should consider reaching out to information sciences or other computational resources, perhaps within ARL, to look for opportunities for collaboration in the areas of big data, data management, workflow, machine learning, and human-computer interaction.
The Refresh program has brought a competitive scientific agenda to new projects and provided leadership opportunities for midlevel scientists.
Recommendation: The Real-World Behavior group should consider continuing the Refresh program.
The Human Variability group lacks a framework for entering new research areas or integrating new aspects of research into experiments. When doing so, it is important to know what you don’t know, have a plan for seeking expert advice, and select the right experts. This seems to be lacking, and it can lead ARL to attempt to reinvent the wheel rather than integrating the current science into its ongoing programs. This theme permeates the following conclusions and recommendations.
It is important to provide opportunities for the researchers to attend scientific meetings and conferences in scientific disciplines related to new scientific initiatives (e.g., sleep, circadian rhythms, and genetics). Attending such meetings and conferences will allow the researchers to understand the state of the science in disciplines outside their areas of expertise, and to network with academic and industry scientists who are active in these areas. This would also provide ARL with greater visibility within the scientific community.
Recommendation: The Human Variability group should expand opportunities for the researchers to attend scientific meetings and conferences in scientific disciplines related to new scientific initiatives (e.g., sleep, circadian rhythms, and genetics).
There is a strong group of well-trained early-career investigators working on human variability questions. However, there is a lack of senior scientists working on this effort, and perhaps for this reason it was difficult to get an overall perspective of how each experiment fits together.
Recommendation: The Human Variability group should seek ways to bring in senior scientific expertise to coordinate the efforts of the human variability experiments, ensure that duplication is minimized and overlap is utilized effectively, and mentor the junior scientists working on human variability and related sciences.
Three scientific areas that are sources of important intra- and interindividual variability are sleep, circadian rhythmicity, and genetics/genomics.
Recommendation: The scientific areas of sleep, circadian rhythmicity, and genetics/genomics should be integrated into all ongoing human variability experiments.
Sleep has a pervasive impact on human variability, but the ARL efforts in the area of sleep are suboptimal. A scientific advisory board comprised of several sleep experts with differing areas of expertise (e.g., collection of sleep data from field settings, impacts of sleep on cognitive performance, impacts of sleep on physical performance, sleep and shift work/long work hours) could assist in integrating the measurement and study of sleep in all their studies related to human performance. The group would benefit by bringing in a senior sleep expert as a visiting scientist/senior fellow to provide sleep expertise to the ongoing projects (even those that are not yet focusing on sleep) and by collaborating with a sleep research group that has a broad range of expertise. The Human Variability group could consider having some of its existing personnel train off-site in leading sleep research laboratories to gain knowledge in sleep. The group could also consider hiring at least one new postdoctoral fellow with training in sleep and performance.
Recommendation: The Human Variability group should seek appropriate sleep expertise in the form of a scientific advisory board, hiring one or more sleep-performance experts or collaborating with a multidisciplinary sleep research group with particular expertise on sleep recording in field studies, sleep and performance, and individual differences in sleep duration, need, and response to sleep loss.
Recommendation: The Human Variability group should consider having some of its existing personnel train off-site in leading sleep research laboratories to gain knowledge in sleep, and the group should also consider hiring at least one new postdoctoral fellow with training in sleep and performance.
Circadian rhythmicity (biological time-of-day) is an important source of human variability.
Recommendation: The Human Variability group should incorporate circadian rhythmicity into all ongoing experiments and should seek to collaborate with experts in the field of human circadian rhythmicity.
The cutting-edge biomedical research areas of genetics, epigenetics, “-omics” (metabolomics, proteomics), and biomarkers could provide critically important information about sources of intra- and interhuman variability, but they are not a focus of any ARL experimentation. It is important that these cutting-edge areas of biomedical research be integrated into ongoing and new studies as soon as possible by establishing the right series of collaborations. Senior-level scientific oversight of this effort is needed to ensure that this is done effectively and across the various studies. This may require a scientific advisory board or boards.
Recommendation: The cutting-edge biomedical research areas of genetics, epigenetics, “-omics” (metabolomics, proteomics), and biomarkers should be integrated into ongoing and new studies as soon as possible, and this should be accomplished by achieving senior-level scientific oversight, perhaps with the support of scientific advisory boards.
Humans in Multiagent Systems
The Humans in Multiagent Systems group needs additional expertise in teams research theory and methodology. The group also needs additional expertise in qualitative research in order to raise the level of rigor with which those methods are employed in the sociocultural differences area.
Recommendation: The Humans in Multiagent Systems group should hire another researcher with deep expertise in teams research theory and methodology and should add expertise in qualitative methods.
The publications of the Humans in Multiagent Systems group were not evident during the review. The publications would demonstrate research into prior work.
Recommendation: In future reviews, the Humans in Multiagent Systems group should provide more examples of publications and should provide impact factors of those publications.
Because of the need for research to be Army relevant, project presentations need to identify the study populations used so that reviewers can evaluate the degree to which findings will generalize to an Army population.
Recommendation: In future reviews, the Humans in Multiagent Systems group should identify the study populations used.
Keeping human-agent teaming, cybersecurity, and sociocultural difference research projects in the same program may be creating areas where necessary depth is lacking in favor of generalists who are working across areas.
Recommendation: The Humans in Multiagent Systems group should, in collaboration with the Human Sciences Campaign leadership, examine whether keeping human-agent teaming, cybersecurity, and sociocultural difference research projects in the same program is facilitating progress in each of these important areas.
Ensuring that an appropriate proportion of research projects are oriented toward “outside the box” ideas helps to push the envelope of how Army operations could be carried out from a human and technology perspective. This could both increase the creativity and impact of the work as well as enable the group to become more of a force for change within the broader Army organization.
Recommendation: The Humans in Multiagent Systems group should examine its portfolio of projects and assess whether an appropriate proportion of research projects are oriented toward “outside the box” ideas.
The researchers did not clearly identify the conceptual and theoretical reasons for the study designs they are pursuing, nor the ultimate Army-related need they hope to satisfy; these clarifications are necessary to make clear that they are pursuing research on variables that will have high impact.
Recommendation: The Humans in Multiagent Systems group should clearly identify the conceptual and theoretical reasons for the study designs it is pursuing and the ultimate Army-related need it hopes to satisfy.
Enhancing its online presence in providing data, tools, and findings to other researchers would enhance the Humans in Multiagent Systems group’s scientific leadership in the community.
Recommendation: The Humans in Multiagent Systems group should enhance its online presence in providing data, tools, and findings to other researchers.
Human Cyber Performance
The breadth and depth of expertise in the Human Cyber Performance group are not sufficient to achieve its stated objectives.
Recommendation: The Human Cyber Performance group should increase the number of staff working in the area of behavioral cybersecurity research, adding both junior and senior team members with interdisciplinary expertise in cognitive psychology, human-computer interaction, cybersecurity, and machine learning.
ARL behavioral cybersecurity researchers need to interact with external researchers at the top usable security, security, and human-computer interaction (HCI) conferences (e.g., SOUPS, CHI, CSCW, CCS, IEEE S&P, and USENIX Security), and need to aim to publish their research at these conferences. These conferences are useful to attend even when the team does not have papers to present. The team would also benefit from interacting with other government agencies doing research in this area, including the National Security Agency (NSA) Science of Security effort.
Recommendation: The Human Cyber Performance group should interact with external researchers at the top usable security, security, and HCI conferences (e.g., SOUPS, CHI, CSCW, CCS, IEEE S&P, and USENIX Security) and should aim to publish its research at these conferences.
Recommendation: The Human Cyber Performance group should interact with other government agencies doing research in this area, including the National Security Agency’s Science of Security effort.
The Human Cyber Performance group does not evince clearly articulated research goals in the behavioral cybersecurity area.
Recommendation: The Human Cyber Performance group should better articulate the goals of its research efforts in the behavioral cybersecurity area, focusing on improving security outcomes.
The Human Cyber Performance group research is largely observational and descriptive; it is not clearly aimed at providing explanatory and predictive power.
Recommendation: The Human Cyber Performance group behavioral cybersecurity researchers should think through how to expand their research to go from offering observations and descriptions to providing explanatory and predictive power.
The breadth of the research of the Human Cyber Performance group is limited and does not reflect a clear definition of its research niche vis-à-vis that pursued by academic and university researchers.
Recommendation: The Human Cyber Performance group should expand the breadth of its research in the behavioral cybersecurity area, emphasizing areas of strategic interest to the Army that are not receiving much attention from academic and university researchers.
The Human Cyber Performance group would benefit significantly from forming a multidisciplinary external scientific advisory board that can provide advice and counsel as it starts and continues to advance this new, vitally important scientific endeavor.
Recommendation: The Human Cyber Performance group should consider forming a multidisciplinary external scientific advisory board that can provide advice and counsel.