4
Visual and Psychomotor Factors in Display Design

In a working environment that is arguably the most dangerous of any current profession, image intensification and thermal imaging have extended the normal perceptual capabilities of the soldier and allowed vision to operate in conditions in which the unaided eye would be ineffective. The proposed helmet-mounted display is designed to allow various information sources to be displayed in front of one eye on a single screen, thereby reducing the time required to switch from one source to another.

A great deal is known about the human visual system and its strengths and limitations in a variety of conditions. The scientific evidence regarding visual and psychomotor factors is among the most critical the panel has assessed. In this chapter, we identify several human factors issues that should be carefully considered by the system's designers.

In our examination of visual and psychomotor attributes of helmet-mounted displays, we begin with an overview of the proposed hardware for the Land Warrior helmet-mounted display and a discussion of its intended uses in enhancing soldiers' awareness of their environment. We follow this with the advantages and disadvantages of such displays for the infantry soldier. Of particular concern is that the display may degrade or even block out information about the local environment that is normally available through the unaided eye; it may, because of its weight, reduce mobility; and its use may result in spatial disorientation and dizziness. Next we describe the research base on a series of visual factors to be considered in designing and assessing display devices. These factors include: field of view and resolution, binocular versus monocular viewing, visual perception of the world and pictures, and depth cues. The discussion of depth cues



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.



presents a tentative framework for a program of testing, evaluating, and improving military visual displays. We then discuss the value of training in overcoming visual and perceptual distortions. The final section presents our conclusions and design guidelines.

INTRODUCTION

Hardware Configuration

The display in the Land Warrior System is initially to be an opaque screen, with a 40 degree field of view displayed on a monochrome 640 x 480 active matrix electroluminescent (AMEL) display, positioned about 1 inch from the wearer's eye. The display is monocular, leaving one eye available to view the ambient environment. The optics and display are to be suspended from the helmet, with the image intensifier located either on the helmet or in line with the optics.1

1 The influence of bandwidth constraints on image quality and refresh rate must be accounted for in the proposed wireless transmission system.

Several human factors issues need to be considered in evaluating this design. First, there are ergonomic issues related to placing additional weight on the helmet and ensuring that the display is stable with respect to the head; we discuss these issues in detail in Appendix A. In addition, the monocular display, the limited resolution coupled with field of view, and the off-axis location of sensors have important implications for perceptual and perceptual-motor performance.

Functions of the Helmet-Mounted Display

In the Land Warrior System, helmet-mounted displays are to serve several functions, the most important of which is to display the output of devices designed to enhance soldiers' perception of their environment. These include the night vision system and the thermal weapon sight. The night vision system amplifies ambient illumination and allows soldiers to see night environments that would be essentially invisible to the unaided eye. The thermal weapon sight uses the heat differences between objects and their backgrounds to produce a thermal image of the environment. This image can be useful at night as well as during the day, when smoke and other obscurants can make targets difficult to see with the unaided eye. In addition, the device can display messages regarding danger and troop movements, as well as information useful for navigation, such as maps and location as determined by the global positioning system (GPS).

It should be noted that all of the information listed above can be displayed on devices other than a helmet-mounted display. For example, night vision goggles, thermal sights, and GPS are currently used to good effect by the Army. The potential advantage of the helmet-mounted display is to integrate this information on one display, facilitating rapid switching between various sources of information as circumstances demand. For example, superimposing symbology on the night vision display would allow users to switch back and forth between these two information sources without making head movements, large eye movements, or changes in accommodation. Similar advantages apply to a case in which the soldier must rapidly switch from using the night vision system for movement across a terrain to acquiring a target with the thermal weapon sight.

The use of helmet-mounted and head-up displays in aircraft provides some insights into the potential advantages and disadvantages of this display technology. The Army has pioneered the application of helmet-mounted displays in aviation, with various night vision devices and sensors feeding into the Integrated Helmet and Display Sighting System (IHADSS). It has developed an extensive body of research data on visual performance with sensor displays (Foyle and Kaiser, 1991; Bennett et al., 1988; O'Donnell et al., 1988), human factors and safety problems (Brickner, 1989; Hart and Brickner, 1987; Rush et al., 1990), and field experience with visual illusions (Crowley, 1991). These analyses demonstrate both the great potential and the risks of using helmet-mounted displays. For example, Wickens and Long (1994) have shown that head-up displays do provide an advantage to pilots in staying on course and in instrument landings. However, they have also shown that pilots using a head-up display are more likely to miss occasional, low-probability events, such as an aircraft moving onto the runway during an approach for landing. This may be due partly to the cluttering effect of the symbology's being superimposed on the image of the outside world, as well as attentional conflict between the near and far information domains (Fischer et al., 1980; Hoffman and Mueller, 1994; McAnn et al., 1992; Neisser and Becklin, 1975; Wickens and Long, 1994; Wickens et al., 1993).

The use of helmet-mounted displays by the infantry soldier, however, poses its own particular set of constraints that may be different from those encountered in the cockpit. Because the soldier is mobile, the issue of providing a stable base for the display becomes even more important than it is in the cockpit, making helmet fit and weight critical issues (see Appendix B). In addition, part of the advantage of head-up displays in the aircraft is due to symbology that can be made conformal with various aspects of the scene (Weintraub and Ensing, 1992). A symbolic runway with associated symbology can be superimposed on an actual runway scene, which helps to integrate the two sources of information and reduce attentional interference (Wickens and Andre, 1990). It is difficult to see how this sort of conformal mapping between symbology and scene features could be achieved in the infantry environment.

It is therefore important to analyze the use of helmet-mounted displays within the context of the physical and task environments in which the infantry soldier operates. For example, the Land Warrior System, with its associated soldier radio,
may allow a squad to operate in a more dispersed fashion that reduces its vulnerability. However, night use of a monocular display may at times reduce the ability to detect a camouflaged ambush site because of a loss of certain depth cues, such as stereopsis (especially if small head and trunk movements are not made to compensate through parallax for the loss of stereopsis). This problem of target detection is in a sense amplified by the greater speed of movement afforded by the Land Warrior System. Thus, overreliance on the visual system and speed of advancement through the terrain could reduce squad attentiveness to other cues. If an ambush occurs, the squad's speed of execution of the counterambush drill may be reduced because of both the time needed to orient to the enemy and the narrow field of view.

As to the presentation of symbolic data in the Land Warrior System, for some tasks it may be better to place the helmet-mounted display screen off the visual axis or to use a hand-held device. Either option would require the wearer to shift gaze in order to access the information on the screen, but it might more than balance this cost by reducing clutter. It is important to determine when it is advantageous to present information superimposed on the scene image and when it may be better to provide other displays.

Advantages and Disadvantages of Display Devices

The introduction of devices that provide remote or local information in the form of enhanced sensory or symbolic displays may in the proper circumstances contribute greatly to the safety and effectiveness of the infantry soldier. However, the specific means proposed in each case may interfere with the acquisition or use of sensory information, depending on the circumstances, and all such devices are associated with certain general problems. We introduce such problems briefly here and discuss them in more detail in the next section.

First, as we discuss in this section, helmet-mounted displays may degrade or even nullify information about the nearby environment that is normally available through the unaided senses; see the report on the Soldier Integrated Protective Ensemble (SIPE) (U.S. Department of the Army, 1993). They may distract attention in ways that may have a critical effect on some tasks, interfering with the user's situation awareness. Even design factors that may be unimportant under less demanding conditions may seriously contribute to a soldier's workload under combat conditions. For example, the SIPE squad and team leaders reported to our panel a situation in which they were unable to see an ambush target even though the target presented itself on multiple occasions. The squad positioned itself farther from the kill zone (concentrated area of fire) because they felt secure in their ability to observe the site. One possible explanation of why the squad was unable to detect the target is that their attention was distracted: they reported diligently observing the kill zone, which meant that they were focused on an area. If the target passed
outside the area of narrowed attention, they may never have noticed it, even though it was in their field of view.

Second, through excessive fragility, bulk, and weight, the equipment by which remote or sensor information is displayed may seriously reduce the mobility of the dismounted infantry soldier; it may also add fatigue and heat stress (U.S. Department of the Army, 1993). In aviation helmet-mounted display equipment, the heavy and off-center optics increase fatigue and headaches, and the close-fitting helmet liners used to hold the optics in position may also increase heat stress. In the infantry, these physical problems are an even greater cause for concern: greater fatigue can be expected because active infantry soldiers do not receive support from a seat, and the equipment may interfere with their ability to move and to take cover rapidly.

The physical effects of the equipment may have perceptual and cognitive consequences as well. Spatial disorientation can be expected; with no external support, such as a seat, a soldier is not provided with tactile information about bodily orientation to help counteract any disequilibrium due to the helmet-mounted display. The weight, weight distribution, and configuration of some displays interfere with the free head movements that a soldier would otherwise rely on to obtain the visual information that is intimately tied to normal action and locomotion in the environment. For this reason, the equipment-based deficits imposed on the infantry would seem to be considerably more serious than those imposed in aviation.

These two sets of issues, the sensory/perceptual and the ergonomic, not only are problems in themselves, but also may interact in counterproductive ways. They must be kept in mind by both the equipment designers and the users. Cost-benefit analyses after appropriate testing (tests whose results apply to the situations in which the equipment is to be used) should precede commitment to any mode of proposed enhancements and to the means by which they are achieved. Table 4-1 summarizes the major benefits and costs of key factors of helmet-mounted displays as well as the key research and testing issues.

Before examining visual factors in detail, it is useful to compare the potential side effects of the proposed helmet-mounted display with those found in others currently being developed. Much of the recent work on the effects of helmet-mounted displays has focused on their use in creating virtual environments (VE). In VE applications, the user is immersed in a synthetic environment that differs from the real-world environment. Experiences in VE involve remote synthetic images of scenes, auditory displays, and apparent head and body motion. The current state of the art in VE technology permits display of relatively sparse image geometry (supplemented by "wallpaper texture"), updated at low rates (usually less than 30 Hz), and displayed more often than not in a biocular format. Current head and body tracking systems, which are required to synchronize the displayed scene with user movements, have hysteresis problems, are slow, and are inaccurate at the limits of the operating envelope. The result is often a low-

TABLE 4-1 Display System Features, Human Performance Considerations, and Research Issues

Control or display device: Head/helmet-mounted display (general issues)
Benefits: Always available; does not have to be held in the hand or manipulated; can easily be aligned on target or terrain feature; wide field of view; can be used to guide movement; added information improves situation awareness of medium to long-range environment.
Costs: Added weight on head; off-center CG; more complex and fragile than hand-held display; precision/alignment requirements more severe; wide field of view results in inadequate resolution; display information content may overload or distract user, reducing situation awareness.
Visual design research: Acquisition/application of information on a visual display (terrain; targets; map data; overlaid symbology/text; overlaid cursors/reticules); placement (location on display) of information; information density (and clutter); time for sequencing text, other data; display switching (IR/I2).
Test approach: Laboratory/bench technical test
  Visual test conditions: Controlled light conditions, controlled display conditions, synthetic images. Assess optical and display parameters (e.g., FOV, luminance). Assess off-axis viewing, distortion, off-center display, etc. Display prototype data formats, realistic targets at varied ranges and aspects to determine peak performance in optimum conditions. Use head tracker to assess search head movements.
  Test criteria: Percent correct and time to detect. Identify targets and terrain features. Place reticule on target. Percent correct and time to acquire and apply displayed information.
Test approach: Controlled user field experiments
  Visual test conditions: Measured day/dusk/night lighting conditions, controlled display conditions, synthetic and real images, real targets at a controlled distance, camouflage, image stability, information legibility while moving, distracting and/or masking effects of HMD on assessing real targets, varied user population to assess peak performance in known conditions. Assess optical and display parameters (e.g., FOV, luminance). Assess off-axis viewing, distortion, off-center display, etc. Display prototype data formats, realistic targets at varied ranges and aspects to determine nominal performance in known conditions. Use head tracker to assess search head movements.
  Test criteria: Percent correct and time to detect. Identify targets and terrain features. Place reticule on target. Percent correct and time to acquire and apply displayed information. Effects of mobility upon display usability (especially off-axis viewing, interference with real world situation awareness). Effects of ambient conditions on HMD information delivery, interaction with local environment.
Test approach: Operational field testing
  Visual test conditions: Assess stress, fatigue, varied information content in operational tasks in a field exercise with/against soldiers with conventional equipment.
  Test criteria: Effective use of information, success and time to conduct operational tasks dependent upon HMD data, interference of HMD on local SA.

Control or display device: Monocular helmet-mounted display
Benefits: Minimum weight; simplest HMD; less alignment required; eye with no display remains dark adapted; eye with no display continues to sample real world.
Costs: Severe visual rivalry problems, such as target suppression (involuntary) and "cognitive switching"; CG is off sideways as well as forward; smallest FOV; least information capability; more and larger head movements required; no depth information; difficult to navigate on uneven terrain.
Visual design research: General HMD issues, plus: effects of visual rivalry, loss of stereo; effects of smaller field of vision (FOV) with respect to visual search, reduced information content, more emphasis on format of data.
Test approach: Laboratory/bench technical test
  Visual test conditions: As general HMD issues.
  Test criteria: As general HMD issues.
Test approach: Controlled user field experiments
  Visual test conditions: Assess possible visual fatigue, disorientation, postural stability/loss of coordination.
  Test criteria: As general HMD issues, plus: effects on postural stability, navigational/vestibular orientation.
Test approach: Operational field testing
  Visual test conditions: Assess stress, fatigue, varied information content in operational tasks in a field exercise with/against soldiers with conventional equipment.
  Test criteria: Effective use of information, success and time to conduct operational tasks dependent upon HMD data, interference of HMD on local SA. Effects on orientation, attention fatigue, and possible perceptual adaptation with longer-term usage.

Control or display device: Biocular helmet-mounted display
Benefits: Wider FOV, more information, easier to navigate; no interocular rivalry; less complex to adjust than binocular.
Costs: Heavier than monocular; poor resolution; incorrect depth information; isolates user from environment.
Visual design research: General HMD issues, plus: effects of anomalous stereo/parallax upon target assessment, mobility.
Test approach: Laboratory/bench technical test
  Visual test conditions: As general HMD issues.
  Test criteria: As general HMD issues.
Test approach: Controlled user field experiments; operational field testing
  Visual test conditions: As general HMD issues. Use head tracker to assess search head movements. Assess stress, varied information content in operational tasks in a field exercise with/against soldiers with conventional equipment.
  Test criteria: As general HMD issues. Effective use of information, success and time to conduct operational tasks dependent upon HMD data, interference of HMD on local SA.

TABLE 4-3 Rankings of Information Sources by the Areas under Their Curves in Figure 4-1 within the Three Kinds of Space

                                                            Personal  Action Space: Action Space:       Vista
Source of Information                                       Space     All Sources   Pictorial Sources   Space
1. Occlusion and interposition                              1         1             1                   1
2. Relative size                                            4         3.5           3                   2
3. Relative density                                         7         6             4                   4.5
4. Height in visual field and height in the picture plane   --a       2             2                   3
5. Aerial perspective and atmospheric perspective           8         7             5                   4.5
6. Motion perspective and motion parallax                   3         3.5           --                  6
7. Convergence                                              5.5       8.5           --                  8.5
8. Accommodation                                            5.5       8.5           --                  8.5
9. Binocular disparity, stereopsis, and diplopia            2         5             --                  7

a Dashes indicate data not applicable to source.

Assuming that the processing limits cannot be exceeded, adding sensitive units to the sensor and switching them in and out as desired (trading off density in one region against another) would be one way to achieve this, and probably a relatively inexpensive one. In general, there are different kinds of evidence that the lower part of the visual field is more important than the upper field for detailed functions, and that the visual system is equipped with more specialized contour-sensitive mechanisms there (Previc, 1996; Rubin et al., 1996).

More specifically, as we discuss below, depth perception and manipulation, especially in the absence of binocular vision, are not well served by the existing resolutions. In addition, the restricted field of view (30-40 degrees) could be quite dangerous, because of its negative effects both on situation awareness and on stitching together successive narrow glances at an active and cluttered environment (see examples 1 and 2 below). The 640 x 480 array resolution, although sparse for the central 4-8 degrees of vision, is much higher than is needed for peripheral vision (the ambient system), and some redistribution should make it possible to increase the field of view to something between 50 and 60 degrees by lowering peripheral resolution. (Luminance modulation could be used to obtain higher effective subpixel resolution; such enhancement might help for some environmental tasks and detract in others, so that real and simulated field tests are important.) The additional margin will not help in obtaining information through eye movements, but it should prove useful when the soldier relies on head movements, as when walking or surveying the surroundings. This display format should probably be elective.

Zone 1

Example 1: Disarming mines, cutting wires, adjusting sights, applying first aid, setting fuses, clearing weapon malfunction, etc. Two issues are detection and depth perception (and aligning parts in depth as well as in the frontoparallel plane).

Close detail. Assume the following situation: a distance of about 0.5 meter, wires or other parts as small as 1/8 inch, and horizontal display lines (or pixel rows) of about 0.063 degree = 3.75 min (30 degrees / 480 lines). The 1/8 inch subtends 23 minutes, or is about 6 pixels high, and is well above both the 1 min minimum separable for spatial resolution at adequate contrast and the pixel size (3.75 min/pixel) limit of the Land Warrior System. Given the parameters, such features should be visible almost out to 1 meter, at which point the pixel size should exceed the image size, and visual confounding and loss of detail should become a factor.

Depth localization and 3D form. With only monocular viewing, the depth perception needed to align parts in the third dimension would most naturally come from small head movements. With lateral head movements of about 1 inch, depth differences of about 1/4 inch would be needed at a distance of 18 inches (about 0.5 meter), and about 1.25 inches at 1 yard (about 1 meter). Most fine manipulation tasks are conducted within this range or a bit less, and this resolution would likely limit task precision. With twice the resolution in the lower half of the display field, all of these tasks would probably be feasible at 0.5 meter and some even at 1 meter.

Example 2: Hand-to-hand combat, breaching obstacles, detecting branches and handholds, operating controls. At a range of about 1 meter, the field of view is less than 2 ft wide. Limbs, weapons, and branches are safely above spatial resolution, but shoulders, limbs, and most of the target body fall beyond the field of view.
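The arithmetic behind these two examples can be checked with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the display is treated as having uniform angular pixel pitch (30 degrees over 480 lines, equivalently 40 degrees over 640 columns), and the last function uses a simple one-dimensional, two-zone pixel budget that is not taken from the chapter. At exactly 0.5 meter the 1/8 inch feature comes out at roughly 22 arcmin and about 6 pixels; small differences from the rounded figures in the text reflect rounding between 18 inches and 0.5 meter.

```python
import math

ARCMIN_PER_RADIAN = 60.0 * 180.0 / math.pi
INCH_M = 0.0254  # metres per inch


def arcmin_per_pixel(fov_deg, pixels):
    """Angular pixel pitch, assuming pixels are spread uniformly over the field."""
    return fov_deg * 60.0 / pixels


def visual_angle_arcmin(size_m, distance_m):
    """Visual angle subtended by an object of a given size at a given distance."""
    return 2.0 * math.atan(size_m / (2.0 * distance_m)) * ARCMIN_PER_RADIAN


def pixels_subtended(size_m, distance_m, fov_deg, pixels):
    """How many display pixels the object spans at that distance."""
    return visual_angle_arcmin(size_m, distance_m) / arcmin_per_pixel(fov_deg, pixels)


def fov_width_m(fov_deg, distance_m):
    """Linear width of the scene covered by the field of view at a given range."""
    return 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)


def full_res_core_deg(total_pixels, total_fov_deg, px_per_deg, peripheral_fraction):
    """Hypothetical two-zone budget: width of the full-resolution centre when the
    periphery is drawn at `peripheral_fraction` of the central pixel density."""
    peripheral_density = px_per_deg * peripheral_fraction
    return (total_pixels - peripheral_density * total_fov_deg) / (
        px_per_deg - peripheral_density
    )


if __name__ == "__main__":
    wire_m = 0.125 * INCH_M  # the 1/8 inch wire of Example 1
    print("pixel pitch (arcmin):", arcmin_per_pixel(30, 480))
    print("wire at 0.5 m (arcmin):", round(visual_angle_arcmin(wire_m, 0.5), 1))
    print("wire at 0.5 m (pixels):", round(pixels_subtended(wire_m, 0.5, 30, 480), 1))
    print("30 deg field at 1 m (m):", round(fov_width_m(30, 1.0), 2))
    print("55 deg field at 1 m (m):", round(fov_width_m(55, 1.0), 2))
    # 640 columns, 16 pixels/degree centrally, half density in the periphery
    print("full-res core for 55 deg (deg):", full_res_core_deg(640, 55, 16, 0.5))
```

Under these assumptions a 30 degree field covers a little over half a meter at 1 meter range, while a 55 degree field covers about a meter, and a 640-column budget stretched to 55 degrees with half-density periphery would leave about 25 degrees at full resolution. These numbers are a consistency check on the reasoning, not a design recommendation.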
Something like a 55 degree field of view would include the opponent's head, shoulders, and arms and, at half the normal resolution, should then be adequate for the ambient system.

Zones 2 and 3

A critical activity occurring in Zones 2 and 3 involves object detection, recognition, and identification (see Technical Note). A soldier needs to detect the presence of another person in the distance, recognize that person as friend or foe, and in some cases identify the individual. The bases for these object recognition tasks are unknown, but we attempt to use some simplifying ideas, similar to the basis of Biederman's geon theory, to provide estimates of what kind of performance might be expected using limited resolution displays. The main assumption of the geon theory is that object recognition is accomplished by recognizing combinations of a small set of component forms. We
OCR for page 65
--> have no reason to believe that there is a single system underlying object recognition or a single set of criteria. Indeed, it seems reasonable to assume that different classes of objects are recognized and/or identified by different criteria in different zones. Viewed from 1 meter or less, with a field of view of 30 degrees, a tank's overall silhouette would seem unobtainable. Conversely, viewed from 75 meters, the largest of some soldier's component features (e.g., a 1 inch nose, seen in profile) subtends little more than 1.3 min, whereas the minimum separable angle for reading letters is taken as 1.0 min and, more to the present point, the minimum pixel size is close to 4 min. To pick up the contribution of the nose is barely possible with good unaided vision alone at 75 meters and is not possible with a device of the Land Warrior System's resolution. This is not to say that discerning the target's nose is necessary or sufficient to recognize the soldier (some other feature or clusters of features may be needed). However, if the presence of the largest feature (whatever it may be) cannot be detected, then smaller ones become irrelevant. Having considered a few examples of resolution effects on specific tasks, we turn to a more general survey of the effects of helmet-mounted displays on depth information. Effects of Helmet-Mounted Displays on Depth Information Helmet-mounted displays, like night vision devices, capture optical information about the environment and present it visually to the wearer. If binocular sensors are used, stereopsis may be enhanced (if somewhat distorted) by increasing the separation between them, extending depth information from Zone 1 into Zone 2 (i.e., intermediate distances). If only a single sensor is used, stereopsis is necessarily lost with monocular or biocular head-mounted displays, and accommodation loses all differential depth information in all displays because the light is collimated. 
Head-motion parallax is seriously distorted whenever the sensors are at a different optical location from the eyes (as in virtually all head-mounted displays) and is lost when a viewer must remain stationary. Without stereopsis or parallax, a viewer is left only with interposition for nearby depth information, and that information is severely limited; objects' images must be overlapping to provide it, and at most it tells only which surface is the nearer. Because a viewer who must remain stationary but who is concerned more with intermediate and far distances than with near distances must in any case depend chiefly on the pictorial depth cues (see Figure 4-1), the loss of stereopsis and the distortion of head-motion parallax that is imposed by most helmet-mounted displays may not represent significant additional costs. That can only be true, however, if the equipment provides the depth cues in a condition that is adequate to the perceptual needs of the task. Such devices are not equal to what the unaided eye receives under good viewing conditions, but more graded assess-

OCR for page 65
--> ments are needed for making design decisions, particularly under degraded viewing conditions. There are data that offer some guidance as to the equipment needed for stereopsis (in the case of binocular displays) and for parallax, obtained as functions of luminance, contrast, and resolution, using very simple displays involving wires, dots, and gratings (Schor, 1987; Schor et al., 1984, Foley, 1987). The pictorial depth cues, however, are another matter. At the most basic level-as two-dimensional patterns-there are various data showing that the exposure time and contrast needed to detect targets vary with their size, background luminance, etc. These could be applied to features of the depth cues that seem amenable to such analysis. For example, textural gradient and local occlusion may be presented to the sensor by the environment, but they are probably very readily lost at low resolutions; shading cues and the intersections that provide interposition information are degraded or lost with a sparse gray scale; height in field and convergent linear perspective, both of which depend on some extended region of the display, must at some point be degraded as field of view decreases, as aperture-viewing studies confirm (Hart and Brickner, 1987). The effects on these attributes of any display must therefore be assessed before deciding to use the equipment in any task that is likely to call heavily on those cues, and effort should be made to relate, extend, test, and apply the results of such task-oriented analyses. The same issue arises in the context of computer-generated graphics. One function of the display equipment (including hand-held displays, which do not directly interfere with normal visual perception of the world) is to present maps, diagrams, and charts. 
Experimental evidence suggests that maps and diagrams that incorporate depth cues may be more effective than traditional displays (see, e.g., Bemis et al., 1988; Burnett and Barfield, 1991), depending on the task and on the parameters and combinations of cues (Ellis et al., 1991; McGreevy and Ellis, 1986). The pictorial depth cues (and stereopsis and motion-based depth information as well) can, in principle, be successfully constructed on computer graphics displays, but they can be used by a viewer only to the degree that the sensory qualities of the display permit. The enhancement and simplification techniques that are available to computer-generated images can provide more robust information, but the limitations of field devices in resolution, contrast, and field of view must still be evaluated as to their effects on depth cues from these artificial sources.

Most familiar classes of objects can be recognized with extreme rapidity even in the absence of any depth information, cued only by their shapes in the display (Biederman, 1985; Peterson et al., 1991). Performance tasks that do not require any specific depth perception, but that can be carried out by recognizing some object(s) or layout (e.g., the presence or direction of a person or group, a particular kind of equipment, or a particular house), are therefore probably not badly degraded by the absence of depth information. Moreover, objects' familiar sizes can actually act as depth cues, although such distance information takes longer to extract (Predebon, 1992).

Like the depth cues, however, the perception of the objects as shapes necessarily depends on the quality of the display, the field of view, and how the situation prepares or primes the viewer. In a display that is too coarse to resolve the features that characterize a given object, or with gray scale and contrast inadequate to model its forms, shapes may not be recognizable.

Conservative estimates of what tasks can be performed are certainly possible to make (see discussion on action zones). But object recognition cannot be predicted solely from any table of data because objects (and depth cues as well) are normally highly redundant. That is, a part or a feature (or even just an attribute of some object, like its color) may serve instead of the whole object, given an appropriate past or present context (Hochberg, 1980). With training and familiarity, therefore, performance may surpass what would be expected from such tables. Conversely, a given point of view may obscure the relevant features even when display quality is otherwise adequate. As a consequence, trade-off assumptions about any specific equipment need to be tested in the real or simulated missions for which it is intended.

Effects of Helmet-Mounted Displays on Field of View

With a small field of view, some or all of the redundancy in an object or scene is very likely to be lost, because only some portion of either may be included within the display. Moreover, the field of view of most proposed displays greatly reduces peripheral vision. This will interfere with object recognition, because the information in peripheral vision normally informs a viewer where to look next in order to obtain some needed feature.
In the division of perceptual labor between what have been called the ambient and focal visual systems (Hughes et al., 1996; Leibowitz et al., 1982; Schneider, 1969; Trevarthen, 1968), which are roughly equivalent to peripheral and foveal vision, it is the former that contributes most heavily to orienting (e.g., attentional capture), visual guidance of the limbs, and posture (orientation or vection) (for recent reviews, see Hughes et al., 1996). In normal vision, we bring only a few points in the layout around us to the fovea, relying on the ambient system for the remainder. Yet the Land Warrior device presents focal visual information, which must be integrated with the operator's need to carry out ambient visual activities. Peripheral vision also provides landmarks as to where some detail that was previously fixated (i.e., was clearly seen in central vision during a previous glance) lay relative to the feature presently being fixated, and these landmarks are likely to be unavailable within a small field of view.

A viewer can compensate to some extent for a small field of view by making more head movements to sample the environment. But successive small glimpses of the environment that are obtained by such movements (which are much slower and more cumbersome than eye movements, especially with heavy helmet-mounted displays in place) can provide information about the entire object, or scene, only if they are effectively stitched together in memory, which is not necessarily possible in cluttered or unfamiliar settings. There is currently no single accepted cognitive theory from which one can set the bounds of an individual glimpse. For example, are successive views of a display "directly" placed by the visual system into a single perceptual setting, without passing through any memory-like encoding process, just so long as there is enough structural overlap between the views? If so, which seems doubtful (Hochberg and Brooks, 1996), how much overlap must there be? Over how much delay? Over how many shifts of view?

Regardless of theory, it is known that reduced fields of view reduce a viewer's ability to grasp things and to maneuver within the visual environment. According to both a large body of research and common sense, objects and scenes can be identified more rapidly and more correctly when a viewer has been previously set or primed by those objects or by the categories to which they belong (Biederman, 1985; Bachman and Alik, 1976). The context in which an object appears, if it is an appropriate context, can serve much the same function (Biederman, 1981), but that depends on a field of view sufficient to provide that context. Reduced fields of view eliminate or reduce the context and thereby its facilitating effects. There may be some minimum field of view below which a wearer will be unable to achieve a coherent grasp of the context, even by making the successive head movements discussed above; this is suggested by aperture-viewing studies and by examining motion picture use of "establishing shots" (Hochberg, 1986). At a narrow field of view, the facilitating effects of the context on object recognition may be lost, and the context is often necessary for accurate perception.

Soldiers commonly are required to drop to the ground, roll rapidly, and survey their surroundings.
It seems likely that the effects of narrow fields of view (and protruding eyewear) may require special training on such tasks and warnings about specific vulnerabilities. When it is important for a viewer to have a ready grasp of where people and things are distributed within the visual environment (which must often be the case with infantry soldiers), the higher cost and weight of displays that go with wider fields of view may be unavoidable. Only controlled research under field (or field-like) conditions can inform that decision. In any case, because the sequence of visual queries (e.g., successive glances and head turnings) is elective, depending on a viewer's task, knowledge, and attention as much as on the information provided by the visual display at each step in the sequence, one must consider the situation that the viewer needs to grasp and the factors that affect such situation awareness.

TRAINING

As mentioned earlier, problems with the monocular display in the Apache helicopter were at least partially alleviated by training. There are at least three quite different areas in which specific training may help infantry soldiers use the helmet-mounted display, and the effectiveness of such training should be evaluated:

Performance of manipulations and locomotion under the offsets described should be practiced, and relearning of behavior during the aftereffects of protracted sessions should be pursued to familiarize the soldier with the existence and nature of the aftereffects.

To execute certain tasks, soldiers will have to substitute head motion parallax for binocular stereopsis in order to gain depth information in Zones 1 and 2. Similarly, they will have to substitute search through head movements for search through eye movements because of the reduced field of view. Training to criterion in several critical tasks similar to what must be done in the field (e.g., setting fuses, replacing pins in grenades, clearing weapon malfunction in Zone 1 and detecting approaching threats in Zones 2 and 3) may help decrease the costs of these informational losses.

Objects and terrain seen through these devices, especially narrow field of view thermal imaging, do not present the familiar perceptual units that so quickly and seamlessly serve to build our normal visual world. It is more like recognizing planes by radar signatures, but trying to do so in the course of rapid movement through a cluttered environment. Fortunately, practice with the purely visual task of recognition and identification can be obtained as much as is necessary using recorded and/or simulated displays. How effective such training is, and how much is needed, are questions for research.
CONCLUSIONS AND DESIGN GUIDELINES

The Land Warrior System aims to increase both the effectiveness and the survivability of infantry soldiers, using technologies that include portable computers, satellite navigation, light amplifying and thermal sensors, and both helmet-mounted and hand-held displays. Although such technologies can certainly enhance performance under certain conditions, they incur costs and risks as well. The pros and cons of each innovation must be considered in balance, to avoid net reductions in safety and capability. Evaluations must be informed by real or simulated research in the field; they should not be based solely on analyses of human abilities in laboratory situations. Field research to test the effectiveness of this equipment has only recently begun. To be effective, the research must be directed toward conditions under which the net benefits from specific sensory enhancements are of questionable value.

On the basis of the relevant research literature, this chapter summarizes what planners need to know in order to assess the benefits and costs of the major proposed enhancements. We believe that carefully designed field testing, guided by the kinds of human factors issues that are raised here, together with the concerns expressed by those who use the equipment, will be needed continually as this program evolves.

The proposal to use a monocular display appears to be motivated by the lower weight and cost of this configuration, as well as the desire to maintain dark adaptation in one eye. Our review has pointed out that the monocular display may result in rivalry, which can induce fatigue and disorientation. In addition, stereoscopic depth information, an important depth cue when contrast is poor and obstacles are within 30 meters, will be lost. Field tests tend to support this concern. We recommend that a binocular display be seriously considered and further field tests be conducted to evaluate the effects of display configuration on a variety of soldier tasks.

Our analysis suggests that a variety of depth cues are degraded by limited display resolution and field of view. This in turn should affect task performance within the three different depth zones of action. We recommend that field studies be conducted to determine how resolution and field of view affect performance in the three zones. In addition, training in making head movements and scanning patterns, which may partially alleviate these problems, should be investigated.

Thermal imagery presents a special challenge to the soldier's visual system because many of the usual cues to depth and shape available in visible light are absent in thermal images. Once again, training may be particularly important in the successful use of the thermal images.

The effects of long-term use of monocular displays are unknown. This issue should be investigated before a monocular configuration is adopted.
The use of the helmet-mounted display for maps and other symbology may be problematic. Symbology tends to produce clutter and may interfere with the perception of the sensor imagery. Maps and certain other kinds of symbology might be better displayed on a hand-held or wrist-mounted device.

The use of off-axis sensors, such as the image intensifier mounted on the helmet, may produce a variety of illusions, disorientation, and aftereffects. This placement should be avoided if at all possible.

Placement of additional weight on the helmet raises concern over fatigue, increased physical workload, and related increases in cognitive workload. The helmet-mounted display should be evaluated under the demanding physical conditions in which these interactions are likely to occur.

TECHNICAL NOTE: VISUAL ACUITY AND RESOLUTION

The preferred way to describe the minimum size target that can be seen is in terms of the visual angle it subtends at the viewer's eye in units such as arc minutes (Ogle, 1953). Many other units are also in use (see Figure 4-2). One measure is visual acuity, which is usually defined as the reciprocal of the target size in arc minutes (of subtended visual angle). By this measure, normal vision corresponds to a resolvable target size of 1 arc minute (equivalent to Snellen acuity of 20/20).

FIGURE 4-2 Visual acuity units. Source: Farrell and Booth, 1984. Reprinted by permission.

In clinical practice it is common to use the Snellen fraction. The numerator of this fraction is usually taken as 20 and the denominator (usually in multiples of 10) is the range at which a young viewer with no visual abnormalities or dysfunction could discriminate alphanumeric characters that the testee can see at 20 feet (e.g., if your vision is 20/200, that means you need to be at a viewing distance of 20 feet to see letters a "normal" viewer could see at 200 feet, and you would be unable to read this text). While the measure has many limitations and acuity does not equal resolution, it is a commonly used reference.

Line resolution requirements (RCA, 1968): in television terminology, a line refers either to an actual scan line or to the time period allocated for a scan line. By the latter definition, commercial broadcast TV in the United States is a 525-line system. Fewer than 525 actual scans are possible, however, because approximately 35 of the periods are used for the vertical retrace. Thus the number of actual or active TV lines is 490. Applying a Kell factor of 0.7 to this figure gives the equivalent of 343 active lines for use in considering resolution capabilities. (Because the phase relationships between a scanning spot and the objects in a natural scene cannot be controlled, some loss of resolution results. A commonly used figure is 30 percent. Thus the number of lines for effectively calculating resolution is 70 percent of the total. This value is known as the Kell factor.) Furthermore, since an active TV line can represent, at most, one-half of a cycle of a periodic target (a light or a dark bar), at least two lines (a line pair, which is also used as a measure for night vision goggles) are required to represent one cycle of a periodic target. It is important to keep this ratio of 2 active TV lines per cycle of spatial frequency in mind when dealing with line-scan systems (note that scan lines typically describe vertical resolution; horizontal resolution measures refer to pixels). This may be mitigated to some extent with helmet-mounted displays in which head movements can cause the sensor to move (i.e., scan line or pixel boundaries can be shifted), but data to demonstrate this are not available, and for fixed displays (e.g., maps) the line pair/pixel pair requirement remains.

TABLE 4-4 Line Resolution Requirements

Task              Line Resolution per Target Minimum Dimension
Detection         1.0 ± 0.25 line pairs
Orientation       1.4 ± 0.35 line pairs
Recognition       4.0 ± 0.8 line pairs
Identification    6.4 ± 1.5 line pairs

Angular threshold of the eye (RCA, 1968): the probability of seeing an object is influenced not only by the field luminance, the contrast of the object with respect to the scene background, and the complexity of the scene, but also by the angular subtense of that object at the eye of the observer. Whereas under ideal conditions the eye can resolve down to 30 seconds of arc, the common figure used is 1 minute of arc. In most practical situations, however, the angular threshold of the eye is higher. With a high resolution complex image, for which line resolution does not enter as a limiting and confounding factor, it appears that 6 to 12 minutes of arc are required for typical visual acquisition and recognition tasks.
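The broadcast-television line arithmetic described above (525 line periods, about 35 lost to vertical retrace, a Kell factor of 0.7, and two lines per cycle) can be restated as a short Python sketch. The function name `effective_tv_lines` and its parameter names are illustrative assumptions, not taken from the source; only the default numbers come from the text.

```python
def effective_tv_lines(total_periods=525, retrace_periods=35, kell_factor=0.7):
    """Return (active lines, Kell-adjusted lines, resolvable line pairs).

    Hypothetical helper; defaults are the figures quoted in the text.
    """
    # Line periods used for vertical retrace carry no picture information.
    active = total_periods - retrace_periods
    # The Kell factor discounts for the uncontrolled phase between
    # the scanning spot and detail in the scene.
    effective = active * kell_factor
    # One cycle of a periodic target (a light bar plus a dark bar)
    # needs two lines, i.e., one line pair.
    line_pairs = effective / 2
    return active, effective, line_pairs


active, effective, line_pairs = effective_tv_lines()
print(f"{active} active lines, {effective:.0f} effective lines, "
      f"{line_pairs:.1f} line pairs")
```

With the default values this reproduces the 490 active lines and roughly 343 effective lines given in the text, which correspond to about 171 line pairs.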
Table 4-4 summarizes conclusions from one set of measurements of the capability of humans to perceive single military targets (standing man to tank size) as a function of the limiting resolution per target minimum dimension (Johnson, 1960).

Resolution example: an SVGA computer screen rated at 1,280 pixels x 1,024 lines, at a viewing distance that results in a screen subtense of 10° horizontal by 7.5° vertical (with a 3/4 aspect ratio), could be defined as having the following horizontal resolution:

1,280/10 = 128 pixels/degree of visual angle, or
128/60 = 2.13 pixels/arc minute of visual angle, or
1.07 pixel pairs/arc minute,

which is approximately equivalent to 20/20 Snellen acuity.

In the case of the Land Warrior System, a 40° horizontal by 30° vertical field of view is subtended by a (nominal) 640 x 480 pixel display. Taking a 0.7 Kell factor into account, however, active lines/pixels are actually 448 x 336, resulting in a resolution of 5.35 arcmin/pixel, or a usable resolution of 10.7 arcmin per pixel pair. A Snellen equivalent measure for acuity would be 20/214.
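The two worked examples above follow the same steps: apply the Kell factor to the pixel count, divide the field of view (in arc minutes) by the result, double it for the two-pixels-per-cycle requirement, and scale to a Snellen denominator. The Python sketch below restates that arithmetic. All function names are hypothetical, and the last helper, which multiplies the nominal line-pair values of Table 4-4 by the display's angular size per pixel pair, is an illustrative extrapolation rather than a calculation made in the source.

```python
ARCMIN_PER_DEGREE = 60

# Nominal line pairs per target minimum dimension, copied from Table 4-4.
LINE_PAIRS_NEEDED = {
    "detection": 1.0,
    "orientation": 1.4,
    "recognition": 4.0,
    "identification": 6.4,
}


def arcmin_per_pixel_pair(pixels, fov_degrees, kell_factor=1.0):
    """Angular size (arc minutes) of one pixel pair, i.e., one cycle."""
    effective_pixels = pixels * kell_factor
    arcmin_per_pixel = fov_degrees * ARCMIN_PER_DEGREE / effective_pixels
    return 2 * arcmin_per_pixel


def snellen_denominator(pair_arcmin):
    """Snellen 20/x equivalent, following the text's convention that
    about 1 arc minute per pixel pair corresponds to 20/20."""
    return 20 * pair_arcmin


def min_target_subtense(task, pair_arcmin):
    """Assumed extension: smallest target subtense (arc minutes) that
    spans the nominal number of line pairs Table 4-4 lists for a task."""
    return LINE_PAIRS_NEEDED[task] * pair_arcmin


desktop = arcmin_per_pixel_pair(1280, 10)
land_warrior = arcmin_per_pixel_pair(640, 40, kell_factor=0.7)

print(f"Desktop screen: {desktop:.2f} arcmin per pair, "
      f"about 20/{snellen_denominator(desktop):.0f}")
print(f"Land Warrior:   {land_warrior:.1f} arcmin per pair, "
      f"about 20/{snellen_denominator(land_warrior):.0f}")
print(f"Recognition on Land Warrior display: target must subtend roughly "
      f"{min_target_subtense('recognition', land_warrior):.0f} arcmin")
```

For the Land Warrior figures this yields about 10.7 arc minutes per pixel pair and a Snellen equivalent near 20/214, matching the text. The recognition estimate ignores the ± tolerances in Table 4-4 and any effects of contrast or scene complexity, so it should be read only as a rough lower bound.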