Haptic interfaces are devices that enable manual interaction with virtual environments (VEs) or teleoperated remote systems. They are employed for tasks that are usually performed using hands in the real world, such as manual exploration and manipulation of objects. In general, they receive motor action commands from the human user and display appropriate tactual images to the human. Such haptic interactions may or may not be accompanied by the stimulation of other sensory modalities, such as vision and audition. Computer keyboards, mice, and trackballs constitute relatively simple haptic interfaces. Other examples of haptic interfaces available in the market are gloves and exoskeletons that track hand postures and joysticks that can reflect forces back to the user. Even more sophisticated devices have been built and implemented successfully in research laboratories. To realize the full promise of VEs and teleoperation, further development of haptic interfaces is critical. In pursuing this goal, many of the issues and technologies described in the sections on position tracking (Chapter 5) and telerobotics (Chapter 9) are relevant. To achieve success, a comprehensive research program is needed in human haptics, technology development, and interactions between the two.
In contrast to the purely sensory nature of vision and audition, only the haptic system can both sense and act on the environment. The human hand is a versatile organ that is able to press, grasp, squeeze, and stroke objects; it can explore object properties such as surface texture, shape, and softness; it can manipulate tools for repairing a watch and breaking concrete. In the words of Paul Valéry (1938), the hand is "a device which can, in turn, strike, receive and give, feed, take an oath, beat a musical rhythm, read for the blind, speak for the mute, reach to a friend, stop a foe." Being able to touch, feel, and manipulate objects in an environment, in addition to seeing (and hearing) them, provides a sense of immersion in the environment that is otherwise not possible. It is quite likely that much greater immersion in a VE can be achieved by the synchronous operation of even a simple haptic interface with a visual and auditory display, than by large improvements in, say, the fidelity of the visual display alone. Real environments or VEs in which one is deprived of the touch and feel of objects seem impoverished, seriously handicap human interaction capabilities, and, at worst, can be disorienting.
Although haptic interfaces are typically designed to be operated by the user's hands, alternative designs suitable for the tactual and motor systems of other body segments are conceivable. However, not all interfaces that interact with the human mechano-sensorimotor systems are haptic interfaces. The distinction is based on the nature of the tasks for which the interface is used. For example, whole body motion interfaces (Chapter 6) concerned with conveying a sense of mobility to the user are not haptic interfaces in a strict sense.
STATUS OF THE RELEVANT HUMAN RESEARCH
The Human Haptic System
In order to develop cost-effective haptic interfaces, it is necessary to understand the roles played by the mechanical, sensory, motor, and cognitive subsystems of the human haptic system. The mechanical structure of the human hand consists of an intricate arrangement of 19 bones, connected by almost as many frictionless joints and covered by soft tissue and skin. Altogether, the bones are attached to about 40 intrinsic and extrinsic muscles through numerous tendons, which serve to activate 22 degrees of freedom (DOF) of the hand. The sensory system includes large numbers of various classes of receptors and nerve endings in the skin, joints, tendons, and muscles. Appropriate mechanical, thermal, and chemical stimuli activate these receptors, causing them to transmit electrical impulses via the afferent neural network to the central nervous system (of which the brain forms a part), which in turn sends commands through the efferent neurons to the muscles for desired motor action.
Haptic exploration and manipulation of solid objects covers a wide range of haptic functions yet provides a task framework within which the roles of the biomechanical, sensory, motor, and cognitive subsystems can be understood. Exploration is concerned mainly with the extraction of object properties, and it is therefore a sensory dominant task, although
well-controlled motor actions are necessary to obtain reliable information about the object. It consists primarily of discrimination or identification of surface properties (for example, shape and surface texture) and volumetric properties (for example, mass and compliance) of objects. Manipulation is concerned mainly with modification of the environment and thus it is a motor dominant task, although sensory feedback is essential for successful performance. Manipulation tasks can be grossly subdivided into precision tasks (for example, watch repair) and power tasks (for example, using a hammer).
In any task involving physical contact with an object, be it for exploration or manipulation, the surface and volumetric physical properties of the skin and subcutaneous tissues play important roles in its successful performance. For example, the finger pad, which is used by primates in almost all precision tasks, consists of hairless ridged skin (about 1 mm thick) that encloses soft tissues composed mostly of fat in a semiliquid state. As a block of material, the finger pad exhibits complex mechanical behavior: inhomogeneity, anisotropy, and rate and time dependence. The compliance and frictional properties of the skin, together with the sensory and motor capabilities of the hand, enable one to glide over a surface without losing contact, to explore the shape of the surface, and to stably grasp a smooth object to manipulate it. The mechanical loading on the skin, the transmission of the mechanical signals through the skin, and their transduction by the cutaneous mechanoreceptors are all strongly dependent on the mechanical properties of the skin and subcutaneous tissues (Phillips and Johnson, 1981b; Srinivasan, 1989; Srinivasan and Dandekar, 1992).
Tactual sensory information from the hand in contact with an object can be divided into two classes: (1) tactile information, referring to the sense of contact with the object, mediated by the responses of low-threshold mechanoreceptors innervating the skin (say, the finger pad) within and around the contact region and (2) kinesthetic information, referring to the sense of position and motion of limbs along with the associated forces, conveyed by the sensory receptors in the skin around the joints, joint capsules, tendons, and muscles, together with neural signals derived from motor commands. (The term proprioceptive is used almost equivalently to kinesthetic by many authors.) For discussion of terminology see Darian-Smith (1984); Loomis and Lederman (1986). Only tactile information is conveyed when objects contact a passive, stationary hand, except for the ever-present kinesthetic information about the limb posture. Only kinesthetic information is conveyed during active, free (i.e., no contact with any object or other regions of skin) motion of the hand, although the absence of tactile information by itself conveys that the motion is free. Even when the two extreme cases just mentioned are included, it is clear
that all sensory and manipulatory tasks performed actively with the normal hand involve both classes of information. In addition, free nerve endings and specialized receptors that signal skin temperature, mechanical and thermal pain, and chemogenic pain and itch are also present (Sherrick and Cholewiak, 1986).
The control of contact conditions is often as important as sensing those conditions for successful task performance. In humans, such control action can range from a fast spinal reflex to a relatively slow conscious deliberate action. In experiments involving lifting of objects held in a pinch grasp, it has been shown that motor actions such as increasing grip force are initiated as rapidly as within 70 ms after an object begins to slip relative to the finger pad, and that the sensory signals from the cutaneous afferents are critical for task performance (Johansson and Westling, 1984; Johansson and Cole, 1992). Clearly, the mechanical properties of the skin and subcutaneous tissues, the rich sensory information provided by a wide variety of sensors that monitor the tasks continuously, and the coupling of this information with the actions of the motor system are responsible for the human abilities of grasping and manipulation. In the following three subsections we employ a systems viewpoint to briefly review the results on haptics in the psychophysics and neurophysiology literature.
Input-Output Variables of Haptic Interactions
Haptic interfaces in teleoperation or VE systems receive the intended motor action commands from the human and display tactual images to the human. The primary input-output variables of the interfaces are displacements and forces, including their spatial and temporal distributions. Haptic interfaces can therefore be viewed as generators of mechanical impedances that represent a relationship between forces and displacements (and their derivatives) over different locations and orientations on the skin surface at each instant of time. In contact tasks involving finite impedances, either displacement or force can be viewed as the control variable, and the other is a display variable, depending on the control algorithms employed. However, consistency among free hand motions and contact tasks is best achieved by viewing the time-varying geometrical configuration of the hand (for example, the vector of all joint angles and their derivatives with respect to time) as the control variable, and the resulting net force vector and its distribution within the contact regions as the display variables.
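As a minimal sketch of the impedance view described above, the following Python snippet treats displacement (and its time derivatives) as the control variable and the resulting force as the display variable for a single degree of freedom. The linear mass-spring-damper form, the function name `impedance_force`, and all parameter values are illustrative assumptions, not taken from the text.

```python
def impedance_force(x, v, a, stiffness, damping=0.0, mass=0.0):
    """Force (N) displayed for a displacement x (m), velocity v (m/s),
    and acceleration a (m/s^2).

    Assumes a linear impedance F = K*x + B*v + M*a; a real interface
    may need a nonlinear or spatially distributed relationship.
    """
    return stiffness * x + damping * v + mass * a


# Hypothetical virtual spring-damper: K = 500 N/m, B = 2 N*s/m, no added mass.
force = impedance_force(x=0.01, v=0.05, a=0.0, stiffness=500.0, damping=2.0)
print(f"Displayed force: {force:.2f} N")
```

Swapping the roles of the variables (measuring force and commanding displacement) would correspond to the alternative choice of control and display variables mentioned in the paragraph.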
Because the human is sensing and controlling the position and force variables of the haptic interface, the performance specifications of the interface are directly dependent on human abilities. In a substantial number
of simple tasks involving active touch, one of the tactile and kinesthetic information classes is fundamental for discrimination or identification, whereas the other is supplementary. For example, in the discrimination of length of rigid objects held in a pinch grasp between the thumb and the forefinger (Durlach et al., 1989), kinesthetic information is fundamental, whereas tactile information is supplementary. In such tasks, sensing and control of variables such as fingertip displacements are crucial. In contrast, for the detection of surface texture or slip, tactile information is fundamental, whereas kinesthetic information is supplementary (Srinivasan et al., 1990). Here, the sensing of spatiotemporal force distribution within the contact region provides the basis for inferences concerning the contact conditions and object properties. Both classes of information are clearly necessary and equally important in more complex haptic tasks.
We now summarize briefly the psychophysical and neurophysiological results available on human haptic abilities in real environments at two levels: (1) sensing and control of interface variables and (2) perception of contact conditions and object properties. Although humans can feel heat, itch, pain, etc. through sensory nerve endings in the skin, we refrain from discussing these sensations here because the availability of practical interface devices employing them is unlikely in the near future.
Sensing and Control of Interface Variables
Limb Position and Motion
Our awareness of the relative positions and motions of our limbs arises from the kinesthetic sensory system, which consists of sensory receptors in the joint capsules, tendons, muscles, and skin around the joints, as well as the signals derived from motor commands during voluntary motion (see reviews by Clark and Horch, 1986; Matthews, 1982). The joint capsules are innervated by three different types of mechanoreceptive nerve terminals, namely, free nerve endings, Ruffini corpuscles, and Paciniform corpuscles (Darian-Smith, 1984), each of which has distinct response characteristics. In addition, the tendons contain Golgi organs, which seem to respond to tension, and the muscle spindles measure the muscle stretch and its rate of change. The skin around the joints contains four types of sensory endings (discussed below) in the hairless skin, in addition to receptors in the hair follicles in the hairy skin, with each receptor type coding different aspects of the mechanical loading imposed on the skin. Furthermore, the efference copy (also referred to as corollary discharge) of the command signals generated to drive the muscles during voluntary movements gives information about the intended motor action to the perceptual portions of the brain. Because of the presence of multiple, simultaneously active subchannels that are not individually accessible to experimentation, even basic questions about the functioning of the kinesthetic sensory system have not been answered unequivocally.
The source of kinesthetic information that enables us to know the relative positions of limb segments or joint angles is still controversial (Clark and Horch, 1986). Initially, it was proposed that the receptors in the joints were the source (Skoglund, 1956; Mountcastle and Powell, 1959). Later it was found from neurophysiological experiments that these receptors were activated only in the extremities of the range of joint rotations (Burgess and Clark, 1969; Grigg and Greenspan, 1977). Also, patients with artificial joints did not seem to lose their joint angle sense significantly (Grigg et al., 1973). It also should be noted that Ferrel (1980) has argued that joint afferent discharge is sufficient to help signal the joint angle over its full range, but does not claim that the afferents are exclusively responsible for position sense (Ferrel et al., 1987). Muscle spindles, which are believed to be muscle length detectors, have also been proposed as candidates that provide position sense (Matthews, 1982). Support for this hypothesis comes from the well-known haptic illusion that, when vibration is imposed on muscles and tendons, the corresponding limbs are perceived to be moving (Goodwin et al., 1972). However, because of cocontractions of agonist and antagonist muscles, the lengths of muscles may change without any change in the joint angle. Thus, computations involving all the muscular forces imposed on the joint are needed to extract the joint angle information from muscle spindles. Nevertheless, Matthews (1988) has proposed that it might be possible to recover angular velocity independent of position by combining the spindle signals with corollary discharges from motor centers. The third possible source of joint angle information is the stress and strain field in the skin surrounding the joint, which is directly related to the angle of rotation of the joint. 
Although this possibility has been mentioned in the literature, we are not aware of any systematic investigation of this hypothesis. Recently, Edin (1993) has shown that the strains produced in the skin can be large enough to signal the joint angle.
A large variety of psychophysical experiments have been conducted concerning the perception of limb position and motion (Clark and Horch, 1986; Jones and Hunter, 1992). It has been found that humans can detect joint rotations of a fraction of a degree performed over a time interval of the order of a second. The bandwidth of the kinesthetic sensing system has been estimated to be 20-30 Hz (Brooks, 1990). It is generally accepted that our sensitivity to rotations of proximal joints is higher than that of more distal joints. The just noticeable difference (JND) is about 2.5 deg for the finger joints, 2 deg for the wrist and elbow, and about 0.8 deg for the
shoulder (Tan et al., 1994). In locating a target position by pointing a finger, the speed, direction, and magnitude of movement, as well as the locus of the target, can all affect accuracy. In the discrimination of length of objects by the finger-span method (Durlach et al., 1989; Tan et al., 1992), the JND is about 1 mm for a reference length of 10 mm, and increases to 2-4 mm for a reference length of 80 mm, thus violating Weber's law (i.e., JND is not proportional to the reference length). In the kinesthetic space, psychophysical phenomena such as anisotropies in the perception of distance and orientation, apparent curvature of straight lines, non-Euclidean distance measures between two points, etc., have been reported (for a review, see Loomis and Lederman, 1986; Hogan et al., 1990; Fasse et al., 1990). Investigations of the human ability in controlling limb motions have typically measured human tracking performance with manipulanda having various mass, spring, and damping characteristics (Brooks, 1990; Poulton, 1974; Sheridan, 1992; Jones and Hunter, 1992). The differential thresholds for position and movement have been measured to be about 8 percent (Jones and Hunter, 1992). Human bandwidth for limb motions is found to be a function of the mode of operation: 1-2 Hz for unexpected signals, 2-5 Hz for periodic signals, up to 5 Hz for internally generated or learned trajectories, and about 10 Hz for reflex actions (reviews by Brooks, 1990). In summary, the sensing and control of limb position and motion are complex at all levels, ranging from psychophysical measures to the inner neurophysiological mechanisms.
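The violation of Weber's law noted above for finger-span length discrimination can be checked with a short calculation. The sketch below computes the Weber fraction (JND divided by reference value) from the figures quoted in this paragraph; the function name `weber_fraction` is a hypothetical helper introduced only for illustration.

```python
def weber_fraction(jnd, reference):
    """Ratio of the just noticeable difference to the reference value."""
    return jnd / reference


# Finger-span length JNDs quoted in the text (all values in mm).
short_ref = weber_fraction(1.0, 10.0)       # 0.10 at a 10 mm reference
long_ref_low = weber_fraction(2.0, 80.0)    # 0.025 at 80 mm (lower JND bound)
long_ref_high = weber_fraction(4.0, 80.0)   # 0.05 at 80 mm (upper JND bound)

# Weber's law would predict a constant fraction; here it falls with length.
print(short_ref, long_ref_low, long_ref_high)
```

Even at the upper bound of the 80 mm JND, the fraction is half of the 10 mm value, so the JND grows more slowly than the reference length.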
Net Forces of Contact
When we contact or press objects through active motion of the hand, the contact forces are sensed by both the tactile and kinesthetic sensory systems. Overall contact force is probably the single most important variable that determines both the neural signals in the sensory system and the control of contact conditions through motor action. It appears that the JND for contact force is 5-15 percent of the reference force value over a wide range of conditions involving substantial variation in force magnitude, muscle system, and experimental method, provided that the kinesthetic sense is involved in the discrimination task (Jones, 1989; Pang et al., 1991; Tan et al., 1992). In closely related experiments exploring the human's ability to distinguish among objects of different weights, a slightly higher JND of about 10 percent has been observed (see reviews by Clark and Horch, 1986; Jones, 1986). An interesting illusion first observed in the late nineteenth century by Weber is that cold objects feel heavier than warm ones of equal weight (see review by Sherrick and Cholewiak, 1986). In experiments involving grasping and lifting of objects using a two-finger pinch grasp, Johansson and Westling (1984) have shown that
subjects have exquisite control over maintaining the proper ratio between grasping and lifting forces (i.e., the orientation of the contact force vector), so that the objects do not slip. However, when tactile information was blocked using local anesthesia, this ability deteriorated significantly because the subjects could not sense contact conditions such as the occurrence of slip and hence did not apply appropriate compensating grasp forces. Thus, good performance in tasks involving contact requires the sensing of appropriate forces as well as using them to control contact conditions. The maximum controllable force that can be exerted by a finger pad is about 100 N and the resolution in visually tracking constant forces is about 0.04 N or 1 percent, whichever is higher (Srinivasan and Chen, 1993; Tan et al., 1994).
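The "0.04 N or 1 percent, whichever is higher" rule for force tracking resolution can be written as a one-line function. This is a sketch under the assumption that the rule applies uniformly up to the quoted maximum controllable force; the names `force_tracking_resolution` and `MAX_CONTROLLABLE_FORCE_N` are hypothetical.

```python
MAX_CONTROLLABLE_FORCE_N = 100.0  # approximate finger-pad limit quoted in the text


def force_tracking_resolution(target_force_n):
    """Approximate resolution (N) when visually tracking a constant force.

    Implements the quoted rule: 0.04 N or 1 percent of the target force,
    whichever is higher.
    """
    if not 0.0 <= target_force_n <= MAX_CONTROLLABLE_FORCE_N:
        raise ValueError("target force outside the quoted controllable range")
    return max(0.04, 0.01 * target_force_n)


# Below 4 N the absolute floor dominates; above it the 1 percent term does.
for target in (1.0, 4.0, 20.0):
    print(target, force_tracking_resolution(target))
```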
Perception of Contact Conditions and Object Properties
Although humans experience a large variety of tactile sensations, these sensations are really combinations of a few building blocks or primitives. For simplicity, normal indentation, lateral skin stretch, relative tangential motion, and vibration are the primitives for conditions of contact with the object. Surface microtexture, shape (mm size), and compliance can be thought of as the primitives for the majority of object properties perceived by touch. The human perception of many of these primitives is through tactile information conveyed by cutaneous mechanoreceptors. The associated neural codes can be classified as intensive, temporal, spatial, and spatiotemporal (for a review, see Loomis and Lederman, 1986). We first describe neurophysiological findings of receptor response characteristics from experiments involving indentations with rounded probes, and then discuss results from experiments involving pressing and stroking of objects configured to emphasize specific geometric or material properties.
Monkeys are used as experimental models for physiological mechanisms in humans since the types of mechanoreceptors, their spacing in the skin, and the sensory capacities to detect and discriminate vibratory stimuli are similar for the two species. In monkey skin, on the finger pads and palm, mechanoreceptive afferents have been classified on the basis of their response properties to ramp and steady indentations of a probe with or without vibration (Knibestol and Vallbo, 1970; Mountcastle et al., 1972; Pubols, 1980; Pubols and Pubols, 1976, 1983; Talbot et al., 1968). They fall into three distinct classes. (1) Slowly adapting afferents (SAs), believed to originate from Merkel cells, respond both during ramp onset and steady indentation by a probe. When the probe is vibrated sinusoidally at the most sensitive spot on the skin, the SAs are tuned (i.e., one nerve impulse per stimulus cycle) at the lowest amplitudes (about 20 µ) when the frequencies
are low (less than 20 Hz). (2) Rapidly adapting afferents (RAs), emanating from Meissner corpuscles, respond to ramp onset but are quiet during steady indentation. Their tuning threshold amplitudes (about 5 µ) are the lowest in the middle frequency range (20-50 Hz). (3) Pacinian corpuscle fibers (PCs), while behaving in a similar manner to RAs for ramp and steady indentations, have very low tuning threshold amplitudes (about 1 µ) at high frequency ranges (100-300 Hz). Microneurographic techniques of recording single-nerve fiber responses from awake humans (Vallbo and Hagbarth, 1968; Knibestol and Vallbo, 1970; Johansson and Vallbo, 1979) have revealed another class of slowly adapting afferents that are primarily sensitive to skin stretch and are associated with Ruffini endings. The response properties of each receptor type, such as thresholds and bandwidths, obtained through neurophysiological experiments give some of the design specifications for the tactile display part of haptic interfaces.
Considerable research effort has been invested in the psychophysics of vibration perception and electrocutaneous stimulation using single or multiple probes (for a review, see Sherrick and Cholewiak, 1986). These studies are mostly directed at issues concerned with tactile communication aids for individuals who are blind, deaf, or deaf and blind, areas that are beyond the scope of this chapter. A comprehensive list of references describing such tactile displays can be found in Kaczmarek and Bach-y-Rita (1993) and Reed et al. (1982). In designing these devices, human perceptual abilities in both temporal and spatial domains are of interest. The human threshold for the detection of vibration of a single probe is about 28 dB (relative to 1 µ peak) for 0.4 to 3 Hz. It decreases at the rate of -5 dB/octave for 3 to 30 Hz, and decreases further at the rate of -12 dB/octave for 30 to about 250 Hz, after which the threshold increases for higher frequencies (Rabinowitz et al., 1987; Bolanowski et al., 1988). Spatial resolution on the finger pad, as measured by the localization threshold of a point stimulus, is about 0.15 mm (Loomis, 1979), whereas the two-point limen is about 1 mm (Johnson and Phillips, 1981).
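The vibration detection threshold quoted above can be approximated as a piecewise function of frequency. The sketch below assumes that the three segments join continuously at 3 Hz and 30 Hz and that the dB values are amplitude ratios (20 log10) relative to 1 µ peak; both are assumptions, as are the function names. Frequencies above 250 Hz are rejected because the text gives no slope for the rising part of the curve.

```python
import math


def vibration_threshold_db(freq_hz):
    """Approximate detection threshold in dB re 1 micron peak displacement."""
    if not 0.4 <= freq_hz <= 250.0:
        raise ValueError("no threshold slope is quoted outside 0.4-250 Hz")
    if freq_hz <= 3.0:
        return 28.0  # flat segment
    if freq_hz <= 30.0:
        # -5 dB per octave above 3 Hz
        return 28.0 - 5.0 * math.log2(freq_hz / 3.0)
    # -12 dB per octave above 30 Hz, continuing from the 30 Hz value
    threshold_at_30 = 28.0 - 5.0 * math.log2(30.0 / 3.0)
    return threshold_at_30 - 12.0 * math.log2(freq_hz / 30.0)


def db_to_microns(level_db):
    """Convert dB re 1 micron peak to a peak displacement in microns."""
    return 10.0 ** (level_db / 20.0)


for f in (1.0, 6.0, 30.0, 250.0):
    level = vibration_threshold_db(f)
    print(f"{f:6.1f} Hz: {level:6.1f} dB, {db_to_microns(level):.3f} micron")
```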
To answer questions concerning perception and neural coding of roughness or spatial resolution, precisely shaped rigid surfaces consisting of mm-sized bar gratings (Lederman and Taylor, 1972; Morley et al., 1983; Phillips and Johnson, 1981a,b; Sathian et al., 1989), embossed letters (Phillips et al., 1983, 1988), or Braille dots (Lamb, 1983a,b; Darian-Smith et al., 1980) have been used in psychophysical and neurophysiological experiments (see review by Johnson and Hsiao, 1992). The perception of surface roughness of gratings is found to be solely due to the tactile sense and is dependent on the groove width, contact force, and temperature but not the scanning velocity (Loomis and Lederman, 1986).
Some of the salient results on the perception of slip, microtexture, shape, compliance, and viscosity are given below. Humans can detect the presence of a 2 µ high single dot on a smooth glass plate stroked on the skin, based on the responses of Meissner-type rapidly adapting fibers (RAs) (LaMotte and Whitehouse, 1986; Srinivasan et al., 1990). Moreover, humans can detect 0.06 µ high grating on the plate, owing to the response of Pacinian corpuscle fibers (LaMotte and Srinivasan, 1991). Among all the possible representations of the shapes of objects, the surface curvature distribution seems to be the most relevant for tactile sensing (Srinivasan and LaMotte, 1991; LaMotte and Srinivasan, 1993). Slowly adapting fibers respond to both the change and rate of change of curvature of the skin surface at the most sensitive spot in their receptive fields, whereas RAs respond only to the rate of change of curvature. Human discriminability of compliance of objects depends on whether the object has a deformable or rigid surface (Srinivasan and LaMotte, 1994). When the surface is deformable, the spatial pressure distribution within the contact region is dependent on object compliance, and hence information from cutaneous mechanoreceptors is sufficient for discrimination of subtle differences in compliance. When the surface is rigid, kinesthetic information is necessary for discrimination, and the discriminability is much poorer than that for objects with deformable surfaces. For deformable objects with rigid surfaces held in a pinch grasp, the JND for compliance is about 5-15 percent when the displacement range is fixed, increases to 22 percent when it is roved (varied randomly), and can be as high as 99 percent when cues arising out of mechanical work done are eliminated (Tan et al., 1992, 1993). 
Using a contralateral-limb matching procedure involving the forearm, Jones and Hunter (1992) have found that the differential thresholds for stiffness and viscosity are 23 and 34 percent, respectively. It has been found that a stiffness of at least 25 N/mm is needed for an object to be perceived as rigid by human observers (Tan et al., 1994).
In this section, we summarize the available quantitative research results on human haptics separately for tactile, kinesthetic, and motor systems, as well as results when all the three systems are involved under active touch conditions.
Tactile Sensory System
Humans can distinguish vibration sequences of up to 1 kHz through the tactile sense. The human threshold for the detection of vibration of a single probe is about 28 dB (relative to 1 µ peak) for 0.4 to 3 Hz; it decreases at the rate of -5 dB/octave for 3 to 30 Hz, and decreases further at the rate of -12 dB/octave for 30 to about 250 Hz, after which the threshold increases for higher frequencies. Spatial resolution on the finger pad, as measured by the localization threshold of a point stimulus, is about 0.15 mm, whereas the two-point limen is about 1 mm. Human detection thresholds for features on a smooth glass plate are a 2 µ high single dot and a 0.06 µ high grating.
Kinesthetic Sensory System
Humans can detect joint rotations of a fraction of a degree performed over about a second. The bandwidth of the kinesthetic system is estimated to be 20-30 Hz. The JND is about 2.5 deg for the finger joints, 2 deg for the wrist and elbow, and about 0.8 deg for the shoulder.
Motor System
Human bandwidth for limb motions is found to be a function of the mode of operation: 1-2 Hz for unexpected signals, 2-5 Hz for periodic signals, up to 5 Hz for internally generated or learned trajectories, and about 10 Hz for reflex actions. The differential thresholds for position and movement have been measured to be about 8 percent.
Active Touch Involving All Three Systems
The JND for length is about 1 mm for a reference length of 10 mm, and increases monotonically to 2-4 mm for a reference length of 80 mm. The JND for contact force is 5-15 percent of the reference force value. The maximum controllable force that can be exerted by a finger pad is about 100 N and the resolution in visually tracking constant forces is about 0.04 N or 1 percent, whichever is higher. The JND for compliance of deformable objects with rigid surfaces can range from 5 to 99 percent depending on the cues available to the human subject. A stiffness of at least 25 N/mm is needed for an object to be perceived as rigid by human observers. The differential threshold for viscosity sensed by activating the forearm is about 34 percent.
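Figures such as these can serve as rough acceptance checks when specifying an interface. The sketch below encodes two of them, the rigidity threshold and the force JND, as simple predicates. The function names are hypothetical, and because the text quotes the force JND only as a 5-15 percent range, the caller must choose the fraction explicitly.

```python
RIGID_STIFFNESS_N_PER_MM = 25.0   # minimum stiffness perceived as rigid
FORCE_JND_RANGE = (0.05, 0.15)    # force JND as a fraction of the reference


def feels_rigid(stiffness_n_per_mm):
    """True if a rendered stiffness meets the quoted rigidity threshold."""
    return stiffness_n_per_mm >= RIGID_STIFFNESS_N_PER_MM


def force_change_detectable(reference_n, comparison_n, jnd_fraction):
    """True if two forces differ by at least the chosen JND fraction."""
    low, high = FORCE_JND_RANGE
    if not low <= jnd_fraction <= high:
        raise ValueError("jnd_fraction outside the quoted 5-15 percent range")
    return abs(comparison_n - reference_n) >= jnd_fraction * reference_n


print(feels_rigid(10.0))                          # a 10 N/mm wall is too soft
print(force_change_detectable(10.0, 12.0, 0.15))  # 2 N change on 10 N
```

Using the upper end of the JND range gives a conservative test of whether a user would notice a force change; the lower end gives a conservative test of whether an unintended force error would go unnoticed.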
STATUS OF THE TECHNOLOGY
Terminology and Variables of Haptic Interfaces
Since haptic interfaces are devices composed of mechanical components in physical contact with the human body for exchange of information
with the human nervous system, it is natural to borrow the terms used in mechanics, human physiology, and robotics to describe the subsystems of the interfaces. In performing tasks with a haptic interface, the human user conveys desired motor actions by physically manipulating the interface, which, in turn, displays tactual sensory information to the user by appropriately stimulating his or her tactile and kinesthetic sensory systems. Thus, in general, haptic interfaces can be viewed as having two basic functions: (1) to measure the positions and contact forces (and time derivatives) of the user's hand (or other body parts) and (2) to display contact forces and positions (or their spatial and temporal distributions) to the user. Among these position (or kinematic) and contact force variables, the choice of which ones are the motor action variables (i.e., inputs to the computer) and which are the sensory display variables (i.e., inputs to the human) depends on the hardware and software design, as well as the tasks for which the interface is employed.
Although a force-reflecting haptic interface needs only to display forces, the sensing of forces by the interface (in addition to position sensing) is likely to be needed for several reasons. First, the presence of noise in the system, as well as the need to compensate for friction and inertia, requires closed-loop force control and hence force sensing. Second, the limitations on available VE technology make it necessary to achieve reconfigurability through changes in hardware as well as software (see below). In other words, a general-purpose VE system might need to augment the exoskeleton with a variety of hardware manipulanda, some of which would include force sensing. Third, in certain applications, it may be desirable to create nonnatural environments. For example, in certain cases it might be appropriate to use a fixed-position, force-sensing joystick together with a visual display of tactile information. Alternatively, one might find it helpful to employ a position-displaying joystick, with or without force sensing, to present certain kinds of spatial information (e.g., for guiding a passive hand through a maze).
Classification of Haptic Interfaces
A primary classification of haptic interactions with real environments or VEs that affects interface design can be summarized as follows: (1) free motion, in which no physical contact is made with objects in the environment; (2) contact involving unbalanced resultant forces, such as pressing an object with a finger pad; (3) contact involving self-equilibrating forces, such as squeezing an object in a pinch grasp. Depending on the tasks for which a haptic interface is designed, some or all of these elements will have to be adequately simulated by the interface. For example, grasping and moving an object from one location to another involves all three elements. The design constraints of a haptic interface are strongly dependent on which of these elements it needs to simulate. Consequently, the interfaces can be classified according to whether they are force-reflecting or not, as well as by what types of motions (e.g., how many degrees of freedom) and contact forces they are capable of simulating.
An alternative but important distinction in our haptic interactions with real environments or VEs is whether we touch, feel, and manipulate the objects directly or with a tool. The complexity in the design of a haptic interface is seriously affected by which of these two types of interactions it is supposed to simulate. Note that an ideal interface, designed to provide realistic simulation of direct haptic exploration and manipulation of objects, would be able to simulate handling with a tool as well. Such an interface would measure the position and posture of the user's hand, display forces to the hand, and make use of a single hardware configuration (e.g., an exoskeleton with force and tactile feedback) that could be adapted to different tasks by changes in software alone. For example, the act of grasping a hammer would be simulated by monitoring the position and posture of the hand and exerting the appropriate forces on the fingers and palm when the fingers and palm were in the appropriate position. However, the large number of degrees of freedom of the hand, extreme sensitivities of cutaneous receptors, together with the presence of mass, friction, and limitations of sensors and actuators in the interface, make such an ideal impossible to achieve with current technology. In contrast, an interface in the form of a tool handle, for which reconfigurability within a limited task domain is achieved through both hardware and software changes, is quite feasible. Thus, one of the basic distinctions among haptic interfaces is whether they attempt to approximate the ideal exoskeleton or employ the tool-handle approach.
Another set of important distinctions concerning haptic interfaces results from a consideration of the force display subsystems in an interface. Broadly speaking, force display systems can be classified as either ground-based, such as joysticks and other hand controllers, or body-based, such as gloves and exoskeletons. Frequently, the distinction between grounding sites is overlooked in the literature. For example, exploration or manipulation of a virtual object requires that force vectors be imposed on the user at multiple regions of contact with the object. Consequently, equal and opposite reaction forces are imposed on the interface. If these forces are self-equilibrating, as in simulating the contact forces that occur when we squeeze an object, then the interface need not be mechanically grounded. However, if the forces are unbalanced, as in pressing a virtual object with a single finger pad, the equilibrium of the interface requires that it be attached somewhere. A force-reflecting joystick attached to the floor would be a ground-based display, whereas a force-reflecting exoskeletal device attached to the user's forearm would be a body-based display. The grounding choice affects whether the user experiences throughout his or her entire body the stresses induced by contact with a virtual object. The consequences of using a body-based display to simulate contact forces that really stem from ground-based sources are not known and warrant investigation. A further example of improperly grounded displays occurs with most tactile stimulators. If a tactile stimulator array is attached to the finger pad via a strap surrounding the finger, then the net applied force by the stimulator is balanced by a reaction force on the back of the finger. Whether this reaction force can be distributed with a low enough pressure distribution to be imperceptible, and whether the absence of stresses throughout the rest of the musculoskeletal system is inconsequential, are not known. Although most devices built to date are either ground-based or body-based, hybrid interfaces that are a combination of the two (such as the Dextrous Teleoperation System Master built by Sarcos, Inc.) are also possible.
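The grounding argument can be made concrete with a small numerical check. The following Python sketch sums the contact forces that an interface must display and reports whether they are self-equilibrating (net force close to zero, so a body-based device could in principle suffice) or unbalanced (so the reaction must be carried to the ground or to another body site). The function names, the tolerance, and the force values are illustrative assumptions and do not describe any device mentioned here; net moments are ignored for simplicity.

```python
import math

def net_force(forces):
    """Sum a list of 3-D contact force vectors (N)."""
    return tuple(sum(f[i] for f in forces) for i in range(3))

def needs_grounding(forces, tol=1e-6):
    """Return True if the displayed forces are unbalanced.

    Assumption: only the force resultant is checked; a fuller
    treatment would also check the net moment.
    """
    fx, fy, fz = net_force(forces)
    return math.sqrt(fx * fx + fy * fy + fz * fz) > tol

# Pinch grasp: thumb and index finger push with equal and opposite forces.
pinch = [(0.0, 2.0, 0.0), (0.0, -2.0, 0.0)]
# Single finger pad pressing on a virtual surface.
press = [(0.0, 0.0, -1.5)]

print(needs_grounding(pinch))  # False
print(needs_grounding(press))  # True
```

In this simplified view, the pinch grasp corresponds to contact with self-equilibrating forces and the single-finger press to contact with unbalanced resultant forces.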
Haptic interface hardware for synthetic environments (SEs) is in the very early stages of development. Many of the devices available today have been motivated by needs predating those of VE technology. Simple position/motion-measuring systems have long been employed to provide control inputs to the computer. These have taken many forms, such as those that involve contact with the user without controlled force display (e.g., keyboards, computer mice, trackballs, joysticks, passive exoskeletal devices) and those that measure position/motion without contact (e.g., optical and electromagnetic tracking devices). Applications motivating development of these devices have ranged from the control of equipment (e.g., instruments, vehicles) to biomechanical study of human motion (e.g., gait analysis, time and motion studies). The requirements for position trackers and a variety of design approaches and devices are described in Chapter 5 on position tracking and mapping.
The early developments in force-displaying haptic interfaces were driven by the needs of the nuclear energy industry and others for remote manipulation of materials (Sheridan, 1992). The force-reflecting teleoperator master arms in these applications were designed to communicate to the operator information about physically real tasks. The recognition of the need for good-quality force displays by early researchers (Goertz, 1964; Hill, 1979) continues to be relevant to today's VE applications. Although Sutherland's (1965) pioneering description of VEs included force-reflecting interfaces, development of practical devices has proven to be difficult. The current state of kinematics, actuators, sensors, and control of master manipulators described in Chapter 9 on telerobotics is directly relevant to haptic interfaces.
A rough breakdown of major types of haptic interfaces that are currently available or being developed in laboratories and companies around the world is as follows:
flexible (gloves and suits worn by user)
rigid links (jointed linkages affixed to user)
shape memory actuators
Joysticks are probably the oldest of these technologies and were originally conceived to control aircraft. Even the earliest of control sticks, connected by mechanical wires to the flight surfaces of the aircraft, unwittingly presented force information about loads on the flight surfaces to the pilot. In general, they may be passive (not force reflecting), as in the joysticks used for cursor positioning, or active (force reflecting), as in many of today's modern flight-control sticks. For example, Measurement Systems Inc. has marketed several 2- and 3-DOF position-sensing joysticks, some of which can sense but not display force. Examples of force-reflecting 2-DOF joysticks designed for relatively high bandwidth are the AT&T mini-joystick (Schmult and Jebens, 1993) and one built in the MIT Newman Laboratory (Adelstein and Rosen, 1992).
Many of the force-reflecting hand controllers available today have been developed for the control of remote manipulators (Jacobus et al., 1992; Meyer et al., 1992). Generally, these devices employ at most 6 DOF (plus grip control) and have a wide range of performance qualities. Particularly good reviews of performance characteristics are found in Brooks (1990) and McAffee and Fiorini (1991), and a broad overview of the devices is available in Honeywell (1989). A great deal of work concerning ergonometrics (shape, switch placement, motion and force characteristics, etc.) has gone into the design of the hand grip of these devices (Brooks and Bejczy, 1985). One of the first applications of force-reflecting hand controllers to VEs was in project GROPE at the University of North Carolina (Brooks et al., 1990). The Argonne Mechanical Arm (ARM) was used successfully for force reflection during interactions with either simulations of molecule docking or with data from a scanning tunneling microscope. Recently, high-performance devices have been specifically designed for interaction with VEs. The MIT Sandpaper is a 2-DOF joystick that is capable of displaying virtual textures (Minsky et al., 1990). In Japan, desktop master manipulators have been built in Tsukuba (Iwata, 1990; Noma and Iwata, 1993). At the University of British Columbia, high-performance hand controllers have been developed by taking advantage of magnetic levitation technology (Salcudean et al., 1992). PER-Force is a 6-DOF hand controller that delivers high performance (Cybernet Systems, 1992). The PHANTOM, built in the MIT Artificial Intelligence Laboratory, is a multilink, low-inertia device that can convey the feel of virtual objects (Massie and Salisbury, 1994).
Sophisticated teleoperation masters have been built that can be used to feel and manipulate virtual objects as well. At Harvard, Howe (1992) has developed a teleoperation system with a two-finger master that can be used to execute precision tasks with a pinch grasp between the thumb and the index finger. One of the most complex force-reflecting devices built to date is the Dextrous Teleoperation System Master designed by Sarcos, Inc., in conjunction with the University of Utah's Center for Engineering Design and the Naval Ocean Systems Center (NOSC). Although it is primarily ground-based, by having attachment points at the forearm and upper arm of the user it has the advantages of an exoskeleton, such as a large workspace comparable to that of the human arm. This device utilizes high-performance hydraulic actuators to provide a wide dynamic range of force exertion at relatively high bandwidth on a joint-by-joint basis for 7 DOF. Another high-performance force-reflecting master is a ground-based system built by Hunter et al. (1990) to enable two-handed teleoperation of a microrobot that can meet the dual requirements of wide bandwidth (exceeding 1 kHz) and high accuracy (as low as a few nanometers). Improved versions of these devices have been built for teleoperated eye surgery and represent the state-of-the-art performance that can be achieved using currently available technology (Hunter et al., 1994).
Exoskeletal devices are characterized by the fact that they are designed to fit over and move with the limbs or fingers of the user. Because they are kinematically similar to the arm and hands that they monitor and stimulate, they have the advantage of the widest range of unrestricted user motion. As position-measuring systems, exoskeletal devices (gloves, suits, etc.) are relatively inexpensive and comfortable to use. The well-known VPL DataGlove and DataSuit use fiberoptic sensors to achieve a joint angle resolution of about a degree. The Virtex CyberGlove achieves a higher resolution of about half a degree by using strain gauges. EXOS and the Utah/MIT Dextrous Hand Master consist of rigid link exoskeletons and use Hall effect sensors to obtain a resolution of about 0.2 to 0.5 deg. Rigid link exoskeletons that provide force reflection in addition to joint angle sensing have also been designed and built. Shimoga (1992) provides an excellent review of these devices and design issues, including both human factors and technology. The Utah hand-wrist master (Jacobsen et al., 1989), the Rutgers Portable Dextrous Master (Burdea et al., 1992), the JPL Glove controller (Jau, 1992), the Tsukuba fingertip force display (Iwata et al., 1992), and the EXOS SAFIRE fall into this category of device. However, providing high-quality force feedback with such devices that is commensurate with human resolution is difficult and places great demands on actuator size minimization and control bandwidth.
While the display of net forces is appropriate for coarse object interaction, investigators have also recognized the need for more detailed displays within the regions of contact. In particular, the display of tactile information (e.g., force distributions for conveying information on texture and slip), though technically difficult, has long been considered desirable for remote manipulation (Bliss and Hill, 1971). Tactile display systems in the last two decades have been mostly used in conveying visual and auditory information to deaf and blind individuals (Bach-y-Rita, 1982; Reed et al., 1982). Display systems that attempt to convey information about contact use a variety of techniques. Shape-changing displays convey the local shape of contact by controlling the deformation or forces distributed on the skin. This has been accomplished by an array of stimulators actuated by DC solenoids (Frisken-Gibson et al., 1987), shape memory alloys (TiNi, 1990), and compressed air. The use of a continuous surface actuated by an electrorheological fluid has been proposed by Monkman (1992). Vibrotactile displays deliver mechanical energy through an array of vibrating pins placed against the skin. The Opticon, marketed by Telesensory Systems, and the Bagej Corporation tactile stimulator belong to this class. The EXOS touch master consists of a single voice coil vibrator. A particularly promising desktop tactile array capable of high performance as both a shape changer and a vibrator over 0 to several hundred Hz is being developed at Johns Hopkins University (Schneider, 1988). Electrotactile displays stimulate the skin through surface electrodes. A review of principles and technical issues in vibrotactile and electrotactile displays can be found in Kaczmarek and Bach-y-Rita (1993). Various types of tactile display devices mentioned above are reviewed by Shimoga (1992).
In general, haptic interfaces receive motor action commands from the human user and display appropriate tactual "images" to the user. Tactual images consist of force and displacement fields to be imposed on the observer in order to simulate the observer's desired mechanical interactions with objects in the VE. In general, these images stimulate both tactile and kinesthetic information channels in the observer and are driven by the actions of the observer. Major components of the information conveyed are the mode of contact with the objects (e.g., indentation, slip), mechanical properties of the objects (e.g., texture, shape, compliance), as well as the motions and forces involved in exploration and manipulation in a VE.
Since haptic interfaces for interacting with VEs are in the early stages of development, there is very little software that has been specifically designed for generating tactual images. Commercially developed codes necessary for using position trackers of various manufacturers are available. However, for force-reflecting devices, as in the case of hardware, most of the software has been developed in the context of teleoperation or controlling autonomous robots. Several research laboratories have developed VE systems with visual and haptic displays achieved through appropriate integration of mechanistic models of virtual objects and control of haptic interfaces for rendering tactual images with software used to drive the visual images. For example, the PHANTOM interface developed in the MIT Artificial Intelligence Laboratory has been used to tactually display the forces of contact of a stylus held in the user's hand with a variety of static and dynamic virtual objects in synchrony with visual images of the objects and their motion.
Similar to the software needed to generate visual images (Chapter 8), the software necessary to generate tactual images can be classified into three major groups: haptic interaction software, physical models of virtual objects and environments, and software for rendering tactual images. Haptic interaction software mainly consists of reading the state of the haptic interface device. For example, the signal conditioning and noise reduction software necessary for reading position or force sensors would fall within this category. In the case of exoskeletal devices used for tracking hand posture, a higher-level software based on the human kinematic model of the hand is needed as well for interpreting the sensor signals as corresponding to a hand posture.
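As an illustration of the signal conditioning mentioned above, the following Python sketch applies first-order exponential smoothing to a stream of raw position samples. It is a minimal example under stated assumptions: the function name `low_pass`, the smoothing factor, and the sample values are hypothetical, and a real interface would choose its filter from the sensor noise characteristics and the bandwidth the task requires.

```python
def low_pass(samples, alpha=0.2):
    """First-order exponential smoothing of raw sensor samples.

    alpha in (0, 1]: smaller values smooth more but add more lag.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    filtered = []
    state = None
    for x in samples:
        # The first sample initializes the filter state.
        state = x if state is None else state + alpha * (x - state)
        filtered.append(state)
    return filtered

# Hypothetical noisy position readings (mm) around a true value of 10 mm.
raw = [10.0, 10.4, 9.7, 10.2, 9.9, 10.1]
smooth = low_pass(raw, alpha=0.5)
print(smooth)
```

The trade-off is typical of such filters: stronger smoothing reduces noise but delays the signal, and delay in a force-reflecting loop can itself degrade the quality of the display.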
Physical models of virtual objects and environments receive the user's commands through the sensors in the haptic interface and generate force or displacement outputs corresponding to the physical behavior of a simulated object in the VE. As mentioned in the section on world modeling in Chapter 8, this can either be accomplished by a unified model for all the modalities (visual, haptic, acoustic) or through separate models for each modality together with correlation algorithms for consistency among the displays corresponding to each of the modalities. However, the computations needed for the former approach (e.g., involving finite element methods) tend to be extremely intensive and are difficult to complete in real time, even when one uses supercomputers. Simplifications in generating multimodal images are necessary, not only because of the computational difficulties, but also because the display devices at present have limited capabilities. Therefore, even though the physics governing the visual, haptic, and acoustic behavior of an object is the same, different approximations might be needed for each of the modalities. For example, visual images are scalar, two-dimensional projections of the objects, whereas tactual images are, in general, three-dimensional vector fields. For realistic visual images, all the objects within the visual field need to be displayed and, typically, each object needs to appear as a continuous two-dimensional projection. In the case of tactual images, often only the display of forces within isolated contact regions is sufficient. Also, lumped-parameter models that approximate a continuum through discrete elements may be good enough to generate inputs to the haptic rendering devices. However, these force fields are tightly coupled to the user's actions as well as the mechanical properties of the soft human tissues in contact with the interface device. The mechanics of interaction between the observer and the environment plays a fundamental role in the generation of tactual images. Models of the human operator's behavior and performance developed in teleoperation literature are applicable to VEs as well (Sheridan, 1992).
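A very simple lumped-parameter model of the kind referred to above is a one-dimensional virtual wall represented by a single spring and damper. The Python sketch below is one possible instance, not a model taken from any system described here; the function name `wall_force` and the stiffness and damping values are hypothetical.

```python
def wall_force(position, velocity, wall=0.0, stiffness=2000.0, damping=5.0):
    """Force (N) from a one-dimensional virtual wall occupying x < wall.

    position (m) and velocity (m/s) describe the user's fingertip or
    tool tip. stiffness (N/m) and damping (N*s/m) are hypothetical.
    """
    penetration = wall - position
    if penetration <= 0.0:
        return 0.0  # free motion: no contact, no force
    force = stiffness * penetration - damping * velocity
    return max(force, 0.0)  # the wall can push but never pull

print(wall_force(0.010, 0.0))    # outside the wall
print(wall_force(-0.001, 0.0))   # 1 mm inside, at rest
print(wall_force(-0.001, -0.1))  # 1 mm inside, still moving inward
```

Even this single discrete element distinguishes free motion from contact, and it shows how the displayed force depends directly on the user's own position and velocity.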
The software for rendering the tactual images receives the output of the physical model and generates the commands needed to drive the interface device. In the case of the Sandpaper, a 2-DOF joystick capable of force display (Minsky et al., 1990), the authors report success in conveying the feel of exploring rough surfaces by using a simple rule that contact forces to be displayed are proportional to the local gradient of the textured surface. Even when such simple algorithms generate the tactile images, if the user has in addition visual or auditory inputs that are consistent, it is possible that the interactions with VEs will seem sufficiently realistic to him or her. Therefore, the algorithms for the generation of tactual images depend strongly on the particular application as well as the capabilities of the display device, including the available computational speed. Because force displays are prone to mechanical instabilities and human users are sensitive to even low disturbances unrelated to the task, real-time control of the interface devices needs to be of high quality. In the robotics and teleoperation literature (Chapter 9), considerable effort has been directed at implementing conventional proportional-integral-derivative (PID) controllers for contact tasks. Impedance control techniques (reviewed by Brooks, 1990) and the use of the passivity principle have been reported to be successful in combatting instabilities. Substantial theoretical research is currently being pursued in the areas of multivariable control and advanced nonlinear techniques, such as adaptive and robust control.
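The gradient rule reported for the Sandpaper can be sketched in a few lines of Python. This is an illustration of the general idea only, not the authors' implementation: the sinusoidal height function, the gain, the finite-difference step, and the choice that the force opposes uphill motion are all assumptions made for the example.

```python
import math

def height(x, y):
    """Hypothetical texture: a sinusoidal grating along x (heights in mm)."""
    return 0.5 * math.sin(2.0 * math.pi * x / 4.0)

def gradient(h, x, y, eps=1e-4):
    """Central-difference estimate of the local slope of h at (x, y)."""
    dhdx = (h(x + eps, y) - h(x - eps, y)) / (2.0 * eps)
    dhdy = (h(x, y + eps) - h(x, y - eps)) / (2.0 * eps)
    return dhdx, dhdy

def texture_force(h, x, y, gain=1.0):
    """Lateral force proportional to the local gradient of the surface.

    Assumption: the force opposes uphill motion (negative gain times slope).
    """
    dhdx, dhdy = gradient(h, x, y)
    return -gain * dhdx, -gain * dhdy

# On the rising flank of a ridge the force pushes back along -x;
# at the crest (x = 1) the slope, and hence the force, vanishes.
print(texture_force(height, 0.0, 0.0))
print(texture_force(height, 1.0, 0.0))
```

A rule of this kind needs only the joystick position at each update, which is part of why it is attractive when computational speed is limited.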
Summary of Current Technology and Future Possibilities
Computer keyboards, mice, and trackballs are the simplest haptic interfaces and are being widely used to interact with computers. Position-sensing gloves and exoskeletons without force reflection are also available on the market but are used mainly for research purposes. Among the force-reflecting devices, ground-based devices such as joysticks are being used, and modified versions of such devices for different tool handles are feasible in the near future. Force-reflecting exoskeletons are harder to design for adequate performance, and only a few such have been built for research purposes. Tactile displays offer particularly difficult design challenges because of the high density of receptors in the skin to which they must apply the stimulus. There exist a number of examples of tactile stimulators for the finger, including pneumatic shape changers, electrocutaneous stimulators, and vibrating arrays, but none provides convincing tactile images and all are awkward to use (Durlach et al., 1992).
The emerging field of microelectromechanical systems (MEMS) holds promise for providing very fine arrays of tactile stimulators. Arrays of surface-normal, electrostatic actuators currently being developed for sensors could be adapted for use in high-resolution tactile displays (Trimmer et al., 1987). Although capable of relatively small forces and deflections, arrays of such actuators integrated with addressing electronics would be inexpensive, lightweight, and compact enough to be worn without significantly impeding hand movement or function. In addition, the current technology makes feasible a 20 × 20 array of individually controlled stimulators on a 1 cm × 1 cm chip. Finally, recent work on thin-film, shape-memory alloys would enhance the attractiveness of shape-changing displays by increasing stimulator densities and actuation bandwidths. It should be noted that with synchronized multimodal stimulation, such as for simulating the contact between a tool and a rigid object, more realism can probably be achieved by providing an audible "ping" together with low bandwidth force feedback, than by improving the force bandwidth to the maximum value that is possible with current technology. Because of the difficulties in developing good cutaneous stimulator devices, initial efforts on haptic displays should probably focus on devices that apply net forces on the hand or fingertips (the tool-handle approach discussed above). Even with this simplification, large improvements on existing devices can be achieved only by a proper match between the performance of the device and human haptic abilities.
Due to inherent hardware limitations, haptic interfaces can deliver only stimuli that approximate our interactions with the real environment. It does not, however, follow that synthesized haptic experiences created through the haptic interfaces necessarily feel unreal to the user. Consider an analogy with the synthesized visual experiences obtained while watching television or playing a video game. Whereas visual stimuli in the real world are continuous in space and time, these visual interfaces project images at the rate of about 30 frames/s. Yet we experience a sense of realism and even a sense of telepresence because we are able to exploit the limitations of the human visual apparatus. The hope that the necessary approximations in generating synthesized haptic experiences will be adequate for a particular task is based on the fact that the human haptic system has limitations that can be similarly exploited. To determine the nature of these approximations or, in other words, to find out what we can get away with in creating synthetic haptic experiences, quantitative human studies are essential. Basic understanding of the biomechanical, sensorimotor, and cognitive abilities of the human haptic system is critical for proper design specification of the hardware and software of haptic interfaces. In addition, all mechanical devices will have their own intrinsic properties (such as friction, mass, compliance, viscosity, time delay) that will necessarily be interposed between the user and the desired stimulation. This lack of perfect transparency will always be present to some degree and will thus make all stimulators less than ideal. Given the approximate nature of synthetic haptic stimulation, it is clear that there is a need to assess which types of stimulation provide the most useful and profound haptic cues for the task at hand.
Compared with the visual and auditory domains, the capabilities of haptic devices and our understanding of human haptics are quite limited. A comprehensive program to develop a variety of haptic interfaces for VEs and teleoperation needs to include research in three major areas: (1) human haptics, (2) technology development, and (3) matching the performance of humans and haptic devices. This does not mean, however, that such research has to precede any usage of haptic devices. For applications that are simple from a haptic standpoint, such as those requiring relatively low-resolution hand position information, joysticks and gloves currently available off the shelf can be sufficient. More complex applications involving force and tactile displays might need research in some or all of the areas mentioned above. Since progress in the three areas is interdependent, the desirable course of development for a challenging application is to continually build improved versions of haptic devices based on experimental data obtained from the previous versions on the performance of humans and devices and on the interaction between the two. Due to the availability of powerful computers and high-precision mechanical sensors and actuators, it is now possible to exert control over experimental variables as never before.
As mentioned above, the biomechanical, sensorimotor, and cognitive abilities of humans set the design specifications for devices. Therefore, multidisciplinary studies involving biomechanical and psychophysical experiments together with computational models for both are needed in order to have a solid scientific basis for device design. Perhaps to a lesser extent, neurophysiological studies concerning peripheral and central neural representations and the processing of information in the human haptic system will also aid in design decisions concerning the kinds of information that need to be generated and how these should be displayed. A major barrier to progress from the perspectives of biomechanics, psychophysics, and neuroscience has been the lack of robotic stimulators capable of delivering a large variety of stimuli under sufficiently precise motion and force control.
The tight mechanical coupling between the human skin and haptic interfaces strongly influences the effectiveness of the interface. Therefore, the specifications for the design of sensors and actuators in the interface, as well as the control algorithms that drive the interface, require the determination of surface and bulk properties of, say, the finger pad. The measurement of force distributions within the contact regions with real objects is needed to determine how a display should be driven to simulate such contacts in VEs. In addition, computational models of the mechanical behavior of soft tissues will aid in simulating the dynamics of task performance for testing control algorithms, as well as in determining the required task-specific force distributions for the displays. This requires measurement of the in vivo skin and subcutaneous soft tissue response to time-varying normal and tangential loads. Information on such human factors as the size, shape, degrees of freedom, and ranges of motion of the fingers, hand, and arm is generally available in handbooks.
Determination of the basic sensorimotor and cognitive abilities of the human haptic system needed for developing haptic interfaces can be subdivided as follows:
Sensing and control of contact forces and joint angles or end-point displacements: Even simple questions concerning our abilities (such as what is the resolution, range, and bandwidth in the sensing and control of interface variables) or mechanisms (such as how we perceive joint angles or contact forces) do not yet have unequivocal answers.
Perception of contact conditions and object properties: The important connection between the loads imposed on the skin surface within the regions of contact with objects and the corresponding perception has only begun to be addressed. Psychophysical experiments directed at determining the primary cues that signal various object properties need to be undertaken.
Integration of local contact information with nonlocal perception of the environment: Tactual perception typically provides local information about an object. To be effective in training tasks, such as cockpit familiarization, that information must be integrated into nonlocal perception of the space within which the hand and arm move. However, haptic perception of mechanical quantities has been found to be significantly distorted (Fasse, 1992; Hogan et al., 1990). The relationship between these haptic distortions and human internal perceptual models of space and the objects in it is unknown. The influence of these distorted perceptions on production of motor behavior has barely been addressed. The theoretical framework to generate testable hypotheses must be built on a fundamental understanding of the relations between haptic perception of geometric and mechanical quantities, such as magnitudes and orientations of lengths, forces, and stiffnesses. Experimentally verified models of the relationship between haptic perceptions and motor actions are critical for the design of effective synthetic haptic environments. Similar studies need to be performed under multimodal conditions as well.
Performance in the presence of inherent time delays, distortions, and noise: These experiments are needed for all modalities individually and in combination. Studies directed at sensorimotor and cognitive adaptation and training effects are needed.
Theoretical developments concerning information flow: Theoretical developments concerning the task-specific flow of sensory information and control of motor action are needed to generate testable hypotheses on our haptic interactions with both real environments and VEs. Development of improved models of human operator behavior and performance (available in the teleoperation literature) through tests in realistic tasks would be beneficial in both the design and operation of SE systems.
Four areas of hardware development are of interest: (1) finger, hand and arm position/joint angle measurement (trackers); (2) displays of forces and torques; (3) tactile displays; and (4) other stimulus distributions applied as two-dimensional fields to the skin, such as thermal stimuli.
The major problems with the position/angle-measuring devices are the intrusion the user feels while wearing, say, an exoskeleton, and the ever-present need for improvements in ranges, resolutions, and bandwidths. In order to display forces, designs with good actuation and control need to be developed such that they have sufficient force range, resolution, smoothness, and bandwidth. Attention needs to be paid to the friction, backlash, mechanical stiffness, apparent mass, inertia, and natural frequencies of the devices. High position resolution is needed to minimize the effect of quantization errors on stability of contact interaction. Force feedback systems need to have vibration rigorously controlled to prevent false cues to the human user. In order to achieve such high performance without mechanical instabilities, robust and adaptive closed-loop control of the devices is necessary. The mechanics of the devices must be intrinsically correct so that the difficult problems of compensating for the mass and inertia of the control arm are avoided or minimized.
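The link between position resolution and the quality of contact display can be illustrated with simple arithmetic. Assuming that a contact is rendered as a force equal to a stiffness times the measured penetration (a simplification made only for this example), one count of position quantization appears to the user as a discrete jump in force. The Python sketch below computes that jump; the function names and the numerical values are hypothetical.

```python
def force_step(stiffness_n_per_mm, resolution_mm):
    """Smallest force jump (N) caused by one position-quantization step."""
    return stiffness_n_per_mm * resolution_mm

def quantize(position_mm, resolution_mm):
    """Round a position to the nearest sensor count."""
    return round(position_mm / resolution_mm) * resolution_mm

# Hypothetical rendered stiffness of 10 N/mm at two sensor resolutions.
coarse = force_step(10.0, 0.1)    # about 1 N per count
fine = force_step(10.0, 0.01)     # about 0.1 N per count
print(coarse, fine)
print(quantize(0.234, 0.1))
```

For a given rendered stiffness, a tenfold improvement in position resolution reduces the force discontinuity tenfold, which is one reason stiff virtual contacts place high demands on position sensing.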
Although many of the design specifications for haptic interfaces are task-dependent, we can estimate some of the interface performance requirements based on human haptic abilities. For example, since the human finger joint angle JND is on the order of a degree, the fingertip position resolution is about 1 mm. For the haptic interface to perform well, its fingertip position display resolution should probably be about 0.1 mm, and the bandwidth should be about 30 Hz to match the estimated human kinesthetic bandwidth. The maximum stiffness of the actuators should be in excess of 25 N/mm to have realistic simulation of contact with rigid stationary objects. To fully match human haptic sensory capabilities, the tactile or force displays should have a bandwidth of about 1 kHz, whereas the signals representing the human motor action need to have a bandwidth of only 10 Hz. In order to prevent false cues to the user, vibrations that are not part of the intended display should have amplitudes less than the human detection threshold, which is about 25 µm at 0.4 to 3 Hz, 3 µm at 30 Hz, 0.3 µm at 250 Hz, and is higher for higher frequencies. For tactile displays, the spacing of actuating elements should be about 1 mm per taxel or finer to match the human tactile resolution. To realistically simulate continuous surfaces of virtual objects, the actuating arrays need to be even more densely packed, or should have a continuous surface over them, because of the high sensitivity of the tactile sensory system to point loads and sharp edges. It should be noted that when visual and/or auditory senses are also stimulated, haptic interfaces with lower performance capabilities than the above estimates may be adequate.
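The estimates above can be turned into simple checks. The following Python sketch is illustrative only and rests on stated assumptions: a finger length of 60 mm (not given in the text), log-log interpolation between the quoted vibration thresholds, and a simple virtual spring for the contact model. All function names are hypothetical.

```python
import math

# Detection thresholds quoted in the text: (frequency in Hz, amplitude in micrometres).
# The 0.4 to 3 Hz band is represented by its two end points.
THRESHOLDS_UM = [(0.4, 25.0), (3.0, 25.0), (30.0, 3.0), (250.0, 0.3)]


def fingertip_resolution_mm(joint_jnd_deg, finger_length_mm):
    """Arc length swept at the fingertip by one joint-angle JND."""
    return finger_length_mm * math.radians(joint_jnd_deg)


def force_step_n(stiffness_n_per_mm, position_resolution_mm):
    """Force jump produced by one position quantum on a virtual spring
    (assumes force = stiffness * penetration, an illustrative model)."""
    return stiffness_n_per_mm * position_resolution_mm


def detection_threshold_um(freq_hz):
    """Threshold between the quoted points, interpolated on log-log axes.
    The interpolation scheme is an assumption, not taken from the text."""
    if not THRESHOLDS_UM[0][0] <= freq_hz <= THRESHOLDS_UM[-1][0]:
        raise ValueError("frequency outside the range quoted in the text")
    for (f0, a0), (f1, a1) in zip(THRESHOLDS_UM, THRESHOLDS_UM[1:]):
        if f0 <= freq_hz <= f1:
            t = (math.log(freq_hz) - math.log(f0)) / (math.log(f1) - math.log(f0))
            return math.exp(math.log(a0) + t * (math.log(a1) - math.log(a0)))


def vibration_is_imperceptible(freq_hz, amplitude_um):
    """True if an unintended vibration stays below the detection threshold."""
    return amplitude_um < detection_threshold_um(freq_hz)
```

With the assumed 60 mm finger, a 1 degree JND corresponds to roughly 1 mm at the fingertip, consistent with the estimate in the text. Under the spring assumption, a 25 N/mm wall rendered with 0.1 mm position resolution changes force in steps of about 2.5 N, which suggests why finer position resolution helps contact stability.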
Exploration of novel technologies is needed for quantum improvements in rotary and linear actuators. Use of shape memory alloys (SMAs) and microelectromechanical systems (MEMS) for tactile displays also needs to be investigated further. It has been estimated that real-time mechanical interactions with typical finite element models need computational speeds on the order of Gflops (Hunter et al., 1990). Similar to graphics engines used commonly with visual displays, special computational hardware specifically designed to accelerate the computations needed for haptic displays will become necessary in the near future.
Modeling of the haptic environment and control of real-time interactions, together with synchronous operation of other sensory modalities, is a major need in software development that requires substantial research. What needs to be modeled, and how interaction and display should be handled, is task dependent. Trade-offs between precision and computational speed are critical. Standard methods for easily implementing physical models that range from high fidelity to coarse approximations need to be developed. In addition, models of the human operator, the environment, and interaction dynamics available in the teleoperation literature need to be adapted and improved for VE applications.
Simulation of multibody environments will be possible only if we address computational efficiency and appropriate architectures for modeling and maintaining a mechanical world. It is likely that this problem is much harder than simple graphic simulations. Some parallels exist, such as texture, collision detection, and simulation of object dynamics, but to feel right a world model for haptic display may need to run substantially faster, at least at the points of contact between the user and the synthetic environment. Real-time control algorithms are available to render the calculated outputs of the models to the human user through tactual displays. However, in order for the displays to be robust and feel right, the control bandwidths need to reach frequencies on the order of several kHz. Efficient methods of implementing the control software need to be developed, including the use of special hardware, such as transputers connected in parallel. Also, theoretical advances in multivariable control and advanced nonlinear techniques, such as adaptive and robust control, are needed.
Matching Performance of Humans and Haptic Devices
Making the human user comfortable when wearing or interacting with haptic interfaces is of paramount importance, since pain, or even discomfort, supersedes all other sensations. Appropriate attachment methods for ground-based and body-based haptic interfaces need to be developed. Design principles for achieving device kinematics and dynamics that impose minimal constraints or bias on the operator's hand and arm movements need to be explored.
Methods of Stimulation
The right balance of complexity and performance in system capabilities is generally task dependent. In particular, the fidelity with which the tactual images have to be displayed and the motor actions have to be sensed by the interface depends on the task, stimulation of other sensory modalities, and interaction between the modalities. Experimenting with the available haptic interfaces, in conjunction with visual and auditory interfaces, is necessary to identify the needed design improvements. Design compromises and tricks for achieving the required task performance capabilities or telepresence (immersion) need to be investigated. One of the tricks might be the use of illusions (such as visual dominance) to fool the human user into believing a less than perfect multimodal display. Techniques such as filtering the user's normal tremor or the use of sensory substitution within a modality (e.g., the use of tactile display to convey kinesthetic information) or among different modalities (e.g., visual display of a force) need to be developed to overcome the limitations of the devices and the limitations of the human user, perhaps to achieve supernormal performance. To tackle the ever-present time delays, efficient and reliable techniques for running model-based and real-time controls concurrently are needed.
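As one illustration of filtering the user's tremor, the Python sketch below applies a first-order low-pass filter to sampled hand positions. It is a minimal sketch, not a recommended design: the function name, the sample rate, and the cutoff frequency are all assumptions, and a practical system would have to choose the cutoff so that intended motion is not attenuated along with the tremor.

```python
import math


def low_pass(samples, sample_rate_hz, cutoff_hz):
    """First-order (single-pole) low-pass filter over a list of position samples.

    Slow, intended motion passes nearly unchanged, while components well
    above cutoff_hz are attenuated.
    """
    if not samples:
        return []
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)       # smoothing factor, between 0 and 1
    filtered = []
    y = samples[0]               # start from the first sample to avoid a jump
    for x in samples:
        y += alpha * (x - y)     # move a fraction alpha toward the new sample
        filtered.append(y)
    return filtered
```

For example, with an assumed 1 kHz sample rate and a 2 Hz cutoff, a 10 Hz oscillation is reduced to roughly one fifth of its amplitude, while a constant position is returned unchanged. The filter also adds lag, which matters given the time-delay concerns raised above.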
Evaluation of Haptic Interfaces
Evaluation of haptic interfaces is crucial to judge their effectiveness and to isolate aspects that need improvement. However, such evaluations performed in the context of teleoperation have been so task-specific that it has been impossible to derive useful generalizations and to form effective theoretical models based on these generalizations. There is a strong need to specify a set of elementary manual tasks (basis tasks) that can be used to evaluate and compare the manual capabilities of a given system (human, robotic, VE) efficiently. Ideally, this set of basis tasks should be such that (1) knowledge of performance on these tasks enables one to predict performance on all tasks of interest and (2) it is the minimal set of tasks (in terms of time consumed to measure performance on all tasks in the set) that has this predictive power.
Two basic psychophysical questions in evaluation are: (1) With a given set-up, how good is the task performance or realism of the subjective experience? (2) How does a change in the set-up improve the performance of a given task, realism of the experience, or both? An example of the former is the investigation of the consequences of using an ungrounded display to simulate contact forces that really stem from grounded sources. In the latter question, the word change is to be interpreted in a broad sense and includes modifications of the interface hardware, object models, interaction software, and addition/subtraction of visual or auditory modalities. Theoretical and experimental approaches to quantify information transfer rates to and from the user under various single and multimodal conditions need to be developed.
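One common way to quantify information transfer in an identification experiment is to estimate the mutual information between stimuli and responses from a confusion matrix. The Python sketch below shows that estimate under stated assumptions: the function names are hypothetical, the matrix holds raw trial counts with rows as stimuli and columns as responses, and no correction is made for the bias that small sample sizes introduce.

```python
import math


def information_transfer_bits(confusion):
    """Estimate of information transfer (bits per trial) from a
    stimulus-by-response matrix of trial counts."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no trials")
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    bits = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing
                bits += (count / total) * math.log2(
                    count * total / (row_sums[i] * col_sums[j])
                )
    return bits


def transfer_rate_bits_per_s(bits_per_trial, seconds_per_trial):
    """Convert a per-trial estimate into a rate, given the mean trial duration."""
    return bits_per_trial / seconds_per_trial
```

Four stimuli that are always identified correctly yield 2 bits per trial, whereas responses that are independent of the stimulus yield 0 bits. Comparing such estimates across single and multimodal conditions is one possible way to express the effect of a change in the set-up.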