This chapter reviews issues and needs in telerobotics. A telerobot is defined for our purposes as a robot controlled at a distance by a human operator, regardless of the degree of robot autonomy. Sheridan (1992c) makes a finer distinction, which depends on whether all robot movements are continuously controlled by the operator (manually controlled teleoperator), or whether the robot has partial autonomy (telerobot and supervisory control). By this definition, the human interface to a telerobot is distinct and not part of the telerobot. Haptic interfaces that mechanically link a human to a telerobot nevertheless share similar issues in mechanical design and control, and the technology survey presented here includes haptic interface development.
Telerobotic devices are typically developed for tasks or environments that are too dangerous, uncomfortable, limiting, repetitive, or costly for humans to perform directly. Some applications are listed below:
Underwater: inspection, maintenance, construction, mining, exploration, search and recovery, science, surveying.
Space: assembly, maintenance, exploration, manufacturing, science.
Resource industry: forestry, farming, mining, power line maintenance.
Process control plants: nuclear, chemical, etc., involving operation, maintenance, decommissioning, emergency.
Military: operations in the air, undersea, and on land.
Medical: patient transport, disability aids, surgery, monitoring, remote treatment.
Construction: earth moving, building construction, building and structure inspection, cleaning and maintenance.
Civil security: protection and security, firefighting, police work, bomb disposal.
This chapter is divided into five sections, which represent one way of categorizing past developments in telerobotics:
A recent survey including these and other topics is provided by Sheridan (1992a).
Relation to Robotics
Telerobots may be remotely controlled manipulators or vehicles. The distinction between robots and telerobots is fuzzy and a matter of degree. Although the hardware is the same or is similar, robots require less human involvement for instruction and guidance than do telerobots. There is a continuum of human involvement, from direct control of every aspect of motion, to shared or traded control, to nearly complete robot autonomy.
Any robot manipulator can be hooked up to a haptic interface and hence become a telerobot. Similarly, any vehicle can be turned into a teleoperated mobile robot. There are many examples in the literature of different industrial robots that have been used as telerobots, even though that was not the original intended use. For example, a common laboratory robot, the PUMA 560, has frequently been teleoperated (Funda et al., 1992; Hayati et al., 1990; Kan et al., 1990; Lee et al., 1990; Salcudean et al., 1992). There have also been a number of telerobots specifically designed as such, often with a preferred haptic interface. The design issues for robots, telerobots, and haptic interfaces are essentially the same (although Pennington, 1986, seeks to identify differences). Often telerobots have to be designed for hazardous environments, which require special characteristics in the design. Industrial robots have most often been designed for benign indoor environments.
Why don't we do everything with robots, rather than involve humans in telerobotic control? We can't, because robots are not that capable. Often there is no substitute for human cognitive capabilities for planning
and human sensorimotor capabilities for control, especially for unstructured environments. In telerobotics, these human capabilities are imposed on the robot device. The field of robotics is not that old (35 years), and the task of duplicating (let alone improving upon) human abilities has proven to be an extremely difficult endeavor; it would be disturbing if it were not so. There is a tendency to overextrapolate from the few superior robot abilities, such as precise positioning and repetitive operation. Yet robots fare poorly when adaptation and intelligence are required. They do not match the human sensory abilities of vision, audition, and touch, human motor abilities in manipulation and locomotion, or even the human physical body in terms of compact and powerful musculature that adapts and self-repairs, and especially in terms of a compact and portable energy source. Hence in recent years many robotics researchers have turned to telerobotics, partly out of frustration.
Nevertheless, the long-term goal of robotics is to produce highly autonomous systems that overcome hard problems in design, control, and planning. As advances are made in robotics, they will feed through to better and more independent telerobots. For example, much of the recent work in low-level teleoperator control is influenced by developments in robot control. Often, the control ideas developed for autonomous robots have been used as the starting points for slave, and to a lesser extent, master controllers. Advances in high-level robot control will help in raising the level of supervisory control.
Yet the flow of advances can go both ways. By observing what is required for successful human control of a telerobot, we may infer some of what is needed for autonomous control. There are also unique problems in telerobotic control, having to do with the combination of master, slave, and human operator. Even if each individual component is stable in isolation, when hooked together they may be unstable. Furthermore, the human represents a complex mechanical and dynamic system that must be considered.
Relation to Virtual Environments
Telerobotics encompasses a highly diversified set of fundamental issues and supporting technologies (Vertut and Coiffet, 1985a, 1985b; Todd, 1986; Engelberger, 1989; Sheridan, 1992b). More generally, telerobots are representative of human-machine systems that must have sufficient sensory and reactive capability to successfully translate and interact within their environment. The fundamental design issues encountered in the field of telerobotics, therefore, have significant overlap with those that are and will be encountered in the development of veridical virtual environments (VEs). Within the virtual environment, the human-machine system
must allow translation of viewpoint, interaction with the environment, and interaction with autonomous agents. All this must occur through mediating technologies that provide sensory feedback and control. The human-machine interface aspects of telerobotic systems are, therefore, highly relevant to VE research and development from a device, configuration, and human performance perspective.
Yet the real-environment aspect of telerobotics distinguishes it from virtual environments to some extent. Telerobots must:
interact in complex, unstructured, physics-constrained environments,
deal with incomplete, distorted, and noisy sensor data, including limited views, and
expend energy which may limit action.
The corresponding features of virtual environments are more benign:
Form, complexity, and physics of environments are completely controllable.
Interactions based on physical models must be computed.
Virtual sensors can have an omniscient view and need not deal with noise and distortions.
The ability to move within an environment and perform tasks is not energy-limited.
Despite such simplifications, virtual environments play an important role in telerobotic supervisory control. A large part of the supervisor's task is planning, and the use of computer-based models has a potentially critical role. The virtual environment is deemed an obvious and effective way to simulate and render hypothetical environments to pose ''what would happen if" questions, run the experiment, and observe the consequences. Simulations are also an important component of predictive displays, which represent an important method of handling large time delays. VE research and development promises to revolutionize the field of multimodal, spatially oriented, interactive human-machine interface technology and theory to an extent that has not been achievable in the robotics field. The two fields should therefore not be viewed as disparate but rather as complementary endeavors whose goals include the exploration of remote environments and the creation of adaptable human-created entities.
This section reviews remote manipulators from standpoints of kinematics, actuation, end effectors, and sensors. Specific examples of robots
and telerobots in this review will tend to be drawn from more recent devices; some of the older telerobots are reviewed in Vertut and Coiffet (1985a). A review with similar categories is provided by Sheridan (1992a).
Kinematics
In this section we describe the number of joints and their geometrical layout. Some of the issues within kinematics are discussed below.
General Positioning Capabilities
A manipulator requires at least 6 degrees of freedom (DOFs) to achieve arbitrary positions and orientations. When a manipulator has exactly 6 DOFs, it is said to be general purpose. Examples include many industrial robots, such as the PUMA 560, as well as a number of commercial telerobots (Kraft, Shilling, Western Electric, ISE). The space shuttle's Remote Manipulator System (RMS), designed by Spar Aerospace, is another example.
If there are fewer than 6 DOFs, the device is said to be overconstrained. Often a task will require fewer DOFs, such as positioning (x, y) and orienting (a rotation θ about the normal z axis) within a plane (a 3-DOF task). Another popular example is the SCARA robot geometry with 4 DOFs; the motions are planar, with an extra translation in the direction normal to the plane. A modified SCARA robot, to which one joint was added, is being used for hip replacement surgery (Paul et al., 1992). Teleoperated heavy machinery is usually overconstrained; excavators have 4 DOFs (Khoshzaban et al., 1992).
An important subclass of mechanisms is a spherical joint, for which 3 rotations and no translations are required; this joint is useful in head-neck and head-eye systems. An example is the head-neck system described by Tachi et al. (1989). A 2-DOF pan-tilt system is presented in Hirose et al. (1992) and Hirose and Yokoyama (1992). Other pan-tilt systems are reviewed by Bederson et al. (1992), who also proposed a novel head-eye pan-tilt system employing a spherical motor. A 3-DOF parallel-drive head-neck system (Gosselin and Lavoie, 1993) has the potential for very fast motion, with some limitations in rotations.
When there are 7 or more DOFs, the mechanism is underconstrained. The extra DOFs may be used to fulfill secondary criteria (to general positioning), such as obstacle avoidance. There has been a lot of research in robotics addressing redundancy resolution. The human arm is a redundant
7-DOF mechanism (not counting shoulder shrug). Commercial examples include the Sarcos Dextrous Arm (Jacobsen et al., 1990a, 1990b, 1991), the Robotics Research Arm, and the Omnidirectional Arm (Rosheim, 1990). Laboratory examples include the Langley Laboratory Telerobotic Manipulator, the CESARm (Jansen and Kress, 1991), and the Anthropomorphic Tele-existence Slave Robot (Tachi et al., 1989, 1990a, 1990b). The Special Purpose Dextrous Manipulator (SPDM) designed by Spar Aerospace will have 8 DOFs.
For direct control by the human arm, an exoskeleton master with 7 DOFs can be used to guide a slave 7-DOF manipulator. Hand-controllers (ground-based systems) are usually 6-DOF devices. To control a redundant arm, either the resolution is left to the discretion of the computer or an auxiliary control (such as a knob) must be manipulated.
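When redundancy resolution is left to the computer, a standard approach is resolved-rate control with a Jacobian pseudoinverse, using the null space of the Jacobian for a secondary criterion. The following sketch (a planar 3-link arm with unit link lengths, made-up gains, and joint centering as an arbitrary stand-in for a criterion such as obstacle avoidance) illustrates the idea; it is not any particular system's controller.

```python
import math

def fk(q):
    """End-point of a planar 3-link arm with unit link lengths."""
    s = x = y = 0.0
    for qi in q:
        s += qi
        x += math.cos(s)
        y += math.sin(s)
    return x, y

def jacobian(q):
    """2x3 Jacobian of the end-point with respect to the joint angles."""
    cum, s = [], 0.0
    for qi in q:
        s += qi
        cum.append(s)
    J = [[0.0] * 3, [0.0] * 3]
    for j in range(3):
        for i in range(j, 3):
            J[0][j] -= math.sin(cum[i])
            J[1][j] += math.cos(cum[i])
    return J

def pinv(J):
    """Right pseudoinverse J^T (J J^T)^-1 of the 2x3 Jacobian."""
    a = sum(v * v for v in J[0])
    b = sum(u * v for u, v in zip(J[0], J[1]))
    d = sum(v * v for v in J[1])
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    return [[J[0][j] * inv[0][k] + J[1][j] * inv[1][k] for k in range(2)]
            for j in range(3)]

def resolved_rate_step(q, target, gain=0.5, k_null=0.1):
    """One step: track the target with the pseudoinverse, and spend the
    extra DOF (null space of J) pulling joints toward zero -- an
    arbitrary secondary criterion for illustration only."""
    x, y = fk(q)
    dp = [target[0] - x, target[1] - y]
    J = jacobian(q)
    Jp = pinv(J)
    dq = [Jp[j][0] * dp[0] + Jp[j][1] * dp[1] for j in range(3)]
    z = [-k_null * qi for qi in q]            # secondary-task velocity
    JpJ = [[sum(Jp[j][k] * J[k][i] for k in range(2)) for i in range(3)]
           for j in range(3)]
    for j in range(3):                        # project z into the null space
        dq[j] += z[j] - sum(JpJ[j][i] * z[i] for i in range(3))
    return [qi + gain * dqi for qi, dqi in zip(q, dq)]

q = [0.3, 0.4, 0.5]
for _ in range(200):
    q = resolved_rate_step(q, (1.5, 1.0))
```

The null-space term moves the arm along its self-motion manifold without disturbing the end-point (to first order), which is exactly how an auxiliary knob or automatic criterion can steer a 7-DOF slave from a 6-DOF hand controller.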
Workspace
The term workspace describes the extent of the volume within which the manipulator may position the end-point, relative to the size of the manipulator. Certain manipulator geometries are known to offer superior workspaces (Hollerbach, 1985; Paden and Sastry, 1988; Vijaykumar et al., 1985); interestingly, these geometries are similar to the human arm geometry.
Serial Versus Parallel Mechanism
In a serial mechanism, joints are cascaded. The workspace is the union of motions of the joints, and hence is large. Because a proximal link must carry the weight of distal links, these arms may be slower, heavier, and less strong. Most manipulators (whether in robotics or telerobotics) are serial mechanisms, because a large workspace is often important.
In a parallel mechanism, several independent linkages meet at a common terminal link (the end effector). The workspace is the intersection of the independent linkages, and hence is small. One independent linkage does not carry the weight of the others, so these devices can be lightweight, strong, and fast. A prominent example of a parallel mechanism is the Stewart platform used in flight simulators. The human hand can also be viewed as a parallel mechanism; there are 5 independent 4-DOF linkages that can contact an object. A lot of recent research in robotics has focused on parallel mechanisms, to exploit their intrinsic advantages for specific tasks suited to their restricted workspaces. Examples include a 6-DOF parallel manipulator to be teleoperated for excavation (Arai et al., 1992), a 3-DOF microrobot based on beam bending (Hunter et al., 1991), and 6-DOF parallel hand controllers (Hayward et al., 1993; Long and
Collins, 1992). Landsberger and Sheridan (1985) have designed a cable-driven parallel mechanism. The parallel-drive hydraulic shoulder joint in Hayward et al. (1993) uses actuator redundancy to increase the workspace.
For serial mechanisms, the forward kinematics (finding the end-point position given the joint angles) is easy, but the inverse kinematics (finding the joint angles given the end-point position) is hard. The inverse kinematics is complicated unless the mechanism has a special structure: either a spherical joint or a planar pair (Tsai and Morgan, 1985). Almost all industrial robots have these special structures, but some, for design convenience, do not, such as the Robotics Research Arm, which has been used in teleoperation. With fast computers, iterative techniques for solving the nonlinear kinematics can run in real time.
For parallel mechanisms, the reverse is true: inverse kinematics is easy, but forward kinematics is hard (Waldron and Hunt, 1991).
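A 2-link planar arm illustrates the asymmetry for serial mechanisms: the forward map is a direct evaluation, while the inverse exists in closed form only because the geometry is special. The link lengths and elbow-branch convention below are arbitrary illustrative choices.

```python
import math

L1, L2 = 1.0, 0.8  # assumed link lengths

def forward(q1, q2):
    """Forward kinematics: joint angles to end-point (a direct evaluation)."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y

def inverse(x, y, elbow_positive=True):
    """Closed-form inverse kinematics via the law of cosines.
    A closed form exists only because this 2-link geometry is special;
    general mechanisms need iterative solution."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    if not -1.0 <= c2 <= 1.0:
        raise ValueError("target out of reach")
    q2 = math.acos(c2) if elbow_positive else -math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(L2 * math.sin(q2),
                                       L1 + L2 * math.cos(q2))
    return q1, q2
```

Round-tripping forward then inverse recovers the joint angles (for the matching elbow branch), a convenient self-check.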
Actuation
Actuation comprises the force or torque source (henceforth called the actuator) and any transmission element connecting it to a joint or link. The actuation is the primary determinant of performance (speed, accuracy, strength). A survey of actuators for robotics is presented by Hollerbach et al. (1992).
For macro motion control, standard actuators are electric, hydraulic, or pneumatic. For smaller robots (human size and less), electric actuators dominate. For larger robots (e.g., cranes), hydraulic actuators dominate.
Electric Motors and Drives
Electric actuators are the most convenient, because the power source is an electric plug. Pneumatic and hydraulic systems require air or hydraulic supplies, which often make them much harder to install and maintain. Electric actuators, however, are weak relative to their mass; hence payloads are not great.
To amplify motor torque and to couple high-speed motors to low-speed joints, nearly all electric motor drives employ some form of transmission element, primarily gears. Thus nearly all commercial electric robots employ gears of some form. For space applications, the space shuttle RMS employs special high-ratio hollow planetary gears (Wu et al., 1993). These same gears will be employed in the two-armed SPDM (Mack et al., 1991).
The Flight Telerobotic Servicer (FTS) system produced by Martin Marietta employed harmonic drives (Andary and Spidaliere, 1993; Ollendorf, 1990), which are also commonly employed in industrial robots (e.g., the Robotics Research Arm, ASEA robots). Advanced spherical joint designs employing gears have been produced by Rosheim (1990).
Yet gears bring serious drawbacks: substantial friction, backlash, and flexibility. The performance consequences are poor joint torque control, poor end-point force control, reduced accuracy, and slower response. To overcome these drawbacks, some attempts are made to model the gear dynamics so that they may be compensated for in a controller (Armstrong-Helouvry, 1991). Other attempts include better gear designs that reduce friction and losses; examples include the Artisan arm (Vischer and Khatib, 1990) and the ROTEX manipulator (Hirzinger, 1987; Hirzinger et al., 1993), which is meant for space laboratory teleoperation.
Cable or tendon drives (including belts) are another common way to reduce arm weight, by remote location of the actuators. A number of master-slave systems have been designed using cable drives. More recent examples include the FRHC from JPL (Hayati et al., 1990; Kan et al., 1990) and the Whole Arm Manipulator (Salisbury et al., 1990) (both Salisbury's design). Nearly all multifingered robot hands employ tendon or cable routing; space constraints preclude direct mounting of actuators at joints (Jacobsen et al., 1986).
Another recent development is the integration of gears and actuators. For example, Rosheim (1990) has used miniature integrated lead screw mechanisms for finger joint-mounted actuators. Similar systems, developed originally for ROTEX (Hirzinger et al., 1993), are now commercially available (Wittenstein Motion Control GmbH). A related concept is the harmonic motor, in which the rotor rolls along the inside of the stator (Jacobsen et al., 1989). Analogous to harmonic drives, with harmonic motors, high effective gear ratios can be obtained.
To avoid problems with transmission elements, direct drive actuators and robots have become popular in the past decade to provide smooth and controllable motion (An et al., 1988; Asada and Youcef-Toumi, 1987). Examples of direct drive telerobots include the CMU DDArm II (Papanikolopoulos and Khosla, 1992) and MEISTER (Sato and Hirai, 1988); MEISTER is also an example of a direct drive master. Advances in electric motor technology continue, and particularization to robotics is a key to enhanced performance. One example of a new electric motor specifically designed for direct-drive robotics is the McGill/MIT Direct Drive Motor (Hollerbach et al., 1993).
An important new area of development in mechanism design is magnetic bearings and levitation, which seek to avoid problems of transmission elements, including bearings as structural members. In principle,
devices with magnetic bearings should produce the smoothest motions. Hollis (Hollis et al., 1990) has designed a 6-axis magnetically levitated wrist, which can be used either as a hand controller or as a robot end effector. Salcudean (Salcudean et al., 1992) has developed a teleoperated robot in which the master is a magnetically levitated wrist and the slave is an industrial 6-axis robot (a coarse positioner) whose end effector is another magnetically levitated wrist (a fine positioner). Because the wrist's range of motion is small, the hand controller is used in rate mode for large excursions and in proportional mode for fine motions. Although not magnetically suspended, a 2-axis force-reflecting mouse was developed by Salcudean using the same actuator elements (Kelley and Salcudean, 1993).
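The rate/proportional split can be sketched for a single axis as follows; the mode names, gains, and deadband are illustrative assumptions, not the parameters of Salcudean's system.

```python
def master_to_slave(master_disp, slave_pos, mode, dt=0.001,
                    rate_gain=0.05, pos_scale=1.0, deadband=0.002):
    """Map a small-workspace master (one axis, meters) to slave motion.
    'position' mode: scaled direct mapping for fine motions.
    'rate' mode: master displacement commands slave velocity, so a held
    deflection produces an arbitrarily large excursion over time.
    All parameter values are made-up illustrative choices."""
    if mode == "position":
        return pos_scale * master_disp
    # rate mode: ignore deflections inside the deadband, then integrate
    cmd = master_disp if abs(master_disp) > deadband else 0.0
    return slave_pos + rate_gain * cmd * dt

slave = 0.0
for _ in range(1000):                  # hold a 10 mm deflection for 1 s at 1 kHz
    slave = master_to_slave(0.01, slave, "rate")
```

Holding the deflection moves the slave by rate_gain x 0.01 m x 1 s, far beyond the master's own travel, which is why rate mode suits large excursions from a limited-motion wrist.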
Another area under development, related to microelectromechanical systems (MEMS), is electrostatic actuators. All the electric motors mentioned above work by magnetic attraction. At small scales, the electrostatic effect is more favorable than the magnetic effect (Trimmer, 1989). By using small air gaps and many poles, powerful muscle-like actuators are conceivable. In terms of what has been realized on the macro scale, Higuchi has demonstrated lightweight but very strong linear electrostatic actuators (Niino et al., 1993).
Hydraulic Actuators
Hydraulic actuators offer the most strength for their size. There are a number of commercial telerobot systems that are hydraulic, such as the Kraft arm, the Shilling arm, and the International Submarine Engineering (ISE) arm. Teleoperation of excavators (which are hydraulic) with hand controllers has also been pursued (e.g., Khoshzaban et al., 1992). To some extent, hydraulics have received a bad reputation because of concerns about leakage and controllability; advances such as the Sarcos Dextrous Arm are a counterpoint to these concerns.
Pneumatic Actuators
Pneumatic actuators are intermediate between electric and hydraulic drives in terms of force produced for a given size and mass. There are very few high-performance robots powered pneumatically, because of control problems associated with the compressibility of air. Perhaps the most advanced example is the Utah/MIT Dextrous Hand Master (Jacobsen et al., 1986).
One of the more exciting new areas under development is micromotion robotics, in which (macro) robots are able to position precisely down to 1 nanometer (Dario, 1992). As a counterpoint to simulated molecular docking (Ouh-young et al., 1988), these robots would actually be able to manipulate molecules. The development of true microsize robots is still somewhere off in the future, but the new area of microelectromechanical
systems (MEMS) holds promise for developing the proper components: structures, actuators, and sensors.
For micromotion control, piezoelectric actuators dominate. They are used in scanning tunneling microscopes (STMs) and atomic force microscopes (AFMs). Hollis has used a magnetically levitated hand controller to control an STM (Hollis et al., 1990). A stacked actuator consisting of a linear voice coil motor in series with a piezoelectric element was the basis for Hunter's telemicrorobot (Hunter et al., 1991). Hatamura (Hatamura and Morishita, 1990) has teleoperated a 6-axis force-reflecting nanorobot whose individual axes are flexure elements activated by piezoelectrics; the masters are two bilateral joystick mechanisms, and vision is provided by a stereo scanning electron microscope (SEM).
Shape memory alloy (SMA) actuators hold considerable promise as a compact but powerful actuator source. Various robotic mechanisms have been proposed that incorporate SMA actuators, including active endoscopes (Dario et al., 1991; Ikuta et al., 1988). A tactile stimulator employing cantilever arrays activated by SMA wires has been developed commercially (TiNi Alloy Company). At the moment, two major drawbacks of SMA are highly nonlinear dynamics and slow response speed. Recent developments by Hunter (Hunter et al., 1991) have sped up the response by 100 times and hold promise for the future.
End Effectors
Most end effectors on robots or telerobots are unremarkable, usually two-jaw grippers or special purpose tooling. Multifingered robot hands have been developed to provide robots with the same dexterity as the human hand. The major commercial examples are the Stanford/JPL hand (Salisbury, 1985) and the Utah/MIT Dextrous Hand (Jacobsen et al., 1986). Master gloves have been used to teleoperate particularly the Utah/MIT Dextrous Hand (Hong and Tan, 1989; Pao and Speeter, 1989; Rohling and Hollerbach, 1993; Speeter, 1992). Force-reflecting multifingered master-slave systems have been developed by Jacobsen et al. (1990a) and Jau (1992). A 3-DOF hand partly inspired by prosthetics is the end effector for the Sarcos Dextrous Arm.
Sensors
Sensor technologies for telemanipulators include sensors required to monitor the internal mechanical state of the arm (joint angle sensors and joint torque sensors), the external contact state (wrist force/torque sensors and tactile sensors), and proximity sensors. We do not cover visual
sensors (cameras) and image processing. Position trackers and inertial sensors are reviewed in Chapter 5.
Joint Motion Sensors
A review of traditional joint motion sensors is provided by deSilva (1989). For rotary motion, common sensors are potentiometers, optical encoders, resolvers, tachometers, Hall-effect sensors, and rotary variable differential transformers (RVDTs). For linear motion, common sensors are linear variable differential transformers (LVDTs), linear velocity transducers (LVTs), and position sensitive detectors (PSDs).
Most of the rotary sensors listed above are analog sensors. Potentiometers are not favored, because of noise and sensitivity problems, and because they are hard to make small. For use in robot fingers and hand masters, compact Hall-effect sensors are employed in the Utah/MIT Dextrous Hand and in the EXOS Dextrous Hand Master. The resolution is not high (0.2 deg) but is adequate for these devices. The VPL DataGlove employs fiber optic sensing, but this sensing is too coarse to be really useful, and there have been many complaints about the accuracy of this system. Resolvers and tachometers are suitable for larger actuators and joints, such as robot shoulders and elbows. RVDTs are moderate in size but also have only moderate resolution.
The trend is increasingly toward digital sensors. Optical encoders offer the highest resolution; for example, Canon produces an incremental laser rotary encoder with 24 bits of resolution. The Sarcos Dextrous Arm has 18 1/2 bit incremental encoders. The trend is for rotary encoders to become less expensive while maintaining high resolution, to become more compact in size, and to provide high-resolution absolute joint angle readings (Steve Jacobsen, personal communication).
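To put such figures in perspective, an ideal n-bit rotary encoder resolves 2π/2^n radians per count. A quick calculation (assuming ideal quantization, no interpolation error):

```python
import math

def encoder_resolution(bits):
    """Radians per count for an ideal encoder with 2**bits counts per revolution."""
    return 2.0 * math.pi / (2 ** bits)

# A 24-bit encoder resolves about 0.37 microradian (~2e-5 deg) per count,
# versus roughly 0.2 deg (~11 effective bits) for the Hall-effect sensors above.
res_24 = encoder_resolution(24)
```

The five-orders-of-magnitude gap between these two figures is why optical encoders dominate wherever precise control or calibration is required.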
For linear transduction, LVDTs and LVTs are common. Resolutions in the range of 10-100 nm are possible for LVDTs. Linear PSDs have been reported to have resolutions in the range of 1-5 nm. Digital linear sensors are being developed with a resolution of 2.5 nm (Steve Jacobsen, personal communication). The ultimate in high-resolution linear measurement is of course interferometry, for which resolutions of 0.1 nm are possible. An additional consideration is maintaining high resolution at high sampling rates; Charette et al. (1993) reported a 1 MHz rate. Future developments should result in reduced size of such sensors and increased use in micromanipulation.
With increased resolution such as that provided by optical encoders, the calculation of joint velocity and acceleration from positional data will become more accurate. This calculation is required for precise control
and calibration. Recent research has focused on how these derivatives are to be calculated (Belanger, 1992; Brown et al., 1992).
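A minimal sketch of the underlying tradeoff: a backward difference on quantized encoder readings yields a noisy velocity estimate, which a simple first-order low-pass smooths at the cost of lag. The encoder quantum, sampling rate, and filter coefficient below are arbitrary assumptions, and the filter is only the simplest of the techniques studied in the works cited above.

```python
def velocity_estimates(positions, dt, alpha=0.1):
    """Joint velocity from sampled encoder positions: raw backward
    difference, plus a first-order low-pass that trades lag for
    smoothness (alpha is an assumed filter coefficient)."""
    raw, filt, v_f = [], [], 0.0
    for k in range(1, len(positions)):
        v = (positions[k] - positions[k - 1]) / dt
        raw.append(v)
        v_f += alpha * (v - v_f)
        filt.append(v_f)
    return raw, filt

# Quantized ramp: true velocity 0.9 rad/s, 1 mrad encoder quantum, 1 kHz sampling.
q_step, dt = 0.001, 0.001
pos = [((9 * k) // 10) * q_step for k in range(200)]
raw, filt = velocity_estimates(pos, dt)
```

The raw estimate chatters between 0 and about 1 rad/s (the quantum divided by the sample period), while the filtered estimate settles near the true 0.9 rad/s; finer encoders shrink that chatter directly, which is the point made above.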
Joint Torque Sensors
Strain gauges are most commonly used for force and torque sensing; the review by deSilva (1989) is again relevant. Typically some flexible structure is attached in series with a joint axis; an example is the Sarcos Dextrous Arm, with torque sensors having a dynamic range of 1:2,000. Autonomous calibration of joint torque sensors was considered by Ma et al. (1994). Joint torque sensors have been retrofitted to PUMA robots by Pfeffer et al. (1989), to the Stanford Arm by Luh et al. (1983), and to a direct drive arm by Asada and Lim (1985). An instrumented harmonic drive for joint torque sensing was presented by Hashimoto et al. (1991).
Displacement sensors may also be employed to measure joint torque; for example, inductive sensors were employed by Vischer and Khatib (1990) and in ROTEX. A variable reluctance joint torque sensor is also discussed by deSilva (1989). Hall-effect sensors on a cantilever system are employed to sense tendon tension on the Utah/MIT Dextrous Hand. Optical joint torque sensors using light emitting diodes have been developed by Hirose and Yoneda (1989). Displacement sensors can have advantages over strain gauge sensors, such as lower cost and higher robustness, although the sensitivity is typically lower. The future will probably continue to see alternatives to, and a movement away from, strain gauge sensing as micro positional sensors improve.
Another trend is the use of improved electric motor models to predict torque accurately open loop. This may involve the design of new motors (Hollerbach et al., 1993; Wallace and Taylor, 1991) or the reverse engineering of existing motors (Newman and Patel, 1991; Starr and Wilson, 1992). When a transmission element is employed, one alternative is a careful characterization of joint friction to compensate for its effect (Armstrong-Helouvry, 1991).
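In outline, open-loop torque prediction combines a torque-constant model of the motor with a friction model of the transmission. All constants below are made-up illustrative values, not parameters of any motor or arm cited above.

```python
def predicted_joint_torque(current, velocity, k_t=0.1, gear_ratio=100.0,
                           coulomb=0.2, viscous=0.01):
    """Open-loop joint torque estimate: motor torque k_t * i amplified
    by the gear ratio, minus a Coulomb-plus-viscous model of
    transmission friction. All constants are illustrative assumptions."""
    motor_torque = k_t * current * gear_ratio
    sign = (velocity > 0) - (velocity < 0)      # direction of sliding
    friction = coulomb * sign + viscous * velocity
    return motor_torque - friction
```

The arithmetic is trivial; the hard part is the careful experimental characterization of the friction terms (Armstrong-Helouvry, 1991) and of the motor model itself.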
Accurate knowledge of joint torque is very important for precise control and, in the context of teleoperation and haptic interfaces, for force reflection. Despite this importance, very few manipulators actually have this capability. The development of torque-controlled joints will be essential for higher performance devices in the future.
Wrist Force/Torque Sensors
An alternative, or complement, to joint torque sensing is to employ multiaxis force/torque sensors, usually mounted at the wrist. Such sensors have also served as haptic interfaces, for example the Trackball or Spacemouse. Sensor technology is the same as that discussed under joint torque sensors, but the multiaxis characteristic introduces substantial complications.
Most frequently, a 4-beam arrangement in a Maltese cross configuration has been employed with strain gauges; commercial examples include the JR3 sensor and the Assurance Technology sensor. A significant problem is cross-axis interference, due to nonlinear beam bending (Flatau, 1973; Hirose and Yoneda, 1990); this effect may produce errors up to 3 percent. Although more complex calibration may alleviate this effect, another approach is to use alternative structures. Nakamura et al. (1988) proposed a boxlike sensor. Yoshikawa and Miyazaki (1989) proposed a three-dimensional cross-shaped structure. Another possibility is a membrane suspension (Gerd Hirzinger, personal communication).
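Cross-axis interference is normally handled by calibrating a full (non-diagonal) matrix C in f = C s rather than one gain per axis. A minimal 2-gauge, 2-force sketch with fabricated numbers follows; real 6-axis sensors solve the analogous 6x6 (or larger) problem by least squares over many load cases.

```python
def calibrate_2x2(loads, readings):
    """Solve C in f = C s from two known calibration loads.
    loads[i] is (f_x, f_y) and readings[i] is (s_1, s_2) for load i.
    Nonzero off-diagonal terms of C capture cross-axis interference."""
    (f11, f21), (f12, f22) = loads      # columns of F
    (s11, s21), (s12, s22) = readings   # columns of S; C = F S^-1
    det = s11 * s22 - s12 * s21
    return [[(f11 * s22 - f12 * s21) / det, (f12 * s11 - f11 * s12) / det],
            [(f21 * s22 - f22 * s21) / det, (f22 * s11 - f21 * s12) / det]]

# Fabricated "true" sensor with a few percent cross-coupling,
# used only to generate synthetic strain readings:
C_true = [[2.0, 0.06], [-0.04, 1.5]]
det_c = C_true[0][0] * C_true[1][1] - C_true[0][1] * C_true[1][0]

def readings_for(fx, fy):               # s = C_true^-1 f
    return ((C_true[1][1] * fx - C_true[0][1] * fy) / det_c,
            (-C_true[1][0] * fx + C_true[0][0] * fy) / det_c)

C_est = calibrate_2x2([(10.0, 0.0), (0.0, 10.0)],
                      [readings_for(10.0, 0.0), readings_for(0.0, 10.0)])
```

With the full matrix recovered, the off-diagonal terms are compensated exactly; ignoring them would reproduce the few-percent errors noted above.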
As an alternative to strain gauges, the use of optical sensing has been proposed (Hirose and Yoneda, 1990; Hirzinger and Dietrich, 1986; Kvasnica, 1992). The most precise multiaxis force/torque sensor built to date employs a magnetically levitated wrist and optical sensors (Tim Salcudean, personal communication). The wrist is servoed to a null position, and a motor model is employed to infer the forces and torques; hence there is no cross-coupling. This idea is similar to that of the Sundstrand accelerometers, the most accurate on the market.
There is considerable room for improvement in the market for commercial 6-axis force/torque sensors. Sometime in the future we can expect accuracies of around 0.1 percent, including cross-coupling effects; this would represent about an order of magnitude improvement over those currently on the market.
Tactile Sensors
There have been a number of reviews of tactile sensing technology (Dario, 1989; Dario and De Rossi, 1985; Hollerbach, 1987; Howe and Cutkosky, 1992; Nicholls and Lee, 1989; Pugh, 1986). Many different effects have been employed; piezocapacitance, piezoresistance, and piezoelectrics are some of the more common ones. Tactile sensors have also been produced using optical sensing (Maekawa et al., 1992). Very large scale integrated (VLSI) fabrication methods have also been employed to produce tactile sensors.
Commercially, piezoresistive tactile sensors were produced by the former Lord Corporation and by Barry Wright Controls Division. Piezoresistive inks have been employed in the Interlink Electronics tactile sensors. Very few other examples of commercially available tactile sensors may be found.
Hysteresis, sensitivity, and repeatability are often problems with
piezoresistive sensors. Piezocapacitance sensors circumvent some of these problems. Piezoelectric sensors are temperature sensitive, often function only in an AC mode, and are hard to make very small because of cross-talk. Other sensor technologies are often too complicated or fragile to make useful devices.
The vast majority of tactile sensors sense only normal force. Multiaxis stress sensors have been proposed by De Rossi et al. (1993) and McCammon and Jacobsen (1990). In the context of teleoperation and tactile stimulation, tactile sensors for normal force are probably adequate because tactile stimulators are likely only to be able to produce normal force.
The bottom line is that, despite all the published work on tactile sensors, almost none is used on robots. The problems have to do with packaging, cost and complexity, response properties, robustness, and suitability for curved surfaces such as fingertips. This is a technology area that needs considerable development, although economic drivers may not be in place for it.
Proximity sensors, intermediate between contact sensors and visual sensors, are used for docking maneuvers of a manipulator end effector with an object or target. They are particularly useful in teleoperation to account for model discrepancies and to compensate for obstructed vision. For example, proximity sensors in the end effector of the ROTEX manipulator play an important role.
Main technologies include electromagnetic, optical, and ultrasonic proximity sensors. Electromagnetic sensors (induction, capacitance) are limited in range and in the materials they can detect (Novak and Feddema, 1992). Ultrasonic sensors are not useful for short-range measurements. Hence most proximity sensing has hinged on optical reflectance sensors. A review of such sensors is provided by Espiau (1987). A challenge for these sensors is to separate the effects of distance, surface orientation, and reflectance properties. Multiple detectors are one way of inferring surface orientation (Lee et al., 1990; Okada, 1982; Okada and Rembold, 1991; Partaatmadja et al., 1992). An optical proximity sensor based on the confocal principle has been reported by Brenan et al. (1993).
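The ambiguity these sensors face, and the way multiple detectors resolve it, can be shown with the common idealized reflectance model s = ρ·cos θ / d². The geometry below (two detectors offset along the optical axis) is one illustrative arrangement, not a description of any cited sensor:

```python
import math

# A single optical reflectance reading confounds albedo (rho), surface
# orientation (theta), and distance (d). With two readings from detectors
# offset along the optical axis, the ratio cancels albedo and orientation.

def signal(rho, theta, d):
    """Idealized reflectance reading: albedo * incidence factor / distance^2."""
    return rho * math.cos(theta) / d**2

def range_from_pair(s_near, s_far, baseline):
    """Two collinear detectors separated by `baseline` along the optical axis.

    s_near / s_far = ((d + baseline) / d)^2, independent of rho and theta,
    so distance can be solved for directly.
    """
    r = math.sqrt(s_near / s_far)      # = (d + baseline) / d
    return baseline / (r - 1.0)

# A dark, tilted surface at 0.10 m gives the same kind of readings as a
# bright, flat one, yet the two-detector ratio recovers the true distance:
d_true = 0.10
s1 = signal(0.3, 0.5, d_true)
s2 = signal(0.3, 0.5, d_true + 0.02)
print(range_from_pair(s1, s2, 0.02))   # ~0.10 m, regardless of albedo
```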
Remote vehicles, or mobile robots, encompass any basic transport vehicle that can be operated at a distance: indoor motorized carts, road vehicles, off-terrain vehicles, airborne or space vehicles, boats, and submersibles.
Many mobile robots also will carry one or more remote manipulators. This section highlights mobile robots exemplifying the current state of the art, major issues involved in the development of mobile robotic systems, and remaining research and development challenges.
The arguably perfect mobile robotic system would: (1) be easy to control or program, (2) automatically transit an unstructured, highly complex, dynamic environment, (3) automatically perform general sensory and manipulative tasks, and (4) if required or desired, transmit detailed, easily interpreted, sensory information describing its environment and task state in real time. It would be capable of performing these tasks for long periods of time, over long distances, and would not require a physical tether for either power or data transmission. Unfortunately, such a system is not currently technically feasible or even physically achievable for some scenarios.
Environmental and physical factors attenuate data transmission bandwidth with distance; platform design places constraints on the environments that can be traversed; available energy systems limit endurance; sensors and sensor processing technology limit the type, form, and reliability of information about the environment available to the robot; the state of the art in high-level control limits the robot's autonomous capabilities; and available computational devices limit how much sensor processing and high-level control can be embedded in the remote system. In addition, reliability, volume, and cost issues exert a strong influence on the designs of current mobile robotic systems. In spite of these constraints, however, highly successful mobile robotic systems have been developed. These systems can be classified into four major physical configuration classes based on different weighting of endurance, maneuverability, automation, and cost attributes.
Class 1 Systems: Power and Data Tethered
The power and data tether that characterizes Class 1 vehicles allows these systems to be optimized for endurance and cost. Due to the use of a power and data tether, mission duration is essentially unlimited and telemetry can be very high-bandwidth and immune to noise, jamming, and occlusion. Range, however, is limited by tether length and the tether is subject to entanglement. In addition, combined power and data tethers tend to be bulky and can impart tremendous drag to the remote vehicle, thereby limiting maximum achievable speed. As a rule, Class 1 systems are teleoperated and have minimal on-board automation (usually limited
to closed-loop servo-control of actuators). All major navigation and strategy decisions are made by a human operator using vehicle navigation, collision avoidance, and scene understanding sensors. Human-machine interfaces for these types of systems range from relatively simple collections of analog and symbolic interface devices to more sophisticated systems with stereoscopic video feedback and force-reflecting manipulator controllers. Simple short-range land vehicles and a majority of undersea vehicles have been exemplars of Class 1 vehicles.
Typical of Class 1 vehicles are the dozens of low-cost, commercially available, remotely operated vehicles developed for undersea inspection or light work tasks to depths of a few thousand feet. An example is the Hydrovision Ltd. Hyball undersea inspection system (Busby Associates, 1990). This small (.46 m × .51 m × .47 m), lightweight (39 kg) system has an on-board video camera on a pan-and-tilt unit, plus work lights. The Hyball can operate to 300 m and is controlled using a simple video display and joystick. It can be outfitted with a scanning sonar, a low-light-level camera, and an automatic altitude-keeping system.
Representing the high end of Class 1 vehicles, the Advanced Tethered Vehicle (ATV) (Morinaga and Hoffman, 1991; Busby Associates, 1990) is a large (6 m × 3 m × 2.5 m, 5,000 kg) undersea work system developed by the Navy for general undersea repair and recovery tasks at full ocean depth. It currently holds the world depth record for a tethered vehicle (20,600 ft) and is capable of speeds to 2 kn. Its overall sensor and actuator complement includes: (1) four video cameras—a stereo pair, a single camera with zoom lens on pan/tilt devices, and a fixed camera for position reference; (2) three manipulators—two force-reflecting arms (6-DOF arm and 1-DOF gripper) and a simple grasping device; (3) a scanning forward looking sonar; and (4) depth and heading sensors. Navigation is augmented by a long-baseline acoustic positioning system. A sophisticated van-based control system includes a vehicle driver console and a manipulator or work system operator console. The manipulator operator console contains a stereoscopic panel-mounted display and a pair of human-sized, replicate, force-reflective master controllers. The vehicle driver console has access to video, sonar, and navigation information. The ATV's 1.2 inch diameter, 23,000 ft power and data tether represents one of the project's major contributions to Class 1 underwater vehicles (Freund, 1986).
Class 2 Systems: Data Tethered
Class 2 systems are highly maneuverable yet cost effective. Remote vehicle power requirements and overall system costs are minimized by relying heavily on the human operator for sensory and control decisions
rather than on-board automation systems. Access to remote vehicle actuation and sensory capabilities is provided by a tethered telemetry system. Data tethers, typically fiber optic cables, are much less bulky than power tethers. These cables can support very high-bandwidth, secure, nonjammable, non-line-of-sight telemetry to ranges of over 100 km without repeaters, and they do so without imparting significant drag to the remote vehicle, since they can be actively or passively payed out from it. Fiber optic cables cost approximately $1-2 a meter and, depending on the application, may not be reusable. Due to the possibility of cable entanglement or breakage, a nontethered, low-bandwidth, non-line-of-sight telemetry system is frequently employed as a minimal capability backup. Mission duration of Class 2 systems is limited by the requirement to carry on-board energy sources. This class of mobile robots has historically received the most interest in the human-machine interface area, since they are aimed at being highly maneuverable and capable yet still possess continuous high-bandwidth telemetry. Some air vehicles, more advanced undersea vehicles, and most land vehicles capable of executing realistic missions are exemplars of Class 2 systems.
An example of the latter is the Department of Defense Unmanned Ground Vehicle Program TeleOperated Vehicle (TOV) system, an exterior, off-road capable, surveillance robot (Aviles et al., 1990). The remote vehicle platform is based on the military's four-wheel drive utility vehicle, the High Mobility Multi-purpose Wheeled Vehicle (HMMWV). It is configured as a modular, remotely operated mobile platform with a fiber optic data link to provide high-bandwidth telemetry out to 30 km. In addition to making all basic vehicle functions remote, a stereoscopic camera pair and two artificial pinnae are mounted on a pan and tilt platform to provide feedback for remote driving. Navigation information is provided by a satellite-based navigation system that performs dead reckoning between satellite updates. Up to three add-on mission-specific subsystems (mission modules) can be added to the base vehicle. The usual mission module is a reconnaissance, surveillance, and target acquisition system that includes a low-light level video system, a forward-looking infrared sensor, and a laser range finder and designator. All these sensors are mounted on a pan and tilt platform on top of an extendable 15 ft scissors mast.
The TOV control station is mounted in a mobile shelter and is designed to provide the human operator with a control and sensory experience as similar as possible to normal, nonremote driving. The operator is provided with replicas of the HMMWV steering wheel, accelerator, brake, shifter, and ignition controls. Feedback for driving is provided primarily by a stereoscopic head-mounted display (HMD) with a binaural headset. The remote vehicle camera pan and tilt is slaved to the operator's head
motions while wearing the HMD, and basic navigation information is overlaid onto the operator's visual scene. The TOV has been extensively tested in both on-road and off-road conditions to 55 km/hr and can be remotely driven to the limits of the basic platform.
Sometimes Class 2 systems, such as the XP-21, a modular undersea vehicle developed by Applied Remote Technology (Busby Associates, 1990), are used as test beds for autonomous control. The XP-21's data tether allows use of powerful off-board computational systems and quick reconfigurability. The XP-21 is approximately 5 m in length and .5 m in diameter, has a maximum speed of 6 kn, and has a 40 mi cruise range. It can also run preprogrammed missions without the tether.
Class 3 Systems: Nontethered Telemetry
Class 3 systems fall into two major categories. The first contains mobile robots that have continuous, high-bandwidth, line-of-sight, nontethered telemetry systems. These mobile robots are equivalent to Class 2 systems but limit their range of operation in order to remove the disadvantages of a physical data tether. Sometimes a Class 2 system will be put in a configuration of this type for training purposes or short-range missions and use the cable only for extended-range missions.
The second category of Class 3 systems represents a uniquely different approach to the development of mobile robots. On-board automation is emphasized in order to remove the physical data tether yet still be capable of performing long-range missions. Class 3 systems have a telemetry connection to their control station, but it is a low-bandwidth, non-line-of-sight connection incapable of supporting direct manual control by the human operator. These systems typically exhibit at least supervisory-level control and can often perform reasonably complex behaviors autonomously. The human's role is one of a supervisor, giving high-level commands to the remote vehicle and monitoring its progress. This level of control is not only a goal for operator workload reduction but also a requirement for stable control of the remote platform under telemetry-induced delays (Ferrell, 1965; Sheridan, 1970). High-resolution imagery is often selectively telemetered to the operator at very low frame rates (on the order of seconds per frame). Mission duration is still limited by on-board energy systems, but it can be significantly extended through intelligent power management. This class of systems has historically received reasonable interest in the human-computer interaction arena as it relates to control partitioning and sharing (Chu and Rouse, 1979), but relatively little attention has been paid to remote presence approaches. Most air vehicles, more advanced undersea vehicles, some land vehicles, and planetary rovers have been exemplars of this type.
Rocky III (Wilcox, 1992; Desai et al., 1992) is a 15 kg, 6-wheel, planetary rover test bed developed at the California Institute of Technology's Jet Propulsion Laboratory. It has a 9,600 baud radio telemetry system, on-board computation, and a 3-DOF arm outfitted with a soft soil scoop. Rocky III's on-board batteries provide a 10-hour mission duration. Very simple navigation and collision avoidance sensors are used. Navigation is accomplished using a gimballed flux-gate compass and wheel encoders for dead reckoning. Collision avoidance information is provided by sensors connected to the front wheels and to a skid plate for objects that go between the wheels. Using the telemetry system, an operator designates a site to be sampled with the soft soil scoop and optional intermediate way-points. The rover then accomplishes its mission, including obstacle avoidance maneuvers, with no further communication. Rocky III's larger cousin, Robby (Desai et al., 1992), a 6-wheeled, 3-body, 1,000 kg articulated vehicle, has demonstrated semiautonomous navigation through rough natural terrain at a rate of 80 meters per hour using both deliberative and reactive control paradigms on stereo-vision-provided data.
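Compass/encoder dead reckoning of the kind Rocky III uses is simple to state. The sketch below is illustrative only: the function names and the tick-to-distance calibration are assumed, and a real implementation must also contend with compass tilt, wheel slip, and encoder quantization.

```python
import math

# Dead reckoning from a flux-gate compass heading and wheel-encoder counts.
# Convention assumed here: heading 0 deg = north (+y), increasing clockwise,
# so east is +x.

def dead_reckon(x, y, ticks, ticks_per_meter, heading_deg):
    """Advance the pose estimate by one odometry interval.

    ticks        -- encoder counts accumulated since the last update
    heading_deg  -- compass bearing during the interval
    """
    d = ticks / ticks_per_meter            # distance rolled by the wheels
    h = math.radians(heading_deg)
    return x + d * math.sin(h), y + d * math.cos(h)

# Drive 2 m north, then 1 m east (1,000 ticks per meter assumed):
x, y = dead_reckon(0.0, 0.0, 2000, 1000, 0.0)
x, y = dead_reckon(x, y, 1000, 1000, 90.0)
print(x, y)   # ends near (1.0, 2.0); errors accumulate with distance
```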
The Mobile Detection Assessment and Response System (MDARS) program, a joint effort of the U.S. Army's Armament Research Development and Engineering Center and the Navy's Naval Command, Control and Ocean Surveillance Center, has developed an interior, supervisory controlled, physical security robot as an adjunct to fixed security sensors (Everett et al., 1990, 1993; Laird et al., 1993). The 3-wheel drive, 3-wheel steered remote platform is 6 feet tall and weighs 570 pounds. It is outfitted with a 9,600 bit per second bidirectional radio telemetry system, navigation sensors, and intruder detection sensors. Collision avoidance sensors include a 9-element ultrasonic array and bumper-mounted collision detectors. Navigation is accomplished using a hybrid navigation scheme that combines compass/encoder-based dead reckoning and a wall-following/reindexing system. The wall-following system updates and refines the robot's computed position using an a priori map of static features in the environment and readings from acoustic ranging sensors. Intruder detection sensors include a 360 deg, 24-element ultrasonic array, microwave motion detectors, passive infrared motion detectors, a video motion detection system with a near-infrared light source, and near-infrared proximity detectors. The mobile platform can automatically follow preprogrammed or random paths and perform automatic obstacle detection and avoidance maneuvers. It periodically stops to look for intruders and alerts a human supervisor when an on-board security assessment system determines that an intruder is likely (Smurlo and Everett, 1992). The operator then has the option of (1) ignoring the alert and ordering the robot to continue its patrol, (2) asking the robot to get closer to the detected object for evaluation using the on-board video camera, or (3) taking
over control in a reflexive teleoperated mode (automatic collision avoidance). Initial tests of the system in military warehouse environments have demonstrated probabilities of detection well in excess of 0.90 with a very low false alarm rate. The system is being extended to allow the supervision of multiple mobile platforms by one operator.
In the underwater environment, the Advanced Unmanned Search System (AUSS) (Walton et al., 1993; Uhrich and Walton, 1993), developed by the Navy, is a supervisory controlled, broad-area, undersea search system. The remote vehicle is 17 ft long and 31 inches in diameter, weighs approximately 2,800 lb, has an endurance of 10 hr and a maximum velocity of 5 kn, and can operate to depths of 20,000 ft. An acoustic link transmits compressed search data from the vehicle at 4,800 bits/s and sends high-level commands to the vehicle at 1,200 bits/s. The primary search sensor is a side-looking sonar. Electronic still and 35 mm film cameras provide imagery for identification. Depending on the amount of compression desired, sonar and video images take from 20 s to 2 min to transmit. On-board navigation sensors include a forward-looking sonar, a Doppler sonar, gyro-compass, depth sensor, attitude sensors, and rate sensors. In addition, bottom-deployed long-baseline acoustic transponders and ship-based short-baseline acoustic, Loran-C, and global position system (GPS) navigation systems can be used to update the remote vehicle navigation system and to allow the surface support craft to maneuver to maintain the acoustic telemetry link. In a typical scenario, the AUSS system operator commands the remote vehicle to execute a search path and supervises the system by monitoring vehicle position, status, and transmitted imagery. If an object of interest is detected by the operator, the vehicle can be ordered to automatically home in on the object and get higher resolution video imagery for evaluation.
Class 4 Systems: Nontethered, No Telemetry
The final class of systems represents the perceived high ground of mobile robotics research and development. The premium on on-board automation is extremely high, and the remote vehicle carries out its mission without requiring human monitoring or intervention. The human is involved only in programming or specifying the desired high-level behavior of the system and possibly in retrieving mission or sensory data after the mobile robot has returned from an excursion. This means that all sensor regard control, interpretation, and the reasoning required to transit within the environment without collisions must occur on board. Class 4 systems do not need a telemetry connection to their control station and therefore can be highly maneuverable and operate over long distances. Mission duration, as with Class 2 and Class 3 vehicles, is still limited by on-board
energy sources but again can be extended tremendously by intelligent power management. In addition to being an intellectual focus of the mobile robotics community, Class 4 systems can perform tasks for which maintaining telemetry would be problematic or impossible, such as excursions deep inside of structures. This class of systems has historically received minimal attention in the human-machine interface area, since the overall effort is to limit human involvement to goal specification at most. As yet, systems capable of rapid transit in general, unconstrained environments without a priori knowledge of that environment do not exist. Some very interesting systems that operate in more constrained environments have been constructed, however.
The Carnegie Mellon University (CMU) Navlab and Navlab II (Kanade, 1992; Mettala, 1992; Thorpe, 1990) mobile ground robot test beds have demonstrated impressive performance at road following and cross-country traversal. These vehicles navigate using sonar, gigahertz radar, and an ERIM laser rangefinder. The best runs to date over moderate off-road terrain have occurred at 6 mph. ALVINN, a neural network road-following system, has been used by the CMU researchers to drive the Navlab II up to 62 mph on highways and for a continuous distance of over 21 miles. The neural network is trained by observing a human driver. Image understanding for mobile robotic applications, as exemplified by the CMU work, is currently the focus of intense research sponsored by the Advanced Research Projects Agency (1992).
Technologies and Directions
Although all the major technology areas depicted in Figure 9-1 continue to be the focus of intense research and development, the most significant and relevant developments have been in the sensor, platform, actuator, high-level robotic control, and human-machine interface fields.
One of the classic problem areas constraining the development of mobile robotic systems has been the development of sensors supporting navigation and collision-free transit through the environment. The platform must be able to navigate from a starting position to a desired new location and orientation, avoiding any contact with fixed or moving objects en route. The difficulty is directly related to the requirement that the platform move at reasonable speeds and to the unstructured nature of the operating environment.
Navigation Sensors and Systems Major techniques for determining vehicle position and orientation are dead reckoning, inertial navigation, beacons,
satellite navigation, and map matching. Recent developments in small, low-cost inertial linear accelerometers and angular rate sensors and the maturation of global positioning system (GPS) technology, however, are of particular import for mobile robot navigation and sensing. Inertial and GPS technologies are highly complementary. Inertial systems are very precise over short distances and short times but are subject to long-term drift. GPS systems, in contrast, provide somewhat less accurate but drift-free updates in real time over long periods and long distances.
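This complementarity is often exploited with a simple complementary filter, sketched below in one dimension. All gains, rates, and the 0.02 m/s² accelerometer bias are illustrative values, not taken from any fielded system, and for simplicity the GPS reading is treated as truth (adding GPS noise leaves the long-term behavior unchanged):

```python
# One-dimensional complementary filter: the inertial path integrates
# acceleration (precise short-term, drifts long-term); a low-gain GPS
# correction removes the accumulated drift.

def fuse(est, vel, accel, gps_pos, dt, k=0.05):
    """One filter step: propagate with inertial data, nudge toward GPS."""
    vel = vel + accel * dt            # integrate the (biased) accelerometer
    est = est + vel * dt              # inertial position prediction
    est = est + k * (gps_pos - est)   # slow GPS correction bounds the drift
    return est, vel

est, vel = 0.0, 0.0
true_pos = 0.0
for _ in range(1000):                 # 100 s at 10 Hz
    true_pos += 1.0 * 0.1             # vehicle actually moves at 1 m/s
    est, vel = fuse(est, vel, accel=0.02, gps_pos=true_pos, dt=0.1)

# Integrating the 0.02 m/s^2 bias alone would accumulate ~100 m of error
# over this run; the GPS term keeps the fused estimate within a few meters.
```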
Standard GPS technology provides a location solution with an accuracy of about ± 30 m. The use of differential GPS techniques, through which correction factors from a fixed GPS receiver station are transmitted to the mobile platform and its GPS receiver, allows a positional accuracy of approximately ± 2 m. GPS receivers are now available in both hand-held and chip-set versions, such as the Rockwell NavCore V Positioning System Receiver Engine. In addition, multiantenna systems, such as those developed by Trimble Navigation, use signal phase time-of-arrival information to provide both orientation and more accurate position information.
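The differential idea reduces to a subtraction, sketched below with invented coordinates: a base station at a surveyed location sees largely the same atmospheric and ephemeris errors as the rover, so the base station's apparent error can be subtracted from the rover's raw fix. Only the small uncorrelated receiver noise remains.

```python
# Differential GPS correction per axis: remove the error observed at a
# fixed base station from the rover's raw fix. All positions are invented.

def dgps_correct(rover_fix, base_fix, base_truth):
    """Subtract the base station's apparent error from the rover fix."""
    return tuple(r - (b - t) for r, b, t in zip(rover_fix, base_fix, base_truth))

common_error = (18.0, -12.0)          # error shared by both receivers
base_truth = (1000.0, 2000.0)         # surveyed base-station location
base_fix = (base_truth[0] + common_error[0], base_truth[1] + common_error[1])
rover_truth = (1400.0, 2600.0)
rover_fix = (rover_truth[0] + common_error[0] + 1.5,   # +1.5 m local noise
             rover_truth[1] + common_error[1] - 0.8)   # -0.8 m local noise

corrected = dgps_correct(rover_fix, base_fix, base_truth)
print(corrected)   # only the uncorrelated ~1 m noise survives
```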
Silicon-based miniature accelerometers are challenging the dominance of more traditional rate and acceleration sensors, such as quartz beam or fiber optic gyroscopes. In general, silicon-based microaccelerometers
measure the displacement of a proof mass attached to a silicon chip in response to inertial force by piezoresistive, capacitive, or resonant means (Yun and Howe, 1992). For example, Triton Technologies Inc. has developed an accelerometer using capacitive sensing with 0.1 mg resolution, 120 dB dynamic range, and a cross-axis sensitivity of less than 0.001 percent (Henrion et al., 1990). Researchers at the Berkeley Sensor and Actuator Center have developed a capacitive accelerometer that integrates the sensor and readout electronics in a 2.5 mm × 5.0 mm area (Yun et al., 1992). Pointing to potential future performance enhancements, researchers at JPL's Center for Space are exploring the use of electron tunnel sensors for measuring acceleration. These systems hold the promise of surpassing the performance of capacitive sensors by four orders of magnitude (VanZandt et al., 1992). Magnetohydrodynamic angular rate sensors (Laughlin et al., 1992) also show promise with high precision (> 0.1 µrad) and dynamic range (> 100 dB) at low-power consumption (< 0.3 W). To date, however, sensors of this type capable of measuring constant input rates are still under development.
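The proof-mass principle behind these capacitive devices can be put in numbers. Every value below (proof mass, stiffness, plate area, gap) is an illustrative round number, not a parameter of the cited sensors; the point is only the scale of the signal that must be read out.

```python
# Capacitive microaccelerometer principle: inertial force deflects a proof
# mass on a spring (k*x = m*a), and the deflection narrows a capacitor gap.

def deflection(accel, mass, stiffness):
    """Static proof-mass displacement from k*x = m*a."""
    return mass * accel / stiffness

def delta_capacitance(x, area, gap, eps0=8.854e-12):
    """Capacitance change for a parallel plate moving by x toward the other."""
    return eps0 * area * (1.0 / (gap - x) - 1.0 / gap)

m = 1e-9                              # 1 microgram proof mass (kg), assumed
k = 1.0                               # suspension stiffness (N/m), assumed
x = deflection(9.8e-4, m, k)          # response to 0.1 mg of acceleration
dc = delta_capacitance(x, (100e-6)**2, 2e-6)
# dc is far below an attofarad -- one reason the readout electronics must be
# integrated on the same chip as the sensing structure.
```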
These advances in GPS and inertial technology are allowing the development of lightweight, low-cost, integrated GPS/inertial navigation systems specifically targeted for remote vehicle applications. A prototype system, built by Rockwell International, weighs 5.5 lb, draws 18 watts, and has a volume of 115 cubic inches (Griffin and Murphy, 1992). A NavCore V GPS chip set and tactical grade inertial sensors using piezoelectric bender crystals mounted on a motor shaft are utilized. Initial positional error specifications are 76 m SEP (spherical error probable) with future systems slated to have positional accuracies on the order of 15 m SEP by using the military version of GPS and solid-state accelerometers.
Collision Avoidance and Scene Understanding Sensors and Systems Acoustical, optical, and electromagnetic sensors using proximity, triangulation, time of flight, phase modulation, frequency modulation, interferometry, swept focus, and return signal intensity ranging techniques have been employed on mobile robots for collision detection and scene understanding purposes (Everett et al., 1992). Recent developments in millimeter wave radars and laser radars are especially relevant to the mobile robotics community.
Millimeter wave (MMW) radar utilizes that portion of the electromagnetic spectrum from wavelengths of approximately 500 µm to 1 cm. In theory, MMW-sensing systems can have much higher resolution and fit into smaller packages than conventional radar systems in the microwave portion of the electromagnetic spectrum (approximately 3 to 100 GHz), with some penalty in shorter operating distances and more attenuation
by environmental factors. Although MMW radar systems are not yet widely available commercially, several promising prototype systems have been developed. Examples of current MMW radar sensors are systems developed by Millitech, Battelle, and Kruth-Microwave. Millitech has developed two prototype sensors for robotic collision avoidance applications (Millitech, 1989). The first sensor has a 30 × 30 degree field of view, a maximum range of 10 m, a range resolution of 1 cm, and an update rate of 100 Hz. It is targeted at providing the range to the nearest obstacle. The second sensor is designed to scan 360 degrees in azimuth, track multiple targets, and determine range and bearing for each. It has a maximum range of 100 m, a 2 degree azimuth resolution, a range resolution of 10 cm, and an update rate of 40 Hz. Researchers at Battelle Memorial Institute have developed an MMW radar system for use as an automobile collision-avoidance sensor (Wittenburg, 1987). It allows range, velocity, amplitude of returned signal, and angle to be determined for multiple targets. In an envisioned configuration for automotive use, it would scan a 30 deg sector at a 5 Hz update rate with a 5 m range resolution out to 60 m and a 10 m range resolution out to 100 m. Finally, Kruth-Microwave Electronics Company has developed a low-cost, low-power prototype MMW radar system for unmanned system use (Kruth-Microwave Electronics, 1989). It has a theoretical range of 1 km and a range resolution of better than 1 m, can detect targets of approximately 0.1 m cross-section, and can be packaged in approximately one-third of a cubic foot.
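Range resolutions like those quoted above follow from signal bandwidth via the standard radar relation ΔR = c / 2B. The sketch below is a generic calculation, not a description of the vendors' actual waveform designs:

```python
# Radar range resolution: delta_R = c / (2 * B). Fine range resolution
# demands wide bandwidth, which is easier to obtain at MMW frequencies
# than in the lower microwave bands.

C = 3.0e8  # speed of light, m/s

def range_resolution(bandwidth_hz):
    """Range resolution for a pulse or sweep of bandwidth B (Hz)."""
    return C / (2.0 * bandwidth_hz)

# A 1 cm range resolution (as in the first Millitech sensor) implies about
# 15 GHz of bandwidth; a 10 cm resolution needs about 1.5 GHz:
print(range_resolution(15e9))   # 0.01 m
print(range_resolution(1.5e9))  # 0.1 m
```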
Laser radar (LADAR) or laser scanner technology for terrestrial applications is now relatively mature and can be used for both obstacle detection and landmark identification (Besl, 1988; Hebert et al., 1988). Systems developed by the Environmental Research Institute of Michigan (ERIM) and Odetics exemplify the state of the art. The ERIM laser scanner, used by Carnegie Mellon University on its Navlab series of mobile robots, provides resolutions of 0.5 deg/pixel horizontal (80 deg field of view) by 0.3 deg/pixel vertical (30 deg field of view). It has a maximum unambiguous range of 20 m, a range resolution of 8 cm, and a 2 Hz update rate. The Odetics system provides 0.5 deg/pixel over a 60 deg vertical and horizontal field of view. It has a resolution of 1.8 cm out to a range of 9.4 m.
Along a similar front, laser-based imaging systems for the underwater environment are being developed (Fletcher and Fuqua, 1991). These systems typically use lasers capable of producing energy in the blue/green spectrum (wavelengths in the range of 460 to 560 nm), which corresponds to a window of minimum absorption in seawater. SPARTA Incorporated has developed a range-gated imaging system that has dimensions of approximately 12 × 27 in, a 12 deg field of view, a 10 Hz update
rate; it weighs approximately 120 lb (neutral in water), and consumes 450 watts of power (Swartz, 1993). Westinghouse has continued development and run sea trials on a blue/green laser line scanner with fields of view from 15 to 70 deg (Gordon, 1993).
In the area of platform design, research and development on legged robots is particularly important. In his overview of the field, Raibert (1990) suggests that legged locomotion research is well advised, since only half of the earth's land mass is navigable using current wheeled and tracked vehicles. In addition, legged platform research not only attempts to develop platforms that can traverse difficult terrain but has also served as a focus for understanding human and animal locomotion. This work has led to a reasonable understanding of gait under a variety of regimes (Song and Waldron, 1989) and to a variety of commercial and research platforms. Odetics Inc. has built a series of supervisory controlled hexapods. Its first legged platform, the ODEX I (Russell, 1983), weighed 370 lb, could lift 900 lb, and could walk onto a small truck bed. The Adaptive Suspension Vehicle, a 6,000 lb, 6-legged, terrain-adaptive suspension vehicle developed at the Ohio State University (Waldron and McGhee, 1986; Song and Waldron, 1989), carries a single human operator, has a 5 mph top speed, and is capable of crossing 6 ft ditches, stepping over or onto obstructions over 4.5 ft high, and navigating 60 percent grades. Raibert at CMU and MIT has developed a variety of monoped, biped, and quadruped test beds, including a planar biped capable of jumping through hoops and running at speeds of over 13 mph (Raibert, 1990). Legged platform research and development is particularly relevant to the design and control of figures and autonomous agents in virtual environments (see Badler et al., 1991).
High-Level Robotic Control
A relatively new and controversial approach to high-level control for mobile robotics is reactive control. This form of control and its subset, behavior-based control, rely heavily on the immediacy of sensory information, channeling it directly to motor behaviors without the use of an intervening symbolic representation of the world. More traditional strategies construct representations or models of the world and then reason based on these models prior to acting (Brooks, 1986, 1991; Jones and Flynn, 1993). Systems relying on this traditional approach have typically been hampered by sensor-produced inaccuracies in the world model and by the significant amount of computation required to reason about the abstract
model. Reactive systems do not possess these two weaknesses because they bypass the creation of a world model altogether and directly couple sensory perceptual activities with action (Arkin, 1992). The ultimate limits on useful adaptive behavior achievable using reactive control approaches are unclear. A variety of systems have been developed that perform in dynamic, unstructured environments (Brooks, 1990; Connell, 1990; Anderson and Donath, 1991). For example, Brooks and his colleagues at MIT have built a series of mobile robots, including a small 6-legged robot (Brooks, 1990) that uses simple pitch and roll sensors, passive infrared sensors, and whiskers to exhibit robust autonomous walking, prowling, and directed prowling behaviors that avoid collisions. This level of competency is achieved without the generation of a symbolic representation of the world.
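The flavor of behavior-based arbitration can be conveyed in a few lines. This is a toy subsumption-style sketch loosely inspired by the walking robot's sensor suite, not an account of the MIT implementation; the behavior names, sensor keys, and command tuples are all invented for illustration:

```python
# Behavior-based (subsumption-style) arbitration: each behavior maps raw
# sensor readings directly to a motor command; a fixed priority order lets
# urgent behaviors suppress the rest. No world model is ever built.

def avoid(sensors):
    """Highest priority: veer away if a whisker reports contact."""
    if sensors.get("whisker_left"):
        return ("turn", "right")
    if sensors.get("whisker_right"):
        return ("turn", "left")
    return None           # nothing to do; defer to lower-priority behaviors

def seek(sensors):
    """Head toward an infrared source if one is detected."""
    if sensors.get("ir_bearing") is not None:
        return ("steer", sensors["ir_bearing"])
    return None

def wander(sensors):
    """Lowest priority: default prowling behavior."""
    return ("forward", 1.0)

BEHAVIORS = [avoid, seek, wander]   # priority order, highest first

def arbitrate(sensors):
    for behavior in BEHAVIORS:
        command = behavior(sensors)
        if command is not None:
            return command          # higher layers subsume lower ones

print(arbitrate({"whisker_left": True, "ir_bearing": 20}))  # ('turn', 'right')
print(arbitrate({"ir_bearing": 20}))                        # ('steer', 20)
print(arbitrate({}))                                        # ('forward', 1.0)
```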
Human-machine interfaces for mobile robots range from simple analog or symbolic controls and displays, maintaining minimal isomorphism with the system to be controlled, to highly immersive, spatially oriented, isomorphic, veridical interfaces engaging the visual, auditory, and haptic senses. Interfaces of the latter type have typically been designed to provide the human operator with some sense of remote presence, or telepresence: sensor feedback of sufficient richness and fidelity, and controls of sufficient transparency, that operators feel as if they are present at the remote site. This approach is typically taken in order to engage the human's naturally evolved sensory, cognitive, and motor skills in the ways they are used in everyday tasks, so as to minimize task completion times and the training required to operate the remote system (Pepper and Hightower, 1984). The definition and efficacy of remote presence, however, are currently a major research topic within both the robotics and virtual environment communities (Held and Durlach, 1992; Sheridan, 1992b). The type and form of the human-machine interface, although often not clearly separated from it, is orthogonal to the level of control (manual through autonomous) of the mobile robotic system. The elements and level of information to be controlled, programmed, or monitored vary depending on whether the mobile robot (or virtual entity in a VE) is manually or autonomously controlled, but similar human-machine interaction approaches, from symbolic to immersive, can be applied.
Although a complete overview is beyond the scope of this paper, mobile robotics research and development on the human-machine system is highly relevant to the VE community. Not only have successful systems been developed that attempt to create a sense of remote presence
(Aviles et al., 1990; Morinaga and Hoffman, 1991) but a wide body of research outlining the time, speed, accuracy, and configuration trade-offs of different human-machine configurations exists (Sheridan, 1989, 1993; McGovern, 1990; Spain, 1992; Heath, 1992).
LOW-LEVEL CONTROL OF TELEOPERATORS
Teleoperators are complex systems composed of the human operator, master manipulator (joystick), communication channel, slave manipulator, and the environment (remote task). Since each of these systems is complex in its own right, in combination they create formidable analytical and design challenges. In particular, when slave contact force information is fed back to the operator through the master manipulator, the system becomes closed loop, and thus stability is often a problem, even if each of the individual subsystems is stable in isolation. A related technology not considered here consists of "man-amplifiers" or "extenders" through which master and slave are effectively combined into one mechanism in direct contact with (or worn by) the human (Kazerooni and Guo, 1993).
The most common controller for robot manipulators in practice remains the classic proportional-integral-derivative (PID) compensator applied to the individual joint positions. Experimental systems have become more sophisticated. For example, to accurately follow trajectories, robot controllers are greatly assisted by incorporating a dynamic model of the manipulator (Luh et al., 1980; Khosla, 1988).
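A minimal single-joint version of such a PID compensator might be sketched as follows; the gains and timestep are illustrative:

```python
# One-joint PID position compensator of the kind commonly applied to
# individual robot joints. Gains and the timestep dt are illustrative.

class JointPID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def torque(self, desired_pos, measured_pos):
        """Return the commanded joint torque for one control cycle."""
        error = desired_pos - measured_pos
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)
```

One such compensator runs per joint; coupling between joints is ignored, which is exactly why the model-based controllers cited above track trajectories more accurately.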
Most teleoperators are used in tasks involving heavy contact with rigid or massive environments. Position and contact force cannot be simultaneously controlled because their relation is constrained by the environment. Thus, the task space can be segmented according to the contact constraints into subspaces in which position and force are individually controlled (Mason, 1981; Raibert and Craig, 1981). Alternatively, position error and force error can be related to each other through a stiffness constant (Salisbury, 1980) to generate actuator control signals in what is called stiffness control; for a review of these methods, see Whitney (1985). More generally, position and force can be related to actuator torque through a second-order dynamic model representing the desired mechanical impedance of the end effector (Hogan, 1985a, 1985b, 1985c).
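As a concrete sketch of stiffness control, the Cartesian position error can be mapped to a restoring force by a stiffness matrix and then to joint torques through the Jacobian transpose; the matrices below are illustrative:

```python
import numpy as np

def stiffness_control_torque(jacobian, x_desired, x_actual, stiffness):
    """Joint torques tau = J^T K (x_d - x): the Cartesian position error
    is mapped to a restoring force by stiffness matrix K, then to joint
    torques by the manipulator Jacobian transpose."""
    K = np.asarray(stiffness, dtype=float)
    f = K @ (np.asarray(x_desired, dtype=float)
             - np.asarray(x_actual, dtype=float))
    return np.asarray(jacobian, dtype=float).T @ f
```

Choosing K per task axis is what lets position error and force be traded off differently in constrained and unconstrained directions.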
Although force reflective teleoperators have been in existence for more than four decades (Spooner and Weaver, 1955; Sheridan, 1960; Goertz, 1964), there are still very few successful implementations of complex, high-DOF systems that satisfy what we envision as the ideal system. Although much of the performance limitation is due to physical hardware insufficiencies, great increases in performance can be achieved
through proper design and implementation of the embedded control strategies used.
The ideal performance of a teleoperator has been described as a massless, infinitely stiff mechanical connection between the input device (master), and the effector (slave) (Biggers et al., 1989; Handlykken and Turner, 1980). Augmentation of the operator's sensory and motor skills, such as force magnification or displacement scaling, as well as compensation for environmental effects, such as gravity, are often additional design challenges (Dario, 1992; Flatau, 1973; Hatamura and Morishita, 1990; Hollis et al., 1990; Hunter et al., 1990; Vertut and Coiffet, 1985a). Desired characteristics of the coupled master-slave system include:
low operator input impedance in free space
high intersystem stiffness
high-bandwidth force reflection
stability for a wide range of contact impedances
These characteristics attempt to maximize the "transparency" of the overall system. An optimal system would be indistinguishable from direct operation on the environment itself, without the interposed machinery. However, given nonideal machinery, and given that there is still no clear consensus on an ideal remote manipulator, Sheridan et al. (1989) state that "the ideal manipulator is an adjustable one." Achieving these goals, however, has proven formidable given the realities of limited-bandwidth actuation, limited sensory capability, and time delays in the communications and computation pathways.
The simplest form of teleoperator consists of a remote manipulator that is servo-controlled to follow the operator's position commands. The operator's intended motion is measured through a joystick or similar device. Trajectories for autonomous robot arms are mathematically defined for smoothness so that velocity and acceleration profiles are available as slowly changing inputs to the control system. In contrast, in teleoperation, only the position is measurable at a given time so that velocity and acceleration commands must be estimated and will therefore be noisier. Fortunately, most human movements are relatively smooth (Flash and Hogan, 1985).
The volume of space in which it is comfortable for the human operator to maintain hand position for extended periods is small compared
with the total work volume of the human arm. Also, it is difficult to design hand controllers that can safely cover the entire work volume. As a result, many designs employ a much smaller work volume for the master hand controller than for the slave robot. To effectively use the slave robot then requires a method for selectively changing the offset between master and slave positions, usually referred to as indexing. This is accomplished in most systems with a finger button that momentarily breaks the connection between master and slave, during which time the operator repositions his or her hand. When the button is released, teleoperation resumes with motion increments referenced to the new master position.
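The indexing logic can be sketched in a few lines (one axis shown; the variable names are illustrative):

```python
# Sketch of the indexing ("clutch") logic described above: while the
# button is held, the master moves freely and only the offset changes;
# when released, slave commands resume relative to the new master pose.

class Indexer:
    def __init__(self):
        self.offset = 0.0
        self.slave_cmd = 0.0

    def update(self, master_pos, button_held):
        if button_held:
            # Freeze the slave and re-derive the offset as the hand moves.
            self.offset = self.slave_cmd - master_pos
        else:
            self.slave_cmd = master_pos + self.offset
        return self.slave_cmd
```

Repeated clutch-move-release cycles thus let a small master workspace command the slave's much larger one.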
The other significant form of control for remote manipulators is resolved rate control (Whitney, 1969). In this mode, human hand displacements are interpreted as a velocity command in an assigned Cartesian frame. This mode is used for example in the space shuttle Remote Manipulator System (RMS). One 3-DOF joystick is used for orientation commands and another for translation. A 6-axis hand controller has also been used as a rate controller for teleoperation (Bejczy et al., 1988). The command frame can be set arbitrarily so that commands can be referenced to the center of gravity of the held object, for example. One important requirement for rate control joysticks is a spring return. The spring return can be implemented with hardware springs or by computer-generated "software spring" force commands to an active joystick (Bejczy et al., 1988). This passive force feedback is essential for easily stopping the commanded motion.
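A one-axis sketch of rate control with a software spring might look as follows; the gains are illustrative:

```python
# Sketch of resolved rate control with a "software spring": joystick
# displacement maps to a velocity command, and a restoring force
# proportional to displacement is commanded to the active joystick so
# the stick returns to center (and the motion stops) when released.

RATE_GAIN = 0.5    # velocity command per unit stick displacement (1/s)
SPRING_K = 200.0   # software spring stiffness (N/m), illustrative

def rate_command(stick_displacement):
    velocity_cmd = RATE_GAIN * stick_displacement
    spring_force = -SPRING_K * stick_displacement  # recenters the stick
    return velocity_cmd, spring_force
```

Without the spring term, the stick would stay wherever the hand left it and the slave would drift, which is why the passive return force is described as essential.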
A significant issue concerns which mode is better for teleoperators without force feedback. Kim et al. (1987) found that position control gave better completion times in simulated teleoperation, except for very slow simulated manipulators, for which rate control was slightly better. It is widely thought that, for tasks requiring large displacements, rate control can be superior because it eliminates the need for repeatedly indexing to perform motions larger than the master work volume.
Performance degradation occurs when there are significant rotations between the rate controller's frame and the frame defined by the user's body (Kim et al., 1993). In other words, if left-right hand motion causes end effector motion that is in a visibly different direction, confusion can result. This problem is particularly severe for the control of orientation when rate commands are referenced to rotation axes fixed to the robot hand. Feedback of contact forces in rate control has been tried by many laboratories without success. Parker et al. (1993) developed a novel control law that solved some of the problems by using a deadband. However, much work remains to show true performance benefits. Novel modes of rate control, transitions between rate control and position control, and relative performance between rate control and position control
in teleoperation are unresolved issues that could have major impacts on application design.
Coupled Control of Manipulators for Force-Feedback Teleoperation
When mechanical master-slave manipulators were first made electrically remote, it was realized that force feedback to the master side was essential for good manipulation performance. The earliest systems used identical master and slave devices with decoupled controls of the individual joints in which joint torque for both master and slave was a function of position difference between them (Goertz and Thompson, 1954; Goertz, 1964). As improved computing power became available in the 1970s, it became possible to use kinematically different master and slave devices in which the master could be optimized for interfacing with the human operator, and the slave for its particular task (Bejczy and Salisbury, 1980, 1983). The computer performed the necessary coordinate transformations of the force and motion information and calculated the control laws in real time (Bejczy and Handlykken, 1981).
The details of these coordinate transforms depend on the kinematics of the master and slave manipulators. Human operator position is usually sensed indirectly through joint sensors on the master. These joint angle readings must be transformed through the forward kinematics of the master arm to derive the hand's position and orientation. Alternatively, the increments of joint motion can be transformed to Cartesian displacements through the manipulator Jacobian matrix. Similarly, in one popular force reflection architecture, slave motion, controlled and sensed in terms of joint torques, must be transformed into Cartesian coordinates through the Jacobian transpose inverse of the slave, and then into the joint coordinates of the master through the Jacobian transpose of the master. If a wrist-mounted force/torque sensor is used, the first of these transforms is not necessary. As with autonomous robot control, the performance of these operations is very sensitive to the numerical conditioning of the Jacobian matrices of the master and slave manipulators, since matrix inverses appear frequently in the relevant equations. Control methods have been developed for manipulators operating near singular configurations (Nakamura and Hanafusa, 1986; Wampler, 1986) but have not so far been applied to teleoperated robots.
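For a planar 2-link master arm (link lengths assumed for illustration), the forward kinematics and Jacobian-transpose force reflection described above can be sketched as:

```python
import numpy as np

# Planar 2-link master arm: joint sensors -> hand position via forward
# kinematics, and a slave wrist-sensor contact force -> master joint
# torques via the master's Jacobian transpose. Link lengths are assumed.

L1, L2 = 0.3, 0.25  # illustrative link lengths (m)

def forward_kinematics(q1, q2):
    x = L1 * np.cos(q1) + L2 * np.cos(q1 + q2)
    y = L1 * np.sin(q1) + L2 * np.sin(q1 + q2)
    return np.array([x, y])

def jacobian(q1, q2):
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

def reflected_torques(q1, q2, f_contact):
    # Cartesian contact force (measured at the slave wrist) mapped to
    # master joint torques through the master's Jacobian transpose.
    return jacobian(q1, q2).T @ np.asarray(f_contact, dtype=float)
```

Near a singular configuration (e.g., the arm fully stretched) the Jacobian loses rank, which is exactly the numerical-conditioning problem noted above.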
Many teleoperator controllers have since been developed. Most of them have relied on existing position, stiffness, or impedance controllers on the slave and master manipulators (Jansen and Herndon, 1990; Yokokohji and Yoshikawa, 1992; Tachi, 1991; Goldenberg and Bastas, 1990; Strassberg et al., 1992). To link the master and slave and to provide
kinesthetic feedback, these approaches choose one of the interaction variables (force or velocity) to send from master to slave (forward) and send its complement (velocity or force) from slave to master (feedback). A useful general representation of these models was developed by Fukuda et al. (1987), who formulated the controller in matrices that relate all measured variables to all actuated variables.
For practical reasons, many of these studies have been carried out using hardware designed for other purposes, with little capability for delicate force control. Often, for example, an industrial robot manipulator is used for the slave robot. Master manipulators (joysticks) are often highly geared mechanisms with joint limits or kinematic singularities significantly affecting the operator's motion. Mechanical limits are of key importance in determining teleoperator fidelity. Control technology can never fully overcome limitations imposed by friction, bandwidth, and actuator properties.
For example, stiction (static friction) imposes a lower limit on the magnitude of forces that can be displayed; actuator saturation imposes an upper limit. Mechanisms capable of handling higher forces often have lower bandwidth and higher friction, so the dynamic range of forces becomes important. Similarly, torque ripple in actuators will distort force commands to the operator or environment, and even costly force transducers used in a closed loop can only partially compensate. Many current studies are limited by a narrowness of approach in that they study only the control system. Major improvements in teleoperator performance will require attention to the underlying mechanisms as well as their control. This tightly integrated approach to control, actuation, and mechanization has been termed mechatronics.
The mechatronics of teleoperation needs further interdisciplinary study. However, research progress is currently limited by a lack of tools with which to quantify teleoperator performance. Although some work has been done (discussed below), few of the implementation studies have quantified the performance of their design. In the absence of standardized quantitative measures, an event at which researchers could bring their implementations together to allow comparative subjective testing could have a substantial impact on the field.
The performance of a master-slave teleoperator has aspects that have been described qualitatively as "crispness," "viscosity," "stiffness," and "bandwidth." In the 1980s it was recognized that a useful analogy could be constructed between teleoperators with force feedback and 2-port electrical networks (Raju, 1988; Hannaford, 1989). This was an extension of
the earlier treatment of robots in contact with their environments as 1-port impedances (Hogan, 1985a, 1985b, 1985c). The 2-port model describes and unifies the above qualitative descriptors into a multidimensional measure of transparency.
The 2-port model can be expressed in several forms by different matrices (impedance matrix, Z, admittance matrix, Y, hybrid matrix, H, and scattering matrix, S). Each of these matrices is a useful way to quantify the performance of force-reflecting manipulators. The elements of the various 2-port matrices are dynamic functions of frequency that quantify the stiffnesses, damping, and dynamic properties of the telemanipulators. Each 2-port matrix is a complete description of the teleoperator in a specific configuration, but each matrix is useful for different types of analysis.
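Under one common sign convention (following Hannaford, 1989), the hybrid matrix H relates master force and slave velocity to master velocity and environment force, and ideal transparency corresponds to H = [[0, 1], [-1, 0]]. A simple distance-from-ideal measure built on this can be sketched as follows; the norm choice is illustrative:

```python
import numpy as np

# Sketch of the 2-port hybrid-matrix description. Under one common sign
# convention, [F_master, -V_slave]^T = H [V_master, F_env]^T, and ideal
# transparency corresponds to the hybrid matrix below (sign conventions
# vary between authors).

H_IDEAL = np.array([[0.0, 1.0],
                    [-1.0, 0.0]])

def transparency_error(H, H_ideal=H_IDEAL):
    """Scalar distance of a measured hybrid matrix (at one frequency)
    from the ideal; Frobenius norm used for illustration."""
    return np.linalg.norm(np.asarray(H, dtype=float) - H_ideal)
```

In practice the elements of H are frequency-dependent, so such a measure would be evaluated across the band of interest rather than at a single point.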
A challenging issue arising in many applications is time delay between master and slave sites. This delay ranges from a few milliseconds in the case of computation delays to 10-100 ms delays induced by computer networks, to delays of seconds or more induced by multiple satellite communication links. These delays can induce loss of teleoperator fidelity and task performance as well as dangerous instability of operation (Hannaford and Kim, 1989; Kim et al., 1992). A breakthrough in this area occurred when Anderson and Spong (1988, 1989) developed an approach to making the two-way information channel between master and slave satisfy a passivity constraint regardless of time delay. Passivity quantifies the total energy output of a system and has been used before as a test for stability (Wen, 1988). With Anderson and Spong's approach used for the communication channel, if the master and slave controllers could be shown to be passive, then the system as a whole can be guaranteed passive and thus stable. This analysis assumes as well that the human operator and environment are passive.
The assumptions of passivity for the operator and environment are well accepted and appear consistent with everyday experience: purely mechanical manipulators, which are guaranteed passive, are always stable under human control and actuation. Niemeyer and Slotine (1991) reformulated the passivity approach and explicitly normalized the relation between force and velocity by introducing the characteristic impedance. This work corrected a detail omitted by Anderson and Spong, who implicitly used a value of 1 for the characteristic impedance. Although passivity is an elegant method to prove stability, it provides no way to assess performance. Recent results (Lawn and Hannaford, 1993) have shown that the characteristic impedance of the simulated passive transmission
line induces a trade-off between the series compliance from master to slave and the free motion damping. One or the other must be increased in proportion to time delay, which induces a significant performance penalty. Therefore, the shortcoming of the passivity-transmission line theory is that performance of the system is not addressed.
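The wave-variable formulation of Niemeyer and Slotine can be sketched directly: with characteristic impedance b, velocity and force are encoded as forward and returning waves whose power balance holds regardless of delay, which is what keeps the delayed channel passive:

```python
import math

# Niemeyer-Slotine wave-variable transformation with characteristic
# impedance b:
#   u = (b*v + F) / sqrt(2*b)   (forward wave)
#   w = (b*v - F) / sqrt(2*b)   (returning wave)
# so that the power F*v equals (u**2 - w**2) / 2 identically, making the
# delayed transmission channel passive.

def to_waves(v, F, b):
    u = (b * v + F) / math.sqrt(2.0 * b)
    w = (b * v - F) / math.sqrt(2.0 * b)
    return u, w

def power(v, F):
    return F * v
```

The choice of b is exactly the characteristic impedance whose trade-offs (series compliance versus free-motion damping) are discussed above.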
Recently, robust control concepts have been applied to teleoperator control, starting with earlier work in impedance control (Colgate and Hogan, 1988) and continuing by applying H-infinity optimization of control over a specified bandwidth (Andriot and Fournier, 1991; Kazerooni et al., 1993). This work allows the designer to optimize a controller to minimize distance measures between the 2-port model of the teleoperator and ideal transparency and passivity. This work seems especially promising since it is aimed at simultaneously achieving performance and stability for all loads over a specified bandwidth.
An important extension of force-reflecting teleoperation is the idea of the scaling of mechanical energy between master and slave (Feynman, 1960; Flatau, 1973; Hannaford, 1990, 1991; Colgate, 1991). Recent implementations of this idea have encompassed an astonishing 10^10 range of scale variations, from a 10^9 reduction of the operator's motion between powered joysticks and scanning tunneling microscopes (Hollis et al., 1990; Hatamura and Morishita, 1990) to an approximately 10^1 magnification of human motion in the control of large robots, such as construction or demolition manipulators. In between are applications such as muscle fiber manipulation (Hunter et al., 1990) and microsurgery (Colgate, 1991).
An important issue arises with scaling because of the scaling properties of dynamical systems. If an object is uniformly scaled in linear dimension by L, and some assumptions are made, then the inertia (which depends on volume) can be expected to scale with L^3, the friction (which depends on surface area) with L^2, and the stiffness with L. This nonlinear scaling of the dynamic parameters results in qualitative changes in dynamic behaviors, such as natural frequency and damping ratio, as size is changed. For example, if size is scaled down, natural frequencies will rise. Approaches to correcting this change in dynamics include time domain manipulations and frequency domain approaches (Colgate, 1991). Much research needs to be done in this area since the objects we will manipulate in the micro domain clearly will not be scaled versions of the objects in our own world.
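A quick numerical check of this scaling argument: for a spring-mass pair, the natural frequency is sqrt(k/m), so with stiffness scaling as L and mass as L^3, shrinking by a factor of 10 raises the natural frequency tenfold:

```python
import math

# Under the uniform-scaling assumptions above (mass ~ L^3, stiffness ~ L),
# the natural frequency of a spring-mass pair scales as
#   omega = sqrt(k/m) ~ sqrt(L / L^3) = 1/L,
# so shrinking an object by a factor of 10 raises its natural frequency
# tenfold. Reference values below are illustrative.

def natural_frequency(k, m):
    return math.sqrt(k / m)

def scaled(k, m, L):
    """Scale a reference spring-mass system uniformly by factor L."""
    return k * L, m * L**3
```

The damping ratio shifts for the same reason, since friction follows yet another power of L.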
Finally, the performance of force-reflecting teleoperators is more difficult to assess than in unilateral systems with statically defined inputs and outputs. True system performance depends on visual display quality and manipulator and hand controller capabilities, in addition to the low-level controller's performance. The presence of the human operator introduces a source of variability that must be treated statistically at great cost in experiment time.
There have been a wide variety of measures used to characterize operator performance. Many of these measures, such as time of task completion, accuracy, and error, are similar to those used by production-line industrial engineers and motor-skill psychologists (Sheridan, 1992a). Other measures include Fitts information processing rate (Hill, 1979), peak force, variance in force, and sum of squared forces (Hannaford et al., 1989, 1991). Classes of tasks to which these measures have been applied include:
Calibration tasks, such as tracking a specified path;
Elementary tasks, such as stacking blocks, putting pegs in holes, and threading nuts on bolts; and
Actual tasks, which depend on an application such as assembly of components of a machine.
In many cases, the measures are experiment-dependent and, due to the large variety of experiments humans have been subjected to, there is a large variety of measures. Even in the performance measurement of similar tasks with similar variables sensed, there is a lack of a standard metric (Sheridan, 1992a). For example, a variety of metrics that are a function of the force/torque of interaction have been used to characterize operator performance (Das et al., 1992; Hannaford et al., 1991; McLean et al., 1994). In addition, the tasks themselves need to be standardized so that comparisons can be made between alternative studies.
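As an illustration of such force-based measures, the sketch below computes several of them from a sampled contact-force trace; the exact definitions vary between studies, which is precisely the standardization problem noted above:

```python
# Sketch of force-based operator performance measures (peak force,
# variance in force, sum of squared forces, task time) computed from a
# sampled contact-force trace. Definitions are illustrative; published
# studies differ in the details.

def force_metrics(forces, dt):
    n = len(forces)
    mean = sum(forces) / n
    return {
        "peak_force": max(abs(f) for f in forces),
        "force_variance": sum((f - mean) ** 2 for f in forces) / n,
        "sum_squared_force": sum(f * f for f in forces),
        "task_time": n * dt,
    }
```

Even this simple set raises standardization questions, e.g., whether variance is taken about the mean or about zero, and whether metrics are normalized by task time.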
Related to the measure of human performance from experimental data is the analysis of the data. Statistical evaluation of data in teleoperation experiments has lacked standardization. Often, visual inspection of the means and standard deviations of datasets is used to make conclusions. There are a number of statistical methods available for the processing of data, and other fields of experimental research have adopted standards. It is encouraging to note that this trend is occurring in human performance studies in teleoperation.
In terms of published results, Kim et al. (1987) compared position control versus rate control, taking into account the joystick type (isotonic or isometric), display mode (pursuit or compensatory), three-dimensional
task (pick-and-place or tracking), and manipulator workspace. They found that, regardless of joystick type, display mode, or task, when the workspace is small, position control is superior to rate control, by measures of completion time and accuracy. For larger workspaces or slow manipulators, rate control becomes superior to position control. Das et al. (1992) also found that position control is superior to rate control. In contrast, Zhai and Milgram (1993a) examined a six-dimensional task and concluded that isometric rate control can be as good as isotonic position control.
Isometric (pure force) and isotonic (pure position) control are the extreme ends of a continuum of variable-stiffness hand controllers: infinitely stiff versus infinitely pliant. For intermediate stiffnesses, Zhai and Milgram (1993b) found that elastic rate controllers are better than isometric rate controllers, especially for less well rehearsed tasks. Mention should be made again of a comparable study by Jones and Hunter (1990), first discussed in Chapter 1, who found that increasing the stiffness of a manipulandum decreases the response time. A contrary effect is that high stiffness decreases accuracy, so there is an optimal low stiffness for best accuracy. More recently, they found that increasing the viscosity of the manipulandum decreases the delay and the natural frequency of the human operator (Jones and Hunter, 1993).
The effects of other mechanical properties of manipulanda, i.e., bandwidth, Coulomb friction, and backlash, were investigated by Book and Hannema (1980) using a Fitts' law paradigm. In this paradigm, a motion with accuracy constraints is segregated into a gross motion (approach to a target) and a fine motion (accurately attaining the target). Both Coulomb friction and backlash increased the fine motion time but left the gross motion time unchanged; backlash was the hardest to handle. Decreasing bandwidth increased the gross motion and fine motion times about equally.
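Fitts' law itself relates movement time to an index of difficulty ID = log2(2D/W) for a movement of distance D to a target of width W. A sketch follows; the coefficients a and b are illustrative, not taken from the cited study:

```python
import math

# Fitts' law: movement time MT = a + b * log2(2*D / W), where D is the
# movement distance and W the target width. The intercept a and slope b
# below are illustrative placeholders, not values from Book and Hannema.

def index_of_difficulty(D, W):
    return math.log2(2.0 * D / W)

def movement_time(D, W, a=0.1, b=0.15):
    return a + b * index_of_difficulty(D, W)
```

Fitting a and b to data from a given manipulandum is how properties such as friction and backlash show up as changes in the fine-motion portion of the time.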
Another issue is force reflection versus position control for teleoperators. In general, it is found that force reflection is significantly better than position control (Das et al., 1992; Hannaford et al., 1991).
The term supervisory control derives from the analogy between a supervisor's interaction with subordinate human staff members in an all-human organization and a person's interaction with intelligent automated subsystems. A supervisor of humans gives directives that are understood and translated into detailed actions by staff subordinates. In turn, subordinates collect detailed information about results and present it in summary form to the supervisor, who must then infer the state of the system
and make decisions for further action. The intelligence of the subordinates determines how involved their supervisor becomes in the process. Automation and semi-intelligent subsystems permit the same sort of interaction to occur between a human supervisor and the computer-mediated process (Sheridan and Hennessy, 1984; Sheridan, 1992a).
In the strictest sense, supervisory control means that one or more human operators are intermittently programming and communicating to the computer information about goals, constraints, plans, contingencies, assumptions, suggestions, and orders relative to a remote task, getting back integrated information about accomplishments, difficulties, and concerns and (as requested) raw sensory data. In addition, the operator continually receives feedback from a computer that itself closes an autonomous control loop through artificial effectors and sensors to the controlled process or task environment.
In a less strict sense, supervisory control means that one or more human operators are continually programming and receiving information from a computer that interconnects through artificial effectors and sensors to the controlled process or task environment, even though that computer does not itself close an automatic control loop. The strict and not-strict forms of supervisory control may appear the same to the supervisor, since he or she always sees and acts through the computer (analogous to a staff) and therefore may not know whether the computer is acting in an open-loop or a closed-loop manner in its fine behavior.
Once the supervisor turns control over to the computer, the computer executes its stored program and acts on new information from its sensors independently of the human, at least for short periods of time. The human may remain as a supervisor or may from time to time assume direct control (this is called traded control), or may act as supervisor with respect to control of some variables and direct controller with respect to other variables (called shared control).
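Traded and shared control can be sketched as a simple command-blending rule; the mode and axis names are illustrative:

```python
# Sketch of traded vs. shared control as described above. In traded
# control the human and computer alternate full authority; in shared
# control each commands a subset of the variables (axes). Names are
# illustrative.

def blend_commands(mode, human_cmd, computer_cmd, human_axes=()):
    """Return the command actually sent to the telerobot.

    mode: "manual" (direct control), "auto" (supervisor has turned
    control over to the computer), or "shared" (human controls only
    the axes listed in human_axes)."""
    if mode == "manual":
        return dict(human_cmd)
    if mode == "auto":
        return dict(computer_cmd)
    if mode == "shared":
        out = dict(computer_cmd)
        out.update({k: human_cmd[k] for k in human_axes})
        return out
    raise ValueError("unknown mode: " + mode)
```

Switching mode over time corresponds to traded control; a fixed partition of axes corresponds to shared control.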
Supervisory control has been applied wherever some automation is useful but the task is too unpredictable to trust to 100 percent automation. This currently includes essentially all of what are called "robot" applications. Among the reasons to employ supervisory control are:
improved task performance (both speed and accuracy) and, in the special case of loop time delay, avoidance of instability.
human safety, when the work environment is hazardous.
convenience, when the human must attend to other tasks while the automation is working and assuming the task does not require continuous monitoring.
a means to construct, control, and continuously modify intelligent systems, so as to better appreciate the relation of people to machines.
Generic Paradigm for Supervisory Control
Figure 9-2 illustrates the generic supervisory control paradigm. At the bottom, the tasks represent various material processing machines and transfer devices, each controlled by its own computer. All the computers are controlled from a central control station in which the human supervisor cooperates with a computer to coordinate the control of the multiple automatic subsystems. The supervisor has five functions, which in turn can be subdivided as shown in the upper part of the diagram by the uppercase labels. For each of the main supervisory functions, the computer provides decision-aiding and implementation capabilities, as represented by the separate blocks. These five functions are:
Plan, consisting of (a) modeling the physical system or process to be controlled, (b) deciding on the objective function or trading relation among the various goal states that might be sought, and (c) formulating a strategy, which consists of scheduling and devising a nominal mission profile.
Teach, which means (a) selecting the control action to best achieve the desired goal and (b) selecting and executing the commands to the lower-level computers to make this happen.
Monitor, which means (a) allocating attention appropriately among the various subsystems to measure salient state variables, (b) estimating the current state (arriving at the current belief state), considering in addition to measurements the predicted response based on the previous belief state and the previous control actions, and (c) detecting and diagnosing an abnormality.
Intervene, which means (a) to make minor parameter adjustments to the automatic control as it continues in effect, (b) to reprogram if the system has come to a normal stopping point and is awaiting further instruction, (c) to take over manual control if there has been a failure of the automation, and (d) to abort the process in the case of a major failure.
Learn from experience so as to do better next time.
Status of Research in Supervisory Control
Computer-Based Planning of Telerobot Actions
Many new computer-based aids for planning (for collision avoidance, energy minimization, and other criteria of satisfaction) operate "what-would-happen-if—" trials on system models with hypothetical inputs. Typically, these provide the capability for graphical entry of test commands or trajectories and indicate the projected quality of the results. The recent system of Park (1991) is an example. Park's system allows the
human operator to indicate, on a computer graphic model of the environment (as best it is known), a series of subgoal positions to serve as intermediate points along a multiple-straight-line-segment path for motion of the end-point of the teleoperator. The computer graphic model provides depth cues in the form of lines from the underside of the object to the floor as described above. Another algorithm checks for collisions with the environment by any part of the teleoperator or objects carried by it (again, as best known). As necessary, the algorithm makes small modifications to avoid such collisions. Areas that cannot be seen, or are not known from other modeling information, are considered virtual objects. If an initial trajectory turns out to be unsatisfactory, the operator can try another. The operator can thus evaluate a trajectory before ever committing to an actual motion. If the video camera is moved for a better view, the computer can reduce the virtual objects to more closely conform to actual objects. If, upon actual motion, collisions occur, the model can be updated with the operator's assistance.
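A minimal sketch of such a collision screen follows, sampling each straight-line segment of a candidate end-point path against axis-aligned boxes that stand in for known or virtual (unseen) objects; a real system such as Park's would check the whole arm and any carried objects, not just the end point:

```python
# Sample each straight-line segment of a candidate path and test the
# samples against axis-aligned bounding boxes representing known or
# virtual objects. Purely illustrative: real planners check the entire
# manipulator geometry and use exact, not sampled, intersection tests.

def point_in_box(p, box):
    lo, hi = box
    return all(lo[i] <= p[i] <= hi[i] for i in range(len(p)))

def segment_hits(p0, p1, boxes, samples=50):
    for s in range(samples + 1):
        t = s / samples
        p = tuple(p0[i] + t * (p1[i] - p0[i]) for i in range(len(p0)))
        if any(point_in_box(p, b) for b in boxes):
            return True
    return False

def path_collides(waypoints, boxes):
    """True if any segment of the multi-segment path enters any box."""
    return any(segment_hits(a, b, boxes)
               for a, b in zip(waypoints, waypoints[1:]))
```

Shrinking the virtual-object boxes as better camera views arrive, as described above, simply means passing a smaller box list to the same check.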
Teaching the Telerobot What to Do
Machida et al. (1988) demonstrated a technique by which commands from a master-slave manipulator can be edited much as one edits material on a video tape recorder or a word processor. Once a continuous sequence of movements has been recorded, it can be played back either forward or in reverse at any time rate, and it can be interrupted for overwrite or insert operations. Their experimental system also incorporated computer-based checks for mechanical interference between the robot arm and the environment.
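The editing operations described can be sketched as list manipulations on a recorded command sequence. This is only an illustrative analogy, not the Machida et al. implementation; in particular, the integer stride here only approximates arbitrary playback rates, which would need proper resampling:

```python
# Toy VCR-style editing of a recorded command trajectory: playback
# forward or in reverse at a (coarse, integer) rate, plus overwrite
# and insert splices.  Function names are invented for this sketch.

def playback(recording, rate=1.0):
    """Yield samples at an integer stride; a negative rate plays in reverse."""
    step = max(1, int(abs(rate)))
    seq = recording if rate >= 0 else list(reversed(recording))
    return seq[::step]

def overwrite(recording, start, new_segment):
    return recording[:start] + new_segment + recording[start + len(new_segment):]

def insert(recording, start, new_segment):
    return recording[:start] + new_segment + recording[start:]

rec = [0, 1, 2, 3, 4, 5]     # stand-in for recorded joint-command samples
```

In a real system each element would be a timestamped vector of joint commands, and the interference checks mentioned above would be run over the spliced result before execution.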
Funda et al. (1992) extended the Machida et al. work and the earlier supervisory programming ideas in what they call teleprogramming. Again, the operator programs by kinesthetic demonstration as well as visual interactions with a (virtual) computer simulation. That is, commands to the telerobot are generated by moving the teleoperator master while getting both force and visual feedback from a computer-based model slave. However, a key feature of their work is that instructions to be communicated to the telerobot are automatically determined and coded in a more compact form than record and playback of analog signals. Several free-space motions and several contact, sliding, and pivoting motions, which constitute the terms of the language, are generated by automatic parsing and interpreting of kinesthetic command strings relative to the model. These are then sent on as instruction packets to the remote slave. The Funda et al. technique also provides for error handling. When errors in execution are detected at the slave site (e.g., because of operator error, discrepancies between the model and real situation and/or the coarseness
of command reticulation), information is sent back to help update the simulation. This is to represent the error condition to the operator and allow him or her to more easily see and feel what to do to correct the situation.
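The central idea of teleprogramming, parsing a raw kinesthetic command stream into a compact symbolic instruction stream rather than replaying analog signals verbatim, can be caricatured in a few lines. The two-symbol "language" below (free-space MOVE versus contact SLIDE, chosen by sensed contact force) is a toy stand-in for the richer term set of Funda et al., and all names and thresholds are invented:

```python
# Hedged sketch of teleprogramming-style parsing: raw master-arm samples
# are run-length merged into symbolic instruction packets by motion regime.

def parse(samples, force_threshold=0.5):
    """samples: list of (position, contact_force) pairs.  Returns packets
    of the form (opcode, start_position, end_position), merging
    consecutive samples that share the same motion regime."""
    packets = []
    for pos, force in samples:
        op = "SLIDE" if force > force_threshold else "MOVE"
        if packets and packets[-1][0] == op:
            packets[-1] = (op, packets[-1][1], pos)   # extend current packet
        else:
            packets.append((op, pos, pos))            # open a new packet
    return packets

# Five raw samples compress into three instruction packets:
stream = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.9), (3.0, 1.1), (4.0, 0.0)]
packets = parse(stream)
```

The packets, not the raw stream, are what would be sent across the delayed link to the remote slave, which is what makes the error-reporting round trip described above affordable.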
Computer Assistance in Monitoring
There are various ways the computer can assist the human supervisor: in monitoring the effectiveness of the automation, in getting a better viewpoint, and in detecting and diagnosing any abnormality that may occur.
Das (1989) devised an algorithm for providing a "best" view for teleoperator control within a virtual model. His technique computes the intersection of the perpendicular bisector planes to lines from the end effector to the immediate goal and to the two nearest obstacles. Such a point provides an orthogonal projection of the distances to the obstacles and to the goal. In evaluation experiments it proved useful to have the viewpoint thus set automatically, unless the operator needed to change it often, in which case the operator preferred to set it.
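Geometrically, the intersection of the three perpendicular-bisector planes is the point equidistant from the end effector, the goal, and the two obstacles, which is why distances to them project without foreshortening. A stdlib-only sketch of that construction (solving the three plane equations by Cramer's rule; the helper names are invented) is:

```python
# Sketch of the geometric construction behind Das's "best view": intersect
# the perpendicular-bisector planes of the segments from the end effector
# E to the goal and to the two nearest obstacles.  A point X on the
# bisector of E-P satisfies 2(P - E).X = |P|^2 - |E|^2.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def best_viewpoint(E, goal, obs1, obs2):
    rows, rhs = [], []
    for P in (goal, obs1, obs2):
        rows.append([2 * (P[i] - E[i]) for i in range(3)])
        rhs.append(dot(P, P) - dot(E, E))
    D = det3(rows)                       # assumes the four points are
    X = []                               # not degenerate (D != 0)
    for col in range(3):                 # Cramer's rule, column by column
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        X.append(det3(m) / D)
    return tuple(X)

# With E, goal, and obstacles all on a unit sphere about the origin,
# the construction should recover the origin:
vp = best_viewpoint((1, 0, 0), (0, 1, 0), (0, 0, 1), (-1, 0, 0))
```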
Hayati et al. (1992) and Kim et al. (1993) report a computer-aided scheme for performing telerobotic inspection of remote surfaces. The scheme involves both automation in scanning and capture of video images that appear to reveal flaws as well as graphic overlays and image maneuvering under human control.
Tsach et al. (1983) report a technique for automatically scanning and detecting a discrepancy in dynamic input-output relations relative to a multisubsystem normative model. Actual process measurements of upstream variables are fed to the appropriate model subsystems, and corresponding downstream measurements are compared with outputs of the model. The advantages of their technique are that failures can be detected even during process transients, and failure locations can be isolated.
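A minimal residual check in this spirit can be sketched as follows. Each subsystem has a normative model; the measured upstream input drives the model, and the model output is compared against the measured downstream output, so the scheme works even on transient data. The subsystem names, gains, and data below are invented for the sketch:

```python
# Illustrative model-based failure isolation: flag any subsystem whose
# measured downstream output disagrees with its normative model driven
# by the *measured* upstream input.

models = {"pump": lambda u: 2.0 * u,      # normative input-output relations
          "valve": lambda u: 0.5 * u}     # (toy static gains)

def isolate_failure(measurements, threshold=0.1):
    """measurements: {name: (upstream u, downstream y)}.
    Returns the names of subsystems whose residual exceeds threshold."""
    failed = []
    for name, (u, y) in measurements.items():
        residual = abs(y - models[name](u))
        if residual > threshold:
            failed.append(name)
    return failed

# The valve output disagrees with its model, so it alone is flagged:
flagged = isolate_failure({"pump": (3.0, 6.0), "valve": (6.0, 1.8)})
```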
Intervening and Learning
If the automation fails, if the programmed actions end, or for other reasons, the human supervisor must occasionally intervene to reprogram or to take over manual control. The criteria for doing this, and for which takeover mode is best, tend to be context dependent.
Supervisory learning can be accomplished by keeping track of conditional frequencies (probabilities) and weighting by outcomes, by steepest ascent methods, by building up fuzzy rules and "growing an expert system" in terms of given fuzzy linguistic variables, calling for more evidence in state space regions in which membership is poorest, or by more
rigorous neural net techniques. There is a dearth of research in both the intervening and learning roles of the supervisor; the reasons are elusive.
Predictor Displays to Cope with Time Delay
Control engineers are familiar with the destabilizing effect of loop time delays that occur in space teleoperation due to the limited speed of electromagnetic radiation, or similarly in underwater teleoperation when sonic communication is used (Sheridan, 1993). Predictor displays have been demonstrated (Noyes, 1984; Hashimoto et al., 1986; Cheng, 1991) to reduce task execution time by as much as 50 percent under ideal conditions, namely when: (a) the kinematics and dynamics of the object being controlled can be reasonably modeled; (b) there are minimal interactions with external uncontrollable objects or disturbances that cannot be modeled; (c) movements of the real object can be superimposed on the model movements and both can be seen in the same image plane; and (d) actions occur at a pace slower than human movement and perception. Condition (b) does not obtain in assembly tasks in which external objects are small enough to be moved. However, much as an athlete apparently does in running over rough ground, catching a ball, etc., it is possible to build and calibrate a model of the salient external objects by vision or optical proximity sensing and to use it to predict interaction with the end effector or object under direct control.
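The essence of a predictor display can be shown with a toy loop. In the sketch below (assuming, as in conditions (a)-(d), an exact model and no unmodeled disturbances; the names are invented), the local model integrates operator commands immediately, while telemetry of the real arm arrives only after the round-trip delay, so the superimposed "predicted" pose leads the delayed image by exactly the delay:

```python
# Minimal predictor-display sketch: a local model responds instantly to
# commands; the remote arm obeys the same dynamics but its telemetry is
# seen only after DELAY_STEPS control ticks.

DELAY_STEPS = 5          # round-trip delay, in control ticks

def run(commands):
    model_trace, delayed_trace = [], []
    model = real = 0.0
    link = [0.0] * DELAY_STEPS              # telemetry in flight
    for u in commands:
        model += u                          # predictor: updates instantly
        real += u                           # remote arm: same dynamics...
        link.append(real)
        delayed_trace.append(link.pop(0))   # ...but observed late
        model_trace.append(model)
    return model_trace, delayed_trace

model_trace, delayed_trace = run([1.0] * 10)
```

The operator steers by the `model_trace` curve; the `delayed_trace` (the real video) simply confirms it, `DELAY_STEPS` ticks later.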
Although the dynamics of the computer model used for prediction and planning must be synchronized to the dynamics of the actual task, the use of that model by the human operator need not be synchronized. In other words, an easy maneuver in free space may require no attention by the operator, and as that easy maneuver is occurring he or she may wish to focus attention on a more complex maneuver coming up later, even running the prediction of the complex maneuver in slow time or repeating it. Conway et al. (1987) called this "disengaging time control synchrony using a time-clutch" and "disengaging space control synchrony using a position-clutch." They built and demonstrated such a system.
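A toy time-clutch can make the idea concrete. In this hypothetical sketch (the class and its interface are invented, not Conway et al.'s system), the simulation clock tracks the wall clock while the clutch is engaged; disengaged, the operator can run the predictive simulation ahead at any rate, or in slow time, while queued commands still execute at the real-time pace:

```python
# Toy "time-clutch": decouple simulation time from real time on demand.

class TimeClutch:
    def __init__(self):
        self.engaged = True
        self.sim_time = 0.0
        self.real_time = 0.0

    def tick(self, dt, sim_rate=1.0):
        """Advance real time by dt; the simulation advances at sim_rate
        times real time (sim_rate is ignored while the clutch is engaged)."""
        self.real_time += dt
        self.sim_time += dt if self.engaged else dt * sim_rate

clutch = TimeClutch()
clutch.tick(1.0)                      # engaged: clocks stay in step
clutch.engaged = False
clutch.tick(1.0, sim_rate=2.0)        # preview a tricky maneuver ahead
```

After these two ticks the simulation leads real time by one second of previewed motion, which is exactly the slack the operator exploits.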
Visual and Haptic Aids
Predictor displays may be considered to provide visual and haptic aids to the operator. Whether or not there are delays, other forms of visual and haptic aids may enhance operator performance. Brooks and Ince (1992) suggest a variety of display enhancements:
In zone displays, areas of the visual field that place restrictions on the operator are highlighted, such as limits of slave reachability (due to the slave's own kinematics and to limits in the hand controller range) and imminent collisions. The reachability limits of the master can be changed by reindexing.
Depth cues can be provided by superimposing a perspective grid with markers dropped onto the grid from the end effector and object workpoint, by virtual placement of the camera in a more useful location, or by superimposing graphs representing end effector position and orientation relative to the task.
View enhancements include delineating edges and surfaces that may be poorly visible due to lighting or similar textures, image distortions to highlight positioning errors, and status indication, such as coloring a gripper depending on the grasp stability and hot spots in the environment sensed with a temperature probe on the end effector.
Milgram et al. (1993) also propose virtual pointers to mark points in displays, virtual tape measures to show the distance between points, and virtual tethers between slave endpoint and target point to help guide a motion.
An issue is registration of the graphical image with the real image (Oyama et al., 1993). In addition, the graphical simulation has to be periodically corrected to correspond to the real display because of modeling errors; this process has been called stimulus-response reconciliation (Brooks and Ince, 1992). Rasmussen (1992) emphasized the importance of aligning the visual and kinesthetic reference frames for best performance; concern was expressed for laparoscopic procedures in which vision via an overhead monitor is badly aligned with manipulation (Tendick et al., 1993).
Haptic aids are not as well developed as visual aids. Sayers and Paul (1993) propose the concept of synthetic fixtures, whereby parts mating operations are facilitated by force attractors. Similarly, force-reflecting mice for graphical interfaces have been proposed to mechanically enforce window boundaries, simulate push buttons, and act as attractors (Hannaford and Szakaly, 1992; Kelley and Salcudean, 1994; Ramstein and Hayward, 1994).
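A force attractor of the kind used for synthetic fixtures can be sketched as a spring that engages only within a capture radius around the fixture point (a mating axis, a window edge, a push button). The gains, radius, and function name below are invented for illustration:

```python
# Hedged sketch of a synthetic-fixture force attractor: inside a capture
# radius, a spring-like force pulls the master toward the fixture point;
# outside it, the operator feels nothing.

def attractor_force(pos, fixture, stiffness=50.0, radius=0.2):
    d = [f - p for f, p in zip(fixture, pos)]
    dist = sum(c * c for c in d) ** 0.5
    if dist > radius or dist == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(stiffness * c for c in d)    # pulls toward the fixture

f_near = attractor_force((0.1, 0.0, 0.0), (0.0, 0.0, 0.0))  # captured
f_far = attractor_force((1.0, 0.0, 0.0), (0.0, 0.0, 0.0))   # free
```

A real implementation would add damping to keep the rendered spring stable at the haptic servo rate, but the capture-and-pull structure is the essential point.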
A hypothetical scenario for the use of predictor displays and sensory aids is presented in Chapter 12 for hazardous operations.
Performance Measurement, Learning, and Modeling
Any task performed under supervisory control requires time for the human to program (teach) the operation and then time to monitor while the operation is being executed by the computer. Each of these components takes more time as the complexity of the task increases. For very simple tasks, one might expect direct control to be quicker because instruction of a machine, as with that of another person, requires some
minimum time. For very complex tasks, instructing the computer is likely to be extremely difficult and therefore supervisory control is essentially impossible. In between, as has been shown, the sum of the two times can be significantly less than that for direct human control.
There have been models of supervisory control, but none has captured all of the elements, including those of planning and setting objectives. The Baron et al. (1980) PROCRU model includes continuous sensorimotor skill, attention sharing among displays, decision making, and procedure following and has been applied successfully, for example, to the final approach of aircraft.
In a virtual environment or teleoperation system, the user senses a synthetic environment through visual, auditory, and haptic displays and then controls the synthetic environment or telerobot through a haptic or auditory interface. These input/output (I/O) operations must be fast enough to be effective, whether for control of a device or for presentation of stimuli to the human. How to organize a computer system to handle the computational demands of the many components of a VE or teleoperation system, especially those of the I/O operations, is a challenge for real-time computing.
At the most basic level, computers simply have to be faster and cheaper to be used to compute the necessary algorithms in real time. Such requirements have already been discussed for graphic displays, auditory displays, and information visualization. A more particular requirement is for real-time I/O: often powerful computer systems can compute fast, but they are unable to perform real-time I/O for control and sensing of external devices. For example, the elaborate operating systems of workstations typically do not permit fast and reliable interaction with the external world. Given the diverse components of a VE or teleoperation system, a decentralized computing architecture with an individual computer attending to just one component, such as a haptic interface, seems more appropriate than a monolithic supercomputer. Software and operating systems have to facilitate the programming and interaction of such networks of computers, walking a delicate line between efficiency and features.
Since World War II, real-time computing has been particularly associated with the development of control systems, simulators, and input-display systems, which are the three primary ingredients of VE and teleoperation systems. Many approaches and issues have been addressed in robotics and telerobotics. In fact, telerobotics puts greater demands on real-time computing than does robotics, because, in addition to the control
of a robot, sampling and control of the human interface and the real-time computation of a predictive display are required. That VE and teleoperation systems must comply with the requirements of the human beings with whom they interact puts lower bounds on any measure of performance.
VE and teleoperation systems might include a head-mounted visual display, eye trackers, various limb trackers, haptic interfaces, sound generators, and remotely controlled robots. Each of these components may itself be a complicated system; for example, a telerobot can have many sensors and actuators and complicated end effectors. Not counting the visual data, it can be estimated that such a system should be able to sustain, at the bottom layer of a real-time control hierarchy, several hundred thousand transactions per second, with latencies measured in microseconds. These numbers may not seem high by the standards of today's communication systems, but because of the short latency requirements (ideally a few tens of microseconds at a rate of a thousand samples per second), currently available commercial equipment either can barely keep up or is simply inadequate, often because it is designed for quite different purposes.
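A back-of-envelope tally makes the transaction estimate concrete. The device list, channel counts, and sample rates below are illustrative assumptions for a modest rig, not measurements of any real system; richer instrumentation (more limbs, more robot joints, protocol overhead) multiplies the total toward the several hundred thousand quoted above:

```python
# Illustrative transaction budget for a small VE/teleoperation rig.
# Every entry is an assumption: name -> (channels, samples per second).

devices = {
    "head tracker": (6, 1000),
    "two limb trackers": (12, 1000),
    "haptic interface in": (6, 1000),
    "haptic interface out": (6, 1000),
    "telerobot sensors": (30, 1000),
    "telerobot actuators": (12, 1000),
    "audio control": (8, 100),
}

# One transaction per channel per sample, ignoring the visual data stream:
total = sum(ch * rate for ch, rate in devices.values())
```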
Levels of Control
A telerobot requires a hierarchical control system, with servos and higher-level control operating at several levels. A force-reflecting haptic display or a locomotion display is also essentially a robot with similar control system requirements. The real-time requirements vary considerably, depending on the level of the hierarchy in question. At the lower levels, speed, precision, responsiveness, and predictability are the features that are sought from the real-time control; at the higher levels, flexibility will be the key, along with facilities normally provided by high-level operating systems.
At the lowest level is joint control. In the simplest case, the control design may be approached by considering each joint of a complex robot (or haptic interface) as a single-input/single-output system, with joint torque as input and joint position as output. Manipulators and haptic interfaces are made of a collection of actuated and passive joints organized into a kinematic structure. Position, force, and relationships between these quantities and their derivatives for the entire manipulator or
haptic interface will then become the variables of interest. From a systems perspective, the manipulator or the haptic interface will then be viewed as a multiple-input/multiple-output system, as seen from a higher level in a control hierarchy.
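The single-joint view can be sketched as a PD position loop run for every joint of the arm, which is then the multiple-input/multiple-output system seen from above. All gains, the unit inertia, and the double-integrator joint dynamics below are toy values for illustration:

```python
# Sketch of the lowest control level: each joint is an independent
# single-input/single-output PD loop; stepping all joints together is
# the MIMO view of the whole manipulator (or haptic interface).

def pd_torque(q, qd, q_ref, kp=100.0, kd=20.0):
    """Single-joint PD law: torque from position error and joint velocity."""
    return kp * (q_ref - q) - kd * qd

def step_arm(q, qd, q_ref, dt=0.001, inertia=1.0):
    """One servo tick for every joint of the arm (semi-implicit Euler)."""
    torques = [pd_torque(qi, qdi, ri) for qi, qdi, ri in zip(q, qd, q_ref)]
    qd = [qdi + (t / inertia) * dt for qdi, t in zip(qd, torques)]
    q = [qi + qdi * dt for qi, qdi in zip(q, qd)]
    return q, qd

# Five seconds of servo ticks at 1 kHz drive a two-joint arm to its setpoint:
q, qd = [0.0, 0.0], [0.0, 0.0]
for _ in range(5000):
    q, qd = step_arm(q, qd, q_ref=[1.0, -0.5])
```

With these toy gains the loop is critically damped; coupling between joints, gravity, and friction, ignored here, are what the feed-forward dynamic compensation discussed next exists to handle.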
For a medium-size manipulator, servo rates will be on the order of a few hundred to several thousand samples per second, which translates into 50 kflops to 5 Mflops, depending on the control algorithm (Ahmad, 1988). Feed-forward dynamic compensation, for medium-size manipulators, can be used effectively with rates as low as 20 updates per second; nevertheless, the computational burden is significant (Hollerbach, 1982; Izaguirre et al., 1992). For micromanipulators, these numbers are much higher: on the order of Gflops in extreme cases (Yoshichida and Michitaka, 1993).
Intermediate Level Control
Higher still in the hierarchy, not only will the state of the system at one instant in time be considered, but also entire state trajectories. The calculation of these trajectories will in turn impose an additional computational demand, which depends on the number and the nature of the constraints that need to be enforced. As a general rule, each level of a control hierarchy must handle subsystems that grow more complex as one moves up the hierarchy; if the design is correct, however, their complexity is hidden by the lower levels.
At the higher levels, the control of the system is described in terms of general directives that cover a complete task or subtask. Here the time scale is measured in seconds, minutes, or hours. At such a level, the real-time control requirements are described only in general terms. The higher up in the hierarchy, the more the features offered by conventional operating systems found on engineering workstations will supply the required services for VE and teleoperation systems. The requirements are expressed in terms of powerful programming environments, languages, and ordinary system services, such as data storage and retrieval (Hayward and Paul, 1984).
Latency Versus Update Period
The latency is affected not only by calculation time, but also by input/output transactions between computing units and peripheral devices. These operations may include analog to digital conversion, data transfer from peripheral devices, data formatting and conversion, safety checks, and so forth.
The cardinal rule of digital control systems is the minimization of latency, defined as the time that elapses between the moment a measurement is taken and the moment the resulting signal, after the relevant calculations, is fed back to the system being controlled (in general, to the actuators). In fact, it can be readily seen that a digital control system with a latency larger than the basic response time of the system being controlled will fail at its task: at each update, the new control signals will be based on measurements made in the past, reflecting a state of the system that is no longer relevant.
The update period is simply the time between two successive outputs of control signals. Notice that parallel processing can always be used to achieve a very small update period, but the effect on latency is limited by the minimum number of sequential operations that have to be performed on the inputs to calculate the outputs (Lathrop, 1985). An experimental study (Rizzi et al., 1992) demonstrates the effect of a 40-fold increase in either latency or update period on the performance of a tracking controller. The increase in update period from 1 ms to 40 ms has a very minor degrading effect on tracking performance and stability. The 40-fold increase in latency completely destabilizes the system.
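The asymmetry between latency and update period can be illustrated numerically. The sketch below is not the Rizzi et al. experiment; it is an invented example (an integrator plant under proportional feedback, with toy gain and step size) showing that the same loop that converges with fresh measurements diverges when fed measurements of equal age to their 40 ms latency:

```python
# Toy demonstration that pure latency destabilizes a loop that a fast
# update rate would otherwise keep well behaved.

def simulate(delay_steps, dt=0.001, gain=100.0, steps=1000):
    """Plant x' = u, control u = -gain * x measured delay_steps ticks ago."""
    history = [1.0] * (delay_steps + 1)   # state starts at 1.0
    for _ in range(steps):
        u = -gain * history[0]            # oldest stored measurement
        x = history[-1] + dt * u
        history = history[1:] + [x]
    return abs(history[-1])

fresh = simulate(delay_steps=1)       # 1 ms of latency: error decays
stale = simulate(delay_steps=40)      # 40 ms of latency: error grows
```

In continuous-time terms the loop is stable only while gain times delay stays below pi/2; the second run sits at 4, well past that margin, which is why it blows up.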
One way to counteract a large latency is for the control system to predict future states of the system based on previous measurements. Of course, precise prediction of the state is computationally demanding, and this will result in even larger control latency, possibly leading to worse performance than a simpler controller well matched to the response of the controlled system. The conclusion is that if latency can be avoided, it must be, even if that means simpler control algorithms.
For low-level control, latency requirements for manipulators and haptic interfaces are identical to the sampling period, which is of the order of a few tens to hundreds of microseconds. For intermediate-level control, latency requirements range from a few milliseconds to a few seconds. For high-level control, latency requirement can vary from a fraction of a second to a few seconds.
By system architecture for real-time computing is meant the type and number of processors, the connectivity and communications between processors, the input/output interfaces, the operating systems, and the development tools.
Processors and Computing Platforms
Major processor architectures can be classified as CISC (complex instruction set computers), RISC (reduced instruction set computers), VLIW
(very long instruction word), superscalar, and DSP (digital signal processors). Table 9-1 lists some of the main commercial examples. At the moment, some of these processors are more prevalent and dominant than others; Juliussen (1994) discusses their relative strengths and future commercial viability. Some of the CISC and RISC processors have been designed specifically for personal computers (PCs), such as the 486s and the Pentium; others are incorporated into both PCs and workstations, such as the Alpha, the PowerPC, and MIPS. Processors such as Transputers and the Texas Instruments TMS320C40 (C40) DSP chips are not meant for general-purpose computing platforms, but for fast numerical computation and I/O. The scenario of a general-purpose computing platform acting as a front end to a network of Transputers or C40s that do the I/O and real-time computing has become popular in robotics and may well serve as an example for real-time VE computing.
There is a tendency to discuss only workstations for VE and teleoperation systems, primarily because of graphics and networking, but PCs are approaching workstations in power and are already heavily used in real-time computing. The outcome of the much-discussed battle, or convergence, between PCs and workstations will have a major impact on the nature of real-time systems in general and on VE and teleoperation systems in particular. Because of the generally lower costs of PCs and associated software, their development will facilitate the spread of VE and teleoperation systems.
In turn, the development of PCs is threatened from below by the advances of more powerful video game players and set-top interactive TV players. Planned game players from Sega and Nintendo will soon have comparable computational power to the fastest desktop computers. The addition of typical PC applications software such as spreadsheets, word processing, and electronic mail is likely in the near term and will convert such TV players into more general-purpose computers. With the combination of video, sound, input devices, and fast computation, these TV players could represent a formidable computational force in VE. Given
the huge market forces driving these developments, costs of such systems would run at several hundred dollars and severely threaten PC- or workstation-based VE systems.

TABLE 9-1 Categorization of Some Advanced Processors
Parallel Processing and Communications
To run processes faster and meet real-time constraints, parallel processing can be advantageous up to a point. That point is defined by losses in throughput and increases in latency due to communication, and by limitations in the ability to parallelize a computation. When there are 10 or more processors to coordinate, it can become a formidable task to avoid conflict and to ensure timely performance. In such cases, it may be best to upgrade to a faster processor to reduce the number required.
The main approaches toward connecting processors include a common bus, point-to-point links, and crossbars. A common bus is the most traditional approach; industry standards with a roughly equivalent performance of 40 to 80 Mbytes/s include VME Bus, MultiBus II, FutureBus, EISA, NuBus, SBUS, and PCI Bus. Each computing unit is equipped with fast local memory, and a global memory bank is made available to all computing units (Bejczy, 1987; Chen et al., 1986; Clark, 1989; Narasimham et al., 1988). The advantages of such an architecture include modularity, easy expansion for more performance, and the possibility to mix hardware from different sources (Backes et al., 1989). The disadvantages obviously lie in the bottleneck created by the two shared resources: the global memory and the system bus.
Point-to-point links circumvent the bottleneck problem of bus-based architectures to some extent: communication between connected points will be fast, but that between distant nodes in a graph will be slow. Such built-in links are used in most engineering workstations for graphic processors, sound processors, memory access, disk controllers, etc. The advantage is a design of minimum cost, but the system, once configured, offers few possibilities for expansion.
Some developers have proposed high-performance central processing units (CPUs) equipped with built-in high-speed point-to-point links (1-20 Mbytes/s). Following this philosophy, Inmos from Europe (now a division of SGS-Thomson) introduced the Transputers series (T800, T9000), which can loosely be classified as RISC computers. These computers are each fitted with four communication channels, which permit the user to design a variety of coarsely parallel architectures (Buehler et al., 1989; Zai et al., 1992). Recently Texas Instruments combined the Transputer concept with its experience in DSP design leading to the C40 with six high-speed links, thus competing head-on with the latest Inmos Transputer T9000. The principal disadvantage of point-to-point communication
systems, such as those made available with Transputers or C40s, is the lack of industry standards for signal conversion and data acquisition hardware. Whereas such hardware is abundantly available for most bus standards, it is only beginning to become so for point-to-point communication systems, requiring much custom electronic design from system implementers.
The recent availability of these powerful processors, which lend themselves to parallel processing, together with commercially available development and run-time software, has virtually eliminated the need to build custom computing platforms and develop custom multiprocessing environments. In the past, this was necessary and consumed substantial investment in finances and personnel. Examples are the CHIMERA II (Stewart et al., 1989) and Condor (Narasimham et al., 1988) systems. Even though most of these custom systems relied increasingly on standard commercial boards, the efforts in upgrading, to stay compatible with the host operating system as well as the latest run-time boards, constituted a major ongoing investment that fewer and fewer labs are willing to undertake. The same availability of powerful processors applies to VE and teleoperation systems. This is very fortunate, since it means that custom development efforts can be focused on the critical I/O interface, a current bottleneck that may be better served by commercial suppliers in the future.
The most generic architecture consists of a full N × M crossbar switching network connecting a set of CPUs and register files totalling N inputs and M outputs. This architecture has the advantage of being capable of reaching the theoretical optimum performance, which minimizes latency and maximizes throughput. An example is a system for control of a complete microrobot system (Hunter et al., 1990; Nielson et al., 1989). The disadvantage is the massive complexity and high cost of the system.
Although considerably slower, it is also possible for processors to communicate across networks. Researchers have become interested in running robotic experiments or virtual environment setups across a computer network with nodes located in different cities or even in different continents. In one example, robotic sessions are run among a network of sites across the United States (Graves et al., 1994); in another, a telerobotic experiment has been performed between Japan and the United States (Mitsuishi et al., 1992). This type of work is bound to be increasingly investigated given the high potential for applications. In effect, with such systems, the services of specialists in any area could be requested and put to use without asking the specialists to travel where they are needed. Clearly, the demand on communication networks will follow the same route that is being taken by low-level real-time control systems: high volumes of data transactions, low latency and time precision. It must be
expected that the performance levels required at the scale of a laboratory site or an application site will need to become available across entire networks.
Operating Systems and Development Environments
A large number of real-time operating systems are currently in competition to provide the services needed to implement high-performance real-time applications (Gopinath and Thakkar, 1990). These operating systems (OS) may be classified as embedded, full-featured real-time, or hybrid.
An embedded operating system is targeted at original equipment manufacturer (OEM) applications, and as such provides only essential facilities, like a real-time executive (real-time multitasking, memory management, and device I/O). It is usually a simplified system offering no support for a file system and must be used as a cross-development tool. Yet most offer hard real-time features essential to VE and teleoperation systems. In this category, SPOX (Spectron Microsystems) has emerged as one of the industry standards.
A full-featured real-time OS resembles the more standard UNIX but with additional real-time features. An example of this category is HELIOS by Distributed Software Ltd., which offers an eight-level priority real-time scheduler, virtual message routing, X and Microsoft Windows graphic support, as well as SUN and PC host interfaces. Clearly, processing nodes with microsecond interrupt latency requirements can—and must—be stripped of the inherently slow high-level functionality.
In between embedded and full-featured real-time operating systems are the hybrids, the most popular of which is probably VxWorks by Wind River Systems. These typically are real-time operating systems running on dedicated shared-memory VME processor boards and provide a transparent interface to the host UNIX operating system. Despite its high cost ($10,000-20,000) VxWorks has been attractive for its convenient interface to existing UNIX hosts, debugging and development facilities, and the availability of a large number of supported processor and I/O VME cards.
Nevertheless, many researchers are quite dissatisfied with the services provided by commercial operating systems, usually for reasons of cost and inefficiency. Because these operating systems are designed with general purpose requirements in mind, the few requirements essential to VE and teleoperation systems are not well addressed, and the resulting systems are cumbersome to use and clumsy in their design. For these reasons, countless custom operating and development systems have been designed and implemented, with their scope limited to one or to a handful of laboratories.
Experience shows that no single operating system (and no single CPU technology) will satisfy the varied needs of VE and teleoperation systems. This is why real-time extensions to the UNIX systems will most likely remain limited to the higher levels of the hierarchy (soft real time). For low-level control and data acquisition, no single run-time environment has been shown to be satisfactory to the community. HELIOS is probably one of the most suitable operating systems for VE and teleoperation systems to date. It provides all the UNIX-like services on a distributed real-time network, while allowing one to "peel away" the higher levels of the operating system on the part of the network dedicated to time-critical code. In practice, the response is mixed: some users report sufficient control at the device level, others do not (Poussart, personal communication). Systems like HELIOS are clearly going in the right direction and might be the approach of choice for VE and teleoperation systems in the future. To obtain full processor control on a network of only a few processors, enhanced C compilers, like 3L C by 3L Ltd., are minimal systems, adequate for providing standard host I/O, memory management functions, and multiprocessing facilities (network loading, communication management, load balancing, message routing, micro-kernel for multitasking).
A major trend in telerobotics is toward higher-level supervisory control, that is to say, toward autonomous robotics (robotics for short). Major drivers in this direction include: (1) communication problems (e.g., delays, low bandwidth, low resolution), especially in space and undersea, (2) burden lifting from the operator (e.g., partially automating repetitive tasks), and (3) performance enhancements (e.g., using local sensing when vision is obscured, assisting in obstacle avoidance). Hence the needs in telerobotics are often the same as the needs in robotics. The line between supervisory control and robotics is quite blurred.
At the same time, manually controlled telerobots remain important. One reason is safety, particularly in space and medicine applications, due to uneasiness about the loss of control. For example, developers of the planned space station are concerned that the telerobot not damage the space station structure. Another reason is the inability of robots to handle unstructured environments, which obviates the possibility of even partially automating a task—the reason for having telerobots in the first place.
In the control of telerobots, the issues of haptic interfaces, computer-generated environments, and real-time systems are important. Although predominantly covered in other sections of the report, these issues are addressed here to some extent in discussing various needs in telerobotics.
Handling Communication Delays
In space applications, the sentiment for ground-based teleoperation is increasing, for a number of reasons. First, for the planned space station, the actual amount of time spent by astronauts in orbit will be relatively short, and of that time the fraction that could be devoted to teleoperation is shorter yet. Because internal vehicular robots (IVRs) and external vehicular robots (EVRs) will be present the whole time, they would be used more efficiently if operated from the ground when astronauts are not attending. Second, humans are much less effective in space than on the ground; weightlessness seriously affects concentration and attention. Third, sending humans into space is expensive and dangerous.
To communicate from the ground, several satellite and computer systems must be traversed. The result is variable delays on the order of 5 s. There is also a need to funnel commands through a central computer at the Johnson Space Center, not only to share the communication channel but also to verify the commands. Again, safety is an overriding concern. The trend is to devolve ground control from Houston to other sites; for example, the recent ROTEX experiment (Brunner et al., 1993; Hirzinger et al., 1993) involved control of this German-built telerobot on the space shuttle from Karlsruhe. The Canadian Space Agency is interested in controlling Canadian-built telerobots from its main facility in St. Hubert, Quebec. Other groups will wish to perform scientific experiments in space from their home locations on the ground; the word telescience has been coined to describe this activity. This devolution will accentuate the delays.
Time delays are also important in undersea teleoperation. To avoid power and data tethers, communication with submersibles must rely on relatively slow acoustic (sonar) signals. For example, at a distance of 1,700 m, sound transmission imposes a round-trip delay of 2 s (Sheridan, 1992a).
There are two main approaches to handling teleoperation with time delays: (1) control theory approaches that incorporate time delays and (2) predictive displays. In general, the greater the delay, the lower the system gains (e.g., stiffness and viscosity) must be in order to avoid instability. Thus the system response slows down and performance suffers, to an extent that depends on the controller. Recently developed controllers based on passivity approaches guarantee stability but seem to suffer in performance because their gains are too low (Lawn and Hannaford, 1993). Exactly how to formulate a controller that addresses both stability and performance, under various time delays, needs further work.
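The interaction between delay and gain can be seen in a minimal simulation of a single-axis servo whose feedback crosses a communication link. The plant, gains, and 0.25 s delay below are illustrative assumptions, not parameters from the cited studies:

```python
def simulate(K, delay=0.25, dt=0.001, T=10.0, m=1.0, b=2.0):
    """Damped unit mass whose restoring force is computed from
    *delayed* position measurements, as when the feedback loop
    crosses a communication link.  Returns the peak displacement
    over the final 2 s of the run."""
    n, d = int(T / dt), int(delay / dt)
    x, v = 1.0, 0.0                   # start displaced from the setpoint
    hist = [x] * d                    # position samples "in transit"
    tail = []
    for i in range(n):
        hist.append(x)
        f = -K * hist.pop(0)          # controller sees stale position
        v += (f - b * v) / m * dt
        x += v * dt
        if i >= n - int(2.0 / dt):
            tail.append(abs(x))
    return max(tail)

low = simulate(K=4.0)    # modest gain: response decays despite the delay
high = simulate(K=20.0)  # same delay, stiffer gain: oscillation grows
```

With zero delay, both gains would be stable; the 0.25 s delay forces the gain (and hence the achievable stiffness) down, which is the performance penalty described above.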
Force reflection under delays is more problematic than position control. For delays approaching a second or more, position control becomes superior to force control (Lawn and Hannaford, 1993). In fact, humans are sensitive to extremely small delays in terms of fidelity of force reflection (Biggers et al., 1989). One solution is to implement a force controller, such as an impedance controller, on the slave side and a position controller on the master side. There is then no force feedback to the operator, and the slave force controller is presumed capable of handling the interaction forces. This requires that the slave force controller be appropriately tuned from knowledge of the environment and the task, which can come either from a priori information about the task or from measurements made on the environment. Several kinds of information are therefore needed: (1) whether to use position control or force reflection as a function of delay and task; (2) how to configure the slave manipulator's force controller as a function of a priori knowledge of the task; and (3) how to identify the environmental characteristics through slave robot sensing, if a priori knowledge is not available. The last two needs are essentially robotics problems.
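The slave-side scheme can be sketched for one degree of freedom. The impedance gains, wall stiffness, and spring-model contact below are illustrative assumptions; the point is that the rendered impedance, not the environment's stiffness, bounds the contact force:

```python
def impedance_contact(x_d=0.05, K_d=200.0, B_d=200.0, M_d=1.0,
                      k_wall=1e4, dt=1e-4, T=2.0):
    """Slave-side impedance controller commanded to a position x_d
    that lies *inside* a stiff wall at x = 0.  The controller renders
    the impedance M_d, B_d, K_d; the wall is modeled as a stiff
    one-sided spring.  Returns the steady contact force."""
    x, v = -0.05, 0.0                          # start in free space
    for _ in range(int(T / dt)):
        f_env = -k_wall * x if x > 0.0 else 0.0   # wall pushes back
        a = (K_d * (x_d - x) - B_d * v + f_env) / M_d
        v += a * dt
        x += v * dt
    return k_wall * max(x, 0.0)

force = impedance_contact()   # settles near K_d * x_d = 10 N,
                              # not the k_wall * x_d = 500 N a pure
                              # position servo would try to exert
```

Because the commanded penetration is absorbed by the rendered stiffness K_d, a delayed or inaccurate position command from the master produces a bounded force error rather than a destructive one.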
For delays greater than 1-2 s, direct manual control becomes ineffective without the aid of predictive displays (Sheridan, 1992a). The main issue then becomes how well the remote manipulator and environment are simulated. For known, structured environments, good simulations of the geometric aspects are quite feasible. Force control can also be simulated, but difficulties arise from closed-chain dynamics and arbitrary contact conditions. In ROTEX, the predicted forces were found to be quite close to the forces actually measured in the experiment.
Even in the case of ROTEX, there are small misalignments and inaccuracies, which are accommodated by local sensing. Force, proximity, and tactile sensing are fused to correct such deviations. For less structured problems, a model of the environment must be built using vision and other sensors. This generic problem in robot vision and mobile robots is far from solved. One envisions that a robot would spend some time mapping its immediate environment, from which a simulation model is extracted; the operator then controls the robot through this simulation model.
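The principle of controlling through a simulation model can be sketched as follows. The first-order robot model and the simple command pipeline are illustrative assumptions, not the ROTEX implementation; the key property is that an accurate local model shows the operator now what the robot will do one delay from now:

```python
from collections import deque

def predictive_display(commands, d=20, k=0.2):
    """The operator watches a local first-order model that responds
    to commands immediately, while the remote robot receives each
    command only d steps later.  Returns the displayed (predicted)
    and actual robot trajectories."""
    model, robot = 0.0, 0.0
    pipe = deque([0.0] * d)            # commands in transit to the robot
    display, actual = [], []
    for u in commands:
        model += k * (u - model)       # local prediction, no delay
        pipe.append(u)
        u_remote = pipe.popleft()      # command arriving after d steps
        robot += k * (u_remote - robot)
        display.append(model)
        actual.append(robot)
    return display, actual

cmds = [1.0] * 100                     # operator holds a step command
display, actual = predictive_display(cmds)
# With an exact model, display[i] equals actual[i + d]: the operator
# sees the robot's future, and misalignments appear only when the
# model and the real environment disagree.
```

When the model is imperfect, the delayed measurements would be compared against the model's own history to correct the drift, which is where the local force, proximity, and tactile sensing mentioned above comes in.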
Accurate, Real-Time Simulations
Besides predictive control, other uses for telerobotic simulations are training and mission development. Training simulators prevent damage from being done in an operator's learning phase (e.g., learning a surgical procedure), free expensive telerobotic equipment for actual use (e.g., teleoperation of forestry harvesting equipment), and prepare trainees for equipment that is available only at the remote site (e.g., underwater robots) or that cannot function under normal conditions (e.g., space robots that cannot lift themselves against gravity). Mission development involves
extensive simulation to work out operational scenarios; this is particularly important in space.
Three needs, which are similar for virtual environments, are reemphasized here for telerobotics.
Construction of a simulated environment. This involves the software tools for representing real environments, creating a particular virtual environment, and sharing VE modules. As mentioned above, it would also be desirable to "reverse engineer" a real environment into a simulation, by using visual and other sensory recognition and representation methods (see, e.g., Oyama et al., 1993). This is a hard problem, especially if object attributes other than geometry (space occupancy) are to be addressed, such as mass and surface properties.
Accurate representation of task dynamics. Although the laws of physics have been around for a long time, we often do not understand the mechanical interactions between objects in sufficient detail to simulate them. A very basic example is the motion of three-dimensional objects under friction, such as sliding, and in collisions (Mason, 1989). The field of robotics is just beginning to enumerate how such objects are expected to behave and to develop appropriate concepts.
Another difficulty is representing closed-chain systems under arbitrary contact conditions. Examples are locomotion on different surfaces, the behavior of the operator's hand tissues in the interaction with the haptic interface, and the behavior of the remote manipulator with such compliant environmental substances as human tissues (in surgical applications) and rock or soils (in mining applications). Furthermore, there is an issue of experimentally verifying such task dynamics. The particular robot sensing also needs to be simulated (Brunner et al., 1993).
In the continuous domain, accurate finite element models are required, which may be nonlinear. They will require that the constitutive equations for real objects be known or measured. Abrupt changes in boundary conditions, such as simulating the cutting of tissue, are difficult to represent.
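Even the most elementary instance of the frictional dynamics mentioned above, a block sliding to rest under Coulomb friction, already mixes continuous motion with a discrete stick transition where the friction force becomes set-valued. A minimal sketch, with illustrative parameters:

```python
def sliding_distance(v0=1.0, mu=0.5, g=9.81, dt=1e-4):
    """Distance a block slides before sticking under Coulomb
    friction.  Analytically this is v0**2 / (2 * mu * g); the
    simulation must explicitly handle the slip-to-stick switch,
    since naive integration would chatter around v = 0."""
    x, v = 0.0, v0
    while v > 0.0:
        x += v * dt
        v = max(0.0, v - mu * g * dt)   # kinetic friction, then stick
    return x

dist = sliding_distance()   # ≈ 1.0 / (2 * 0.5 * 9.81) ≈ 0.102 m
```

In three dimensions, with rotation, anisotropic contact, and impacts, the corresponding switching structure is far harder to enumerate, which is the difficulty the paragraph above refers to.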
Real-time computation. In addition to generating graphical images, there are substantial difficulties in real-time computation of a simulated environment. Often classical multibody simulation is more concerned with accurate long-term integration of initial value problems than with computational efficiency. Other intensive computations are simulating and detecting collisions and real-time calculation of finite element models. Needs in real-time computing are discussed in greater detail below.
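Of the computations above, collision detection illustrates the standard efficiency device: a cheap broad phase over bounding volumes prunes the pairs that an exact narrow phase must examine. A sketch using axis-aligned bounding boxes with a sweep along one axis (an illustrative formulation, not a specific library's algorithm):

```python
def aabb_overlap(a, b):
    """a, b: ((xmin, ymin, zmin), (xmax, ymax, zmax)) boxes."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))

def broad_phase(boxes):
    """Sweep-and-prune on the x axis: sort boxes by xmin, test only
    pairs whose x intervals overlap, then confirm on y and z.
    Returns candidate colliding pairs as (i, j) index tuples."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0][0])
    pairs = []
    for idx, i in enumerate(order):
        for j in order[idx + 1:]:
            if boxes[j][0][0] > boxes[i][1][0]:
                break                  # no later box can overlap i on x
            if aabb_overlap(boxes[i], boxes[j]):
                pairs.append(tuple(sorted((i, j))))
    return pairs

boxes = [((0, 0, 0), (1, 1, 1)),
         ((0.5, 0.5, 0.5), (1.5, 1.5, 1.5)),   # overlaps the first
         ((3, 3, 3), (4, 4, 4))]               # well separated
pairs = broad_phase(boxes)
```

Only the surviving pairs need the expensive exact geometry test, which is what makes per-frame collision checking tractable in a real-time simulation loop.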
The multitude of challenges in real-time computing for VE and
teleoperation systems can be successfully addressed only with computing solutions offering high performance, ease of use, reconfigurability, expandability, and support for massive and fast I/O. Coarse-grained parallel systems, based on the most powerful computational nodes available, which communicate via high-bandwidth, integrated communication links, are the most promising contenders to satisfy all requirements. This has been recognized by academia and industry alike and explains the early strong support for the Transputer processor family and recently the sweeping popularity of C40-based systems.
In this setting, a high-speed interprocessor communication standard would be highly desirable, in a fashion similar to existing bus standards. It would offer users an immense degree of flexibility, since computational nodes could be mixed and matched across vendors and processor types. Naturally, such a communication standard must go hand in hand with a physical module standard, akin to Texas Instruments' TIM-40 modules.
Currently, code development is done on traditional host systems connected to dedicated run-time systems. This typically results in a bottleneck at the interface between the two. In addition, communication between run-time code on the dedicated architecture and host programs is typically problematic, owing to differences in development tools, architecture, and response time. Under the emerging paradigm of coarse-grained parallelism based on standardized communication, the boundary between development and run-time hardware could become transparent. Since the host architecture itself could be based on the same paradigm, communication between host (development system) and controller (run-time system) could be fast and flexible, as numerous communication links could exist between the two. Furthermore, the boundary could be placed dynamically, depending on the requirements of each system.
If the input/output devices for control, teleoperation, and VE hardware adhered to the same communication standard, much effort in system development could be eliminated, and progress toward powerful real-time teleoperation and VE systems would accelerate substantially. It is likely that, with the increased interest in telerobotics and VE, there will be sufficient thrust to finally address the I/O standardization issue, in addition to reducing cost through volume. It should be reemphasized that, in contrast to I/O system interfacing, custom software development for real-time multiprocessing systems is no longer necessary; the effort should instead go toward identifying the most suitable commercial products and toward industry-academia collaboration to develop one or several compatible standards.
Better Robot Hardware
A goal in teleoperation is that performing a task with a remotely controlled manipulator should feel nearly indistinguishable from directly performing the task with one's own limbs. Shortfalls in this goal are primarily due to the mechanical hardware, both on the master and slave sides. Often, limited-performance masters are hooked up to limited-performance industrial robots. Although comparisons of different master-slave control strategies have been published for these systems, it is unlikely that the results generalize beyond the evaluation of these specific low-performance systems.
On the slave side, robot arms and hands are required with sufficient dexterity and responsiveness to match that of the human arm. To achieve such devices requires improvements in actuators, sensors, and structures. Some particular needs are discussed below.
Multiaxis, High-Resolution Tactile Sensors
The robot must have a sense of touch comparable to a human's, even before considering how to transmit the sensation to an operator. Tactile sensors continue to be a problem in robotics: hardly any robots have them, and those that do sense touch only coarsely. Although interesting designs have been proposed, few have yet been realized in usable form. Some robotics researchers are working with tactile physiologists to look for correspondences between physiological and robotic designs, for example, the use of accelerometers to act like Pacinian corpuscles and of piezoelectric strips to sense shear (Cutkosky and Hyde, 1993). Attention to robot skin mechanics is also important, for example, the role of fingerprint-like ridges as stress amplifiers.
Robust Proximity Sensors
As a step toward supervisory control, small adjustments in grasping or an approach to a surface should be performed with local sensing. The intelligence required is much less than for the general case of autonomous control. A highly sophisticated gripper on ROTEX (Hirzinger et al., 1993), incorporating proximity, tactile, and force sensing, was the key to the success of the predictive display control. Proximity sensors have typically been ultrasonic, electromagnetic, or optical. All have limitations in range, accuracy, and sensitivity to different surface conditions. Visual time-of-flight systems are not yet practical because of complicated electronics, but they could be a future solution.
Multiaxis Force Sensors
Multiaxis force sensors would typically be mounted on a robot wrist, to measure the net force and torque exerted on the end effector. Miniature force sensors would also be useful on finger segments, to control fingertip force accurately. Key issues at the moment are cost, robustness, and accuracy. For wrist sensors, inaccuracies of a few percent due to cross-coupling between axes are typical and problematic. A possible solution is based on magnetic levitation.
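The standard remedy for cross-coupling is a calibration matrix fitted from known loads. A two-axis toy version is sketched below; real wrist sensors have six axes and use a least-squares fit over many loads, and the sensor matrix here is hypothetical:

```python
def matvec(M, v):
    """Multiply a small matrix (tuple of rows) by a vector."""
    return tuple(sum(M[i][j] * v[j] for j in range(len(v)))
                 for i in range(len(M)))

def calibrate(raw, forces):
    """Solve for the 2x2 calibration matrix C with C @ raw = forces,
    where the columns of raw and forces are paired calibration
    measurements.  Uses an explicit 2x2 inverse."""
    (a, b), (c, d) = raw
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    return tuple(tuple(sum(forces[i][k] * inv[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

# Hypothetical sensor with a few percent cross-coupling between axes:
S = ((1.0, 0.05), (0.03, 1.0))
# Raw gauge readings under two known loads, (10, 0) N and (0, 10) N,
# i.e., the columns of S scaled by 10:
raw = ((10.0, 0.5), (0.3, 10.0))
C = calibrate(raw, ((10.0, 0.0), (0.0, 10.0)))
# Applying C to any subsequent raw reading now cancels the coupling.
```

With more calibration loads than axes, the same idea becomes an overdetermined least-squares problem, which also averages out gauge noise.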
Robot limbs should be strong and fast yet able to interact gracefully with the environment. Better actuator and transmission designs are the key, and improvements in electric, hydraulic, and pneumatic drive systems are required. Novel actuators such as shape memory alloy and polymeric actuators look promising in terms of force-to-weight ratio; the result might be lightweight, responsive limbs that accurately track human motion commands and faithfully reflect back contact conditions.
Hardware considerations on the master side are very similar. The requirements of a human wearing or interacting with a haptic interface are even more demanding and involve concerns of safety, convenience, and bulk. If reasonably performing devices cannot be obtained at reasonable costs, the spread of such systems will be limited.
Improved Telerobotic Controllers
Even without time delays, there are unresolved issues about how best to implement controllers for master-slave systems. One reason is limitations in hardware, as mentioned above. Another is a lack of tools with which to quantify and evaluate teleoperator performance. There are deficiencies in taxonomies of tasks: we do not understand well enough, at a detailed level, what we want these systems to do. At a basic level, we do not completely understand human sensorimotor abilities, for example, the discrimination of arbitrary mechanical impedances (stiffness, viscosity, and inertia) (Jones and Hunter, 1992). Clearly this knowledge is necessary to set design goals for telerobotic systems. We also need to understand how more complicated tasks decompose into basic tasks, which can then be measured and used as discriminators between telerobotic controllers. As mentioned earlier in this chapter, robotics shares this goal.
As an example, it is not fully understood when to apply rate control versus position control, or how to include force feedback into rate control.
Other examples were mentioned in the section on low-level control. When is force reflection more advantageous than the position control of a locally force-controlled robot? If force reflection is used, what is the exact form of controller that best meets goals of robustness and stability? There is still considerable work to be done in this area.
A new set of issues arises from scaling: teleoperation of very small and very large robots. The mechanical behavior of objects in the micro domain is very different than in the macro domain. One problem is that the dynamics will be very fast; somehow movements will have to be slowed down for the operator.
As robot autonomy improves, so will the level of supervisory control. A number of functions could be increasingly automated:
Path planning and collision avoidance. The main issues here are efficient routines and obtaining geometric descriptions of the environment (Latombe, 1991).
Trajectory specification. Any trajectory has to stay within system constraints of joint limits, actuators, and safety concerns.
Grasping. The robot should be relied on to obtain a stable grasp and to regrasp as necessary.
Intermittent dynamic environments. Trajectories should be modified in real time subject to changes in the environment. For example, a robot may swerve to avoid hitting someone entering its workspace. Some forms of hand-eye coordination, such as catching or hitting, may require a speed of response not possible with teleoperation.
Force control. With more sophisticated abilities to interact with the environment and to complete such tasks as the generic peg-in-hole problem, the need for force reflection will diminish.
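The first of the functions above, path planning and collision avoidance over a known geometric description, can be illustrated with the simplest possible planner: breadth-first search on an occupancy grid. The grid, start, and goal below are illustrative; practical planners work in configuration space and higher dimensions (Latombe, 1991):

```python
from collections import deque

def plan_path(grid, start, goal):
    """Breadth-first search on a 4-connected occupancy grid.
    grid[r][c] is 1 for an obstacle cell, 0 for free space.
    Returns the shortest collision-free path as a list of (row, col)
    cells, or None if the goal is unreachable."""
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:                       # reconstruct the path
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],    # a wall forces a detour through column 2
        [0, 0, 0]]
path = plan_path(grid, (0, 0), (2, 0))
```

The hard part in practice is not the search itself but, as noted above, obtaining and maintaining the geometric description of the environment that the grid stands in for.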
A step toward such autonomous control capabilities would be a higher-level transfer of skills between the operator and the telerobot. The idea is to program by kinesthetic demonstrations: the human makes a movement, this movement is measured, and the telerobot extracts symbolic information about how to accomplish the task (Funda et al., 1992; Ikeuchi, 1993). This differs from direct manual teleoperation, in that an exact trajectory is not being commanded, but rather a strategy for completing a task. Difficulties particularly present themselves in transferring force control skills.