Specific Applications of SE Systems
DESIGN, MANUFACTURING, AND MARKETING
Computer technology generally, and synthetic environment (SE) technology more specifically, are potentially important drivers for future developments in design, manufacturing, and product marketing. Trends already under way suggest a movement toward the development of manufacturing systems in which production processes are integrated with all elements of the product life-cycle from concept through sales, including quality, cost, schedule, and the determination of user requirements. This process, known as concurrent engineering, provides for parallel development across product life-cycle activities through the use of technologies such as computer-aided design/computer-aided manufacturing (CAD/CAM) and computer-integrated manufacturing (CIM). Using shared databases, customers, designers, and production managers can simultaneously evaluate a proposed product design. As a result, the design of the product, as it evolves, can incorporate the requirements of the user, special needs for marketing, and any limitations of the production process (Krishnaswamy and Elshennawy, 1992). Once developed, advanced visualization technologies such as virtual environments (VE) may provide valuable extensions to current practices.
In April 1993, manufacturing was named as one of six national initiatives to be administered by the Federal Coordinating Council for Science, Engineering, and Technology (FCCSET). The primary focus of the FCCSET mission on manufacturing is to assess special opportunities for
technology to change the way manufacturing will be carried out in the next century.
Although the ultimate goal of all manufacturing is to produce a tangible object or component, information—in the form of plans, specifications, and processes—plays a most important role. Thus, it should be expected that information technology, including VE, could have a meaningful role in manufacturing by enabling people to generate and manage such information more effectively. Consider the following points in the life-cycle of a manufactured product:
Developing design requirements. VE could be used as a medium in which a customer's mental image of a product can be fashioned into a virtual image of the product. That image could be subsequently manipulated or even used as the basis for production specifications. Examples: architectural walkthroughs of spatial designs, such as proposed buildings, rooms, and aircraft interiors.
Undertaking detailed design. VE could provide designers with the ability to reach inside the design and move elements around, to test for accessibility, and to try out planned maintenance procedures. The designer could thus have a comprehensive view of how changes made in the design or placement of one component could affect the design of other system components.
Producing the artifact. Virtual pilot lines could simulate both human and machine processes on the production line. Such a virtual pilot line could be used to predict performance and to diagnose the source of faults or failures. Plant management could be improved as engineers are given the capability of reviewing and modifying various plant layouts in virtual space.
Marketing the artifact. By providing potential customers with the ability to visualize various uses of an artifact, VE could be used for marketing an array of completed product designs to customers prior to their production.
Specific Manufacturing Applications
Building Prototypes Electronically
Building prototypes electronically provides a number of advantages, including the opportunity for sharing data across manufacturing functions and the ability to modify designs with greater ease than in a physical mock-up. A further advantage is the ability to incorporate stress and
durability test data into the design process without physically performing each test. VE promises to enhance the value of prototyping electronically by offering customers, sales staff, and engineers the ability to walk around the product and manipulate it in virtual space, much the same way as they would explore a physical mock-up in real space. A long-range goal is to create VE systems that can be extended to provide groups of individuals in different locations with the capability to work together in a shared virtual space.
Researchers at the University of North Carolina (Airey et al., 1990) have worked on the development of software for creating interactive virtual building environments. This software can be used to present architectural walkthroughs of buildings that have not yet been constructed. In touring a virtual building, an individual will be provided with changing views and lighting that are consistent with his or her position relative to the building space. Such software can be useful for design of any interior spaces, including industrial buildings, hospital operating rooms, churches, homes, and aircraft passenger compartments, to name a few.
Electronic Configuration and Management of Production Lines
Another potentially important area for the application of VE technology is in the design and testing of processing, fabrication, and assembly lines. Virtual pilot lines might be developed instead of real pilot lines to simulate human and machine tasks, to predict potential problems for human performance and safety, and to estimate the probability of failure and the line's expected operating efficiency. The promise is that virtual pilot lines will be far easier to modify in response to diagnosed problems than a physical pilot line, and they will provide the opportunity to introduce information on manufacturing efficiency early in the product design process. In addition, a virtual line could be run in parallel with an operating line for purposes of diagnosing failures, retooling for new products, or changing human-machine interface designs or procedures at points in the process at which errors or problems are occurring.
Although VE technology provides more of a promise than an existing capability for industry, several forces within various government and manufacturing enterprises will push for its development and use. From the industry perspective, VE technology has the potential to make the manufacturing process (from planning through sales) more flexible and economical. The aerospace, automobile, and textile industries are pursuing VE technology as a means for speeding development and making product modification easier. Chrysler, Ford, and General Motors have formed a VE consortium with the U.S. Army vehicle center, the automotive
division of United Technologies, the University of Michigan, and several small companies. In a recent proposal to the Advanced Research Projects Agency (ARPA), the consortium predicted that VE technology would lead to improved product design, a better market response, and reductions in time or cost (Adam, 1993).
In the following sections we provide a discussion of the potential for VE in the textile and aerospace industries. The selection of these industries was not based on an exhaustive or systematic search of industries and applications; however, both industries offer some interesting illustrations of VE technology that transfer broadly to other industries.
VE may have very important applications in the marketing and manufacture of clothing. The concept is that customers could shop for apparel in a VE in which they would see virtual clothes on virtual images of their own bodies and feel how the clothes would fit. On the basis of this experience, customers would select and order outfits that would be fabricated on demand and sent out to them within a short time period. The result would be to significantly reduce financial losses associated with fabric waste during apparel production and with product markdown and liquidation. Moreover, the customer would be provided with a greater range of choice and an improved made-to-measure fit. This approach appears to be a natural extension of the current market trends of increased shopping through catalogs and home shopping networks and the accompanying decrease in retail outlet shopping.
Industry Efforts VE technology has captured the interest of the textile industry (Steward, 1993). In 1993, a collaborative research and development program, the American Textile Partnership (AMTEX), was initiated between the Department of Energy (DOE), the DOE national laboratories, and the fibers, textiles, and apparel industry to improve the competitiveness of the U.S. textile industry through the application of technology. The national laboratories plan to work together and coordinate with industry through major industry-supported research and technology transfer facilities. Matching funds for the partnership are to be provided by government and industry. The first joint project between the national laboratories and the industry will involve the creation of an industry model for integrating hardware and software in a system to provide Demand Activated Manufacturing Architecture (DAMA). One aspect of this
effort will involve research on the uses of VE technology (Hall and Walsh, 1993).
The U.S. textile industry, which includes fiber producers, textile weavers, apparel makers, and retailers, employs over 1 million workers (10 percent of the manufacturing work force in the United States) and includes 26,000 companies. It is the largest producer of nondurable goods, experiences annual consumer sales of approximately $200 billion, and contributes $53 billion to the U.S. gross national product (Hall and Walsh, 1993; Steward, 1993). Each year the industry fails to realize revenues of approximately $25 billion due to inventory markdowns and liquidation.
Most companies are small, with profit margins of 2 percent or less, and so are not in a position to conduct or support research. Almost all the research in the industry is conducted by five large research centers based in universities and jointly funded by industry and government. One of these centers, the Apparel CIM Center, was established in 1988 with the goals of removing barriers to adopting proven CIM technology, establishing CIM standards, providing assistance to state industry, and conducting broad-based research and development to keep the industry competitive.
The primary charge of the Apparel CIM Center is to investigate applications of VE to clothing as seen, examined, and purchased by the retail customer. A second charge is to apply VE technology to represent the internal view of a textile manufacturing plant, including the position of machines, the air conditioning, the noise level, and the lighting. The goal of the project is to facilitate the reorganization of a manufacturing plant by providing engineers and factory workers with the ability to walk through a virtual plant; to move machines around on the basis of requirements to produce new lines of apparel (seasonal changes); to examine spacing, lighting, and noise to ensure good human factors practices; and to assess the effects of various equipment configurations on work flow.
Technology Requirements The technology required for implementing the internal plant layout includes: (1) building an object database of all equipment needed in the plant, (2) creating the capability to determine light and ventilation, (3) providing noise levels based on the combination and spacing of machines, (4) matching lighting requirements and noise levels against federal requirements, (5) integrating new software capabilities with existing simulations of work flow through the plant for manufacturing different products, and (6) developing an interface for engineers that is easy to use and acceptable. According to Steward (1993), all of these activities are under way. There are many technologies, including VE, contributing to this application—some existing and some in development. As these activities evolve, there will be a need for VE technology to rely on and interface with other developing information technologies.
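Item (3) in the list above, estimating noise levels from a combination of machines, rests on a standard acoustics fact: sound pressure levels in decibels combine on a logarithmic energy scale, not by simple addition. A minimal sketch (the machine levels used here are illustrative values, not figures from the source):

```python
import math

def combined_noise_level(levels_db):
    """Combine sound pressure levels (dB) from independent sources.

    Decibel levels add on a linear intensity scale, so each level is
    converted to relative intensity, summed, and converted back to dB.
    """
    total_intensity = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_intensity)

# Three machines at 85 dB each: the combined level rises by
# 10*log10(3), about 4.8 dB, rather than adding to 255 dB.
machines = [85.0, 85.0, 85.0]
print(round(combined_noise_level(machines), 1))
```

A plant-layout simulator would evaluate this sum for each worker position using distance-attenuated levels, then compare the results against regulatory exposure limits.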
Developing the technology required to fully implement a VE system for marketing clothing is a long-term effort. One area for development is body measurement technology. Currently the concept is to have the customer don a body stocking and be electronically scanned. The linear and volumetric dimensions from the scan would be stored on a card that the customer would use when entering the virtual shopping space. When a customer's dimensions change, he or she could be scanned again. Cyberware has built and demonstrated effective full-body scanners. However, this technology is produced on an individual basis and is expensive to acquire.
A second area of development is the technology for accurately representing material draping. A critical factor in deciding to purchase a garment is appearance: how the jacket hangs, how the folds appear, how the fabric moves when the individual wearing it moves, etc. Thus, the draping of virtual clothes on a virtual customer must appear real.
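One common computational approach to draping, offered here as a plausible basis for such a system rather than a method described in the source, is a mass-spring model: the fabric is treated as point masses joined by fixed-length links and relaxed under gravity. The sketch below simulates a one-dimensional strip pinned at one end; all constants are illustrative:

```python
# A pinned strip of "cloth" as a chain of point masses with fixed-length
# links, relaxed under gravity by position-based constraint projection.

REST_LENGTH = 1.0   # distance between adjacent masses (illustrative)
GRAVITY = -0.1      # downward displacement applied per step (illustrative)
N_POINTS = 5
N_STEPS = 200

# Start the strip horizontal: points at (0,0), (1,0), ..., (4,0).
points = [[float(i), 0.0] for i in range(N_POINTS)]

for _ in range(N_STEPS):
    # Pull every free point downward (point 0 is pinned).
    for p in points[1:]:
        p[1] += GRAVITY
    # Re-enforce the fixed link lengths several times per step.
    for _ in range(10):
        for i in range(N_POINTS - 1):
            a, b = points[i], points[i + 1]
            dx, dy = b[0] - a[0], b[1] - a[1]
            dist = (dx * dx + dy * dy) ** 0.5
            correction = (dist - REST_LENGTH) / dist
            if i == 0:
                # Point 0 is pinned: move only its neighbor.
                b[0] -= dx * correction
                b[1] -= dy * correction
            else:
                # Split the correction between the two points.
                a[0] += 0.5 * dx * correction
                a[1] += 0.5 * dy * correction
                b[0] -= 0.5 * dx * correction
                b[1] -= 0.5 * dy * correction

# After relaxation the free end hangs nearly straight down from the pin.
print(points[-1])
```

A production draping system would use a two-dimensional mesh, bending and shear springs tuned to measured fabric properties, and collision handling against the scanned body model, but the relaxation structure is the same.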
Other areas requiring technology development include providing accurate colors in the virtual world, giving customers the opportunity to "feel" fabrics, and providing customers with a sense of how the garment "fits." All of these factors are important to customers in selecting clothing. Colors must be accurate so that different parts of an outfit can be matched; feel and fit are critical to comfort and style. Of all the research and technology development issues identified above, the most complex and long range will be developing the tactile feedback needed to create a sense of fit.
The aerospace industry is expected to be a major user of VE technology in the future. Companies such as Boeing and Rockwell International have long-range plans to develop VE systems that will provide all interested parties with the ability to view and interact with three-dimensional images of prototype parts or assemblies of prototype parts. Currently, both companies are using CAD tools to create electronic prototypes of parts in lieu of physical mock-ups.
Industry Efforts Staff at Rockwell International, through its Virtual Reality Laboratory (Tinker, 1993), are working on virtual prototypes and mock-ups; virtual world human factors assessment for proposed task environments; and training for manual factory workers, maintenance personnel, and equipment operators. These efforts are in the early stages of implementation. Proprietary software has been developed to read CAD data into a virtual reality database. The long-range goal is to provide the ability for multiple participants to work together in a shared virtual space interacting with high-resolution CAD data in real time.
At Boeing, the design of the 777 aircraft is being accomplished without a physical mock-up; all of the 6.5 million parts are being prototyped electronically using CAD tools. As a result, designers, engineers, and possibly customers see only models of parts or assemblies of parts on a computer screen. Although this approach provides for shared databases among design, manufacturing, and sales components and adds significant flexibility to the design process, it takes away the ability to walk around, explore, and manipulate parts. As a result, Boeing is working toward the development of a VE system that would give these capabilities to designers, engineers, customers, and marketing personnel.
According to Mizell (1993), the plans for Boeing's VE project include: (1) giving designers the ability to reach in and move parts assemblies around, (2) conducting human factors tests in virtual space, using human models, to determine whether maintenance can be accomplished and control operations can be easily performed, and (3) providing customers with a variety of customized aircraft interiors to walk through and make modifications in real time. All of these applications feed directly back into the design process. Cabin layout modifications made by customers influence the placement of wiring, ventilation, windows, seats, etc. Mizell believes that Boeing would consider the work in virtual reality a success if its only use was to provide customers with the ability to walk through and experience various configurations of aircraft interiors.
Implementation of the project is in its initial stages. But Boeing already has software that reads CAD data into a VE preprocessed database. A limiting factor at this point is the computing and graphics power needed to represent the CAD database in a three-dimensional virtual space so that real-time interaction and a feeling of presence can be facilitated. It is anticipated that VE technology will begin to contribute to Boeing's productivity in the next two years, but development will probably need to be continued over a 15-year time period (see the more detailed discussion of implementation issues in the section on technology requirements).
A second major project area at Boeing is the application of augmented reality to various parts of the manufacturing process. This project seeks to eliminate the need for complex assembly instructions or manually manipulated templates by creating a system in which computer-produced diagrams are superimposed onto work pieces. The technology to accomplish this goal involves a head-mounted see-through display and a head-position-sensing and real-world registration system. The augmented-reality project is being developed to assist factory workers in performing many complex, manual, skill-based tasks that rely heavily on human perception and decision making and therefore are not easily automated. Currently, guidelines are presented to workers in the form of overlays, templates, or written instructions for each step in the process. When parts
or processes are modified by designers in the CAD system, a substantial amount of time may be required to reflect the appropriate changes in the manufacturing documentation. The augmented-reality system would link manufacturing instructions with the CAD system and superimpose these instructions in the form of diagrams on work pieces. The diagrams would appear to be painted on. For each step, a new diagram is projected. The link with the CAD system would make it possible to show changes in design or procedures to the worker immediately. A more detailed description of this project is provided by Caudell and Mizell (1992).
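The core computation in such a registration system is transforming a point defined in work-piece or CAD coordinates through the tracked head pose into display coordinates. The sketch below is a drastically simplified version: it assumes head rotation only about the vertical axis and a pinhole display model, and the focal length and display dimensions are invented for illustration:

```python
import math

def project_to_display(point_world, head_pos, head_yaw, focal_px, center_px):
    """Project a world-space point into see-through display coordinates.

    Assumes the only head rotation is yaw about the vertical axis; a real
    registration system tracks the full six-degree-of-freedom head pose.
    """
    # Express the point in the head (camera) frame.
    dx = point_world[0] - head_pos[0]
    dy = point_world[1] - head_pos[1]
    dz = point_world[2] - head_pos[2]
    cos_y, sin_y = math.cos(head_yaw), math.sin(head_yaw)
    cx = cos_y * dx - sin_y * dz   # rightward offset in the head frame
    cz = sin_y * dx + cos_y * dz   # forward distance in the head frame
    cy = dy
    if cz <= 0:
        return None  # point is behind the viewer
    # Pinhole projection onto the display.
    u = center_px[0] + focal_px * cx / cz
    v = center_px[1] - focal_px * cy / cz
    return (u, v)

# A mark 2 m straight ahead of an untranslated, unrotated head lands at
# the display center.
print(project_to_display((0.0, 0.0, 2.0), (0.0, 0.0, 0.0), 0.0, 800.0, (320, 240)))
```

Registration quality, and therefore the worker's trust in the overlay, depends almost entirely on how accurately and quickly the head pose feeding this transform can be measured.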
Technology Requirements In Boeing's augmented-reality project, a prototype system has been developed and tested. The primary technology needs are for a comfortable head-mounted color display with a field of view wider than 30 degrees. Another goal is a position-tracking system that will leave the worker untethered.
In order to implement Boeing's vision for using VE, several areas require technology development. One critical problem is the lack of graphics and computing power. The CAD database for the 777 aircraft contains between 5 and 10 billion polygons. Even though only a fraction of the database may be needed at any one time, the existing graphics hardware limits the ability to create a scene that is interactive in real time, particularly because of the complexity of the geometry in the CAD database. The problems created by the size of the database, the inadequate hardware, and the requirement for a VE that looks real and behaves in predictable ways underscore the need for research on real-time scheduling, assigning reduced workload areas, and developing heuristics to accomplish graceful degradation.
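One standard response to the polygon-budget problem described above is level-of-detail management: each part is stored at several resolutions, and each frame the renderer picks the finest combination of representations that fits the graphics budget, refining nearby geometry first so that distant geometry degrades gracefully instead of the frame rate collapsing. A minimal sketch (the part names, polygon counts, and budget are invented for illustration):

```python
def choose_detail_levels(parts, polygon_budget):
    """Pick a detail level for each part within a total polygon budget.

    `parts` maps a part name to (polygon_counts, viewer_distance), where
    polygon_counts lists the rendering cost of each level, coarsest
    first.  Nearer parts are refined before distant ones.
    """
    choice = {name: 0 for name in parts}          # start at coarsest level
    total = sum(counts[0] for counts, _ in parts.values())
    # Visit parts nearest-first and refine while the budget allows.
    for name, (counts, _) in sorted(parts.items(), key=lambda kv: kv[1][1]):
        while choice[name] + 1 < len(counts):
            extra = counts[choice[name] + 1] - counts[choice[name]]
            if total + extra > polygon_budget:
                break
            choice[name] += 1
            total += extra
    return choice, total

# Hypothetical cabin scene with a 10,000-polygon frame budget.
parts = {
    "seat":   ([500, 2_000, 8_000], 1.0),    # nearest: refined first
    "galley": ([800, 3_000, 12_000], 5.0),
    "door":   ([400, 1_500, 6_000], 9.0),
}
choice, total = choose_detail_levels(parts, 10_000)
print(choice, total)
```

Real systems also factor in screen-space size and motion, and re-run the selection every frame so detail shifts smoothly as the viewer moves.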
Another goal requiring technology development is providing engineers with the ability to interact with objects in virtual space. Currently, Boeing is working with a mannequin developed by Norman Badler at the University of Pennsylvania (Badler et al., 1993) that can be put inside the CAD geometry and changed in size or shape. Similar technology has been used by the automobile industry for several years. The next major step is to develop the capability for an engineer to inhabit the mannequin in a virtual space, to move around inside the CAD geometry, perform maintenance checks, and, in general, feel present inside the scene while others monitor from a third-person perspective. Particularly important development areas include the need for collision detection and the requirement to give the individual in the virtual space some sense of force feedback, especially when testing the difficulty of performing various maintenance operations. Developments in this area, particularly those involving haptic feedback, are at least 10 to 15 years in the future.
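Collision detection of the kind noted above is typically implemented by testing cheap bounding volumes before any exact geometry test. A minimal sketch of the axis-aligned bounding-box (AABB) overlap test such a system would run between the mannequin's limbs and nearby parts (the part names and coordinates are invented for illustration):

```python
def aabb_overlap(box_a, box_b):
    """Test whether two axis-aligned bounding boxes overlap.

    Each box is ((min_x, min_y, min_z), (max_x, max_y, max_z)).  Two
    boxes intersect only if their extents overlap on every axis, so a
    single separated axis is enough to reject a pair cheaply before any
    expensive polygon-level test.
    """
    (a_min, a_max), (b_min, b_max) = box_a, box_b
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))

# The mannequin's forearm reaching toward a hydraulic line (invented boxes).
forearm = ((0.0, 1.0, 0.0), (0.4, 1.3, 0.6))
hydraulic_line = ((0.3, 1.2, 0.5), (0.5, 1.4, 2.0))
bulkhead = ((2.0, 0.0, 0.0), (2.1, 3.0, 3.0))

print(aabb_overlap(forearm, hydraulic_line))  # True: boxes intersect
print(aabb_overlap(forearm, bulkhead))        # False: separated on x
```

In a maintenance-check scenario, a box pair that overlaps would trigger a finer test against the actual CAD surfaces and, ideally, a force-feedback response to the engineer's hand.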
Creating architectural walkthroughs of customized aircraft interiors
is another important area for development. These models would provide customers with the opportunity to see and experience the aircraft they are purchasing before it is actually built.
MEDICINE AND HEALTH CARE
The knowledge base of medicine has exploded in the past 30 years, and it continues to expand at a staggering rate. As a result, medical practitioners have difficulty in keeping pace with changes in practice, and medical students and residents have difficulty in assimilating the information presented in their medical educations.
As in other information-intensive disciplines, computer and communications technologies have important roles to play in reducing the cognitive demands on medical practitioners and students by helping to manage, filter, and process multiple sources of information. The following kinds of medical knowledge and skill are well suited to management and handling by VE, augmented reality, and teleoperator systems:
Anatomical relations of various organs and systems. Knowing that a particular organ is located underneath another organ is an essential part of anatomical knowledge. The ability to "walk through" the body and to see anatomy in its natural state with all of the interrelations of various organs and systems would greatly facilitate the acquisition of certain important pieces of anatomical knowledge.
Development of manipulative skills involving precise motor control and hand-eye coordination. Surgical trainers can be particularly useful in acquiring needed skills.
Image interpretation. Although various imaging devices are common in medicine, their effective use depends on the skill of the viewer to identify often small differences between normal and abnormal images.
Telemedicine through teleoperation. Medical expertise is often unavailable in remote areas. Telemedicine—whether through consultation or through remote manipulators that enable teleoperation—offers some potential to place medical expertise in locations that might not otherwise have access to such expertise. Teleoperation also enables one to effectively transform the sensorimotor system of the physician (diagnostician or surgeon) to better match the task.
Specific Medical Applications
The following discussion focuses on six applications of VE technology to medicine: medical education, accreditation, surgical planning,
telepresence, telesurgery, and rehabilitation. Each subsection addresses a long-range vision of how VE might assist in these applications and a description of possible near-term demonstrations.
Preservice and Continuing Medical Education
Medical education has changed little in the last 30 years, despite enormous advances in knowledge. Most medical schools emphasize learning facts by rote. Information is provided in a lecture format, and students study outlines for endless hours in the library. Little effort is expended to place the information into a context or framework that might help to structure and organize seemingly disparate facts. As a result, students must use their own, perhaps incomplete experience to begin assimilating the data and creating a logical, integrated framework of anatomy, physiology, biochemistry, genetics, and the myriad springs of subspecialized knowledge from contemporary medical research.
The teaching of anatomy is illustrative, and the application of VE and augmented reality to such teaching has great potential. The static, transparent, two-dimensional overlays typical of anatomy textbooks could someday be replaced by a virtual human. Indeed, today the National Institutes of Health is funding the Visible Human, a project to develop a complete static digital representation of an adult human. Once the data are collected, a student would be able to operate a VE system for anatomy that would illustrate the spatial interrelationships of all body organs relative to each other, selectively enabling or suppressing the display of particular body subsystems (e.g., displaying only the digestive system, viewing the complete image without the circulatory system).
A much more sophisticated version of the Visible Human would be a dynamic model that could illustrate how various organs and systems move during normal or diseased states, or how they respond to various externally applied forces (e.g., the touch of a scalpel). Thus, a student could view the heart pumping blood in normal and diseased states, or observe how the stomach wall moves while it is being cut.
Today, several virtual worlds have been developed to demonstrate basic anatomy and as rudimentary models of training simulators. One is a model of the optic nerve created by VPL (VPL, Inc., 1991). This model illustrates, in three dimensions, the path of the optic nerve from the retina to the optic cortex. By pointing a finger one can fly along this path, looking to either side at adjacent structures. In this way, less effort is
expended in constructing a three-dimensional image in the individual's mind and more effort is channeled into learning the anatomical relationships.
A second model is a rudimentary simulator for the abdomen created by Satava (1993b). With this simulator, one can travel from the esophagus throughout the intestine, taking side trips through the biliary system and the pancreas. It is a unique instructional tool that describes anatomy from the inside of the intestines rather than from the outside. It is of considerable benefit in training individuals to perform colonoscopy and esophagogastroduodenoscopy, as well as teaching students the true anatomic relationships of intraabdominal structures. Basically, one is able to fly around various organs and experience their actual relationships—the model provides the learner with the ability to interdigitate between organs and behind them without destroying their relationships to one another in the process.
Another educational tool is an augmented-reality system that allows the user to see virtual information superimposed over real structures. See-through displays provide the user with a view of the surrounding environment, along with an image displayed on goggles. Investigators in Boston and at the University of North Carolina (UNC) have created see-through displays using computer-assisted tomography (CT) scan, magnetic resonance imaging (MRI), or ultrasound technology as imaging techniques (Bajura et al., 1992).
The work on augmented reality at UNC (Bajura et al., 1992) is based on the images from an ultrasound that delineates abdominal structures in three dimensions. Specifically, the investigators created a graphic of a three-dimensional model and projected it through the head-mounted display (HMD), as an overlay onto the user's view of the abdomen. This program, used on a pregnant woman, allows the operator to "open a window" into the abdomen and view the fetus in a three-dimensional manner without incising the skin. Although the application of such programs to view a developing fetus is limited, the technology raises the possibility of visualizing other intraabdominal structures.
See-through models can be used to teach surgeons where an organ is located and show its relation to surrounding tissues. Novice surgeons often have difficulty visualizing the location of the gallbladder and the cystic duct in relation to the common bile duct. Despite extensive anatomy instruction, the first few operations are difficult because structures in a living body appear different from the illustrations in an anatomical atlas. A see-through display gives surgeons in training an opportunity to develop their own internal three-dimensional map of living organs, rather than having to operate without one.
A variety of surgical training applications are plausible as well. Surgeons know there is no substitute for hands-on practice and training. Consider laparoscopic procedures, which involve surgery performed through very small incisions in the body. The advantage of such procedures is that patient recovery time is greatly reduced over conventional surgery, because of the smaller trauma to the body. However, manipulation of tools through the incision, hand-eye coordination, and understanding the spatial relationships of the tools relative to the organs within the body place high cognitive demands on the surgeon.
Today, a surgeon wishing to learn laparoscopic procedures may attend a one- or two-day course designed to teach necessary skills. In the course, the surgeons-in-training begin with laparoscopic cholecystectomy trainers to familiarize themselves with the technique. These trainers are quite rudimentary, consisting of a black box in which endoscopic instruments are passed through rubber gaskets. Trainees use these instruments to practice such tasks as tying knots, grasping structures, and encircling plastic arteries. The current velour-and-plastic organ models are stiffer and harder to manipulate than normal tissue; arteries are not easily transected or avulsed and do not bleed; and damaged organs do not ooze. As a result, the experience is far from realistic. Following work with the trainer, the surgeons-in-training begin practicing these techniques on pigs—the approximation to humans is better, but the anatomy is dissimilar, and the amount of experience is limited by both the cost and the availability of the pigs.
An appropriately constructed VE simulator could obviate some of these problems. For example, the abdominal simulator developed by Satava (1993b) includes several laparoscopic tools, making it possible for the surgeon to practice some (primitive) endoscopic surgical techniques. The current drawbacks of this system are the low-resolution graphics, the lack of realistic deformation of organs with manipulation, and the lack of tactile input and force feedback. However, the model sets the framework for further investigation into surgical simulators. Such a simulator could be used for initial training, as well as for the additional follow-up training, which has been shown to significantly reduce the incidence of post-surgical complications in patients operated on by surgeons with such training (William et al., 1993).
A second application to surgical training is the use of the see-through augmented-reality model to support novice surgeons in performing their first few appendectomies, cholecystectomies, or arthroscopies. After gaining
confidence in their knowledge of anatomy, these surgeons could proceed without the aid of the display (Haber, 1986).1
Close on the horizon are hybrid programs that will combine technology from current surgical simulators with VE technology (Stix, 1992). The pressure and tactile feedback provided to the surgeon could be improved by using the actual tools of endoscopic procedures, instrumented to act as interface devices, to train with a simulator. If the VE program did not have to generate the instruments, computer power could be reserved for producing more intricate displays.
VE simulators might also serve as mentors to teach residents, interns, and medical students the basics of surgical practice, such as suturing, ligating blood vessels, and the basic tenets of dissecting. Operating procedures might be stored in a VE library, ready to demonstrate a Billroth I, a choledochojejunostomy, the reconstruction of an uncommon craniofacial anomaly, pathologic rarities, and uncommonly encountered trauma scenarios.
Building on the groundwork of what has been created to date, Robert Mann's vision of "the ultimate simulator" will be achieved: "a computer environment in which a surgeon will not only see but 'touch' and 'feel' as if actually performing surgery" (Mann, 1985).
The purpose of medical accreditation is to ensure that physicians have a minimum level of skill and knowledge adequate to serve the public. The knowledge and skills acquired by physicians-in-training depend very much on the idiosyncrasies of the patient population and the epidemiology of the diseases to which they are exposed. Thus, not only do residents of different programs acquire disparate bases of knowledge and different levels of skill; even residents in the same program may acquire very different funds of knowledge. For recently developed surgical techniques, training programs may differ even more strongly, lacking standardized training methods and standardized accreditation procedures (see Bailey et al., 1991). How are physicians at various points on a spectrum of experience, without any unified teaching and without exposure to a uniform patient population, to be judged against a national standard?
Today the answer is board certification. Knowledge can be tested by means of paper-and-pencil examinations, but proficiency in operative skills cannot be demonstrated in this way. As a result, individual residency programs are left to judge whether a physician is competent in the operating room.
VE simulators offer some hope for standardizing the surgical accreditation process. For example, a more sophisticated version of the abdominal simulator developed by Satava (1993b) could offer a set of standardized tests for laparoscopic procedures. The ability to track the instruments would enable the National Board of Medical Examiners (or a similar governing body) to monitor performance during a given procedure and to document the types and frequency of errors made, thus providing some uniform means of assessing surgical skills across programs.
An actual medical operation is never performed in the abstract—it is performed on a specific individual whose precise physical dimensions are unique and whose anatomy almost certainly deviates from that found in anatomy textbooks. Thus, to a certain extent, a surgeon confronts surprises every time he or she undertakes an operation.
VE-based surgical planning aids offer a way to reduce the uncertainty. In principle, imaging data for the patient could be used to update a generic digital human model, allowing the surgeon to understand more fully the specifics of the individual. The surgeon could then explore freely various approaches to solving a surgical problem on the VE simulator and could practice the operation if required.
Such an application is a long-term prospect. However, Jolesz and his colleagues (Gleason, 1993) have developed an augmented-reality system for video registration of brain tumors to aid in the planning and performance of surgical resection. In this system, the mass is imaged using either CT or MRI, a three-dimensional construct is created, and the image is projected over the patient's head to plan the optimal site of skin incision and bone flap to expose the tumor. This program is then taken into the operating room to provide a reference map for the surgeons during the resection. Thus, the surgeon is able to consult the image at any time to assess the remaining tissue and the extent of further excision.
A model of the lower extremity envisioned by Mann (1985) and designed by Delp (Delp et al., 1990) is an example of using virtual reality to test various procedures. This model allows the surgeon to "perform" the planned surgery, and then simulate a number of years of walking or other normal activity. The altered model can be reanalyzed at the end of the simulated activity period and the outcome of the procedure evaluated.
This allows surgeons to refine decision-making skills before ever operating on a real patient.
The augmented-reality system designed by Bajura et al. (1992), described above, could also be used for preplanning complex abdominal procedures, such as difficult tumor resections. Liver masses, pancreatic pseudocysts, and edematous gallbladders could be viewed in three dimensions, making it easier to estimate size and interrelations with other abdominal organs and to plan and perform invasive procedures.
Telemedicine is the technology that would allow physicians to interact directly from locations thousands of miles apart. They would be in the same virtual "room" when discussing a case or when performing a procedure and could refer, simultaneously, to paramedical data, reducing the likelihood that a miscommunication would occur, that information would be missed, or that results would be misdirected or lost.
The Medical College of Georgia in Augusta has developed and introduced the first stages of a statewide telemedicine system. The system uses interactive voice, video telecommunication, and biomedical telemetry to link rural health care facilities and primary care physicians with large medical centers. As a result, primary care physicians and their patients can consult with specialists without leaving their communities. In these consultations, the participants would see each other and share common diagnostic data and images. The idea also extends to providing remote assistance to surgeons in rural hospitals during surgery. According to Sanders and Tedesco (1993), the system will eventually have five hubs, each serving several clinics and rural facilities. Currently, a test is being conducted with one hub—the Medical College of Georgia—serving five sites. The ultimate goal is to deliver the quality health care available at major medical centers to all underserved areas in the state. Early results suggest that the proposed system, when fully developed, will not only make high-quality care more accessible but will also reduce costs. Telemedicine networks are also under development in Iowa, West Virginia, and Colorado.
Developments are also under way in telesurgery. One example is the Green Telepresence system, created at SRI International (Palo Alto, Calif.) by Phillip Green. The system, based on technology used by the National Aeronautics and Space Administration (NASA) for remote manipulation, consists of a separate operative worksite and surgical workstation (Rosen
et al., 1994). The operative site has a remote manipulator with surgical instruments, a stereoscopic camera, and a stereophonic microphone to transmit the environment to the surgical workstation. The workstation has a three-dimensional display on polaroid glasses and an interface for the surgical tools. It also has the capability of accepting digital information from CT, MRI, and vital signs monitors.
Hunter and his colleagues at McGill (1993) have developed a prototype teleoperated microsurgical robot and an associated virtual environment that are intended to allow a surgeon to perform remote microsurgery on the eye. The helmet worn by the surgeon controls a remote camera in the operating room. Images from the camera are relayed back to the helmet for the surgeon to view. Tools shaped like a microsurgical scalpel and attached to a force-reflecting interface are provided to the surgeon. As the surgeon moves the tools, he or she causes the microsurgical tool held by the microrobot to move proportionally. The forces exerted by the microrobot are reflected back to the surgeon. These forces are amplified, thus providing the surgeon with an experience of cutting that would, under normal conditions, be imperceptible. The fact that the master and slave computers communicate by optical fiber connection will make it possible for the surgeon and the microsurgical robot to be located at different sites once the system is implemented.
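The motion-scaling and force-amplification scheme just described can be sketched in a few lines. The scale factors below are invented for illustration and are not the McGill system's actual values.

```python
# Hypothetical gains; the real system's scaling factors are not specified here.
MOTION_SCALE = 0.01   # 1 cm of the surgeon's hand motion -> 100 um of tool motion
FORCE_GAIN = 50.0     # amplify tiny tool-tissue forces back to the surgeon

def slave_command(master_displacement_m):
    """Scale the surgeon's hand motion down for the microrobot."""
    return master_displacement_m * MOTION_SCALE

def reflected_force(slave_force_n):
    """Amplify forces sensed at the microsurgical tool for the hand controller."""
    return slave_force_n * FORCE_GAIN
```

With these assumed gains, a 2 cm hand motion commands 0.2 mm of tool travel, and a 0.02 N cutting force, normally imperceptible, is reflected back as 1 N at the master.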
VPL has developed a system, RB2, in which two operators can interact in virtual space (VPL Inc., 1989). Using this program, two surgeons could operate on a virtual patient, one as the initiate, the other as a mentor. Ultimately, a surgeon in New York could help a surgeon in London to do an operation without setting foot on a plane. A data highway rather than a commercial airline would make possible the transmission of complex knowledge (Satava, 1993a).
There is an enormous amount of interest in the promise of telesurgery; however, all current work is at the research and development stage.
There are several ongoing programs in which the use of VE technology is being tested as a means to assist in the rehabilitation of physically and mentally challenged persons. Some of this work is focused on building better human-computer interfaces that take into account the specific physical limitations of the disabled individual. These studies are examining alternative methods, such as eye movements or flexion of facial muscles, for sending bioelectric signals to a computer, which will in turn enable the individual to perform a desired task or see a desired display. At Loma Linda University Medical Center, researchers are developing such interfaces using what they call a "biocybernetic controller" (Warner
et al., 1994). Their next series of studies will involve the use of bioelectric signals based on muscle activity to provide disabled children with the capability to play Nintendo-type computer games now being played by the general population. Other centers working on the same problem, such as the Children's Hospital in Boston, have proposed studies to examine the motor output capacity of disabled individuals and to match that capacity with control interfaces making use of multimodal outputs.
Loma Linda researchers are also engaged in having disabled persons work with virtual objects generated by a computer as a means to begin rehabilitating their motor skills. For example, an individual could practice manipulating a virtual object at a weight he or she could handle. It is believed that this practice is useful even if the virtual object weighs less than the real one. In related work, Greenleaf Medical Systems (Greenleaf, 1994) is working toward using VEs to enable individuals to perform tasks that they could not perform in the real world. An example provided by researchers at Greenleaf is creating a VE in which a person with cerebral palsy could operate a switchboard.
Weghorst et al. (1994) are experimenting with using virtual objects to treat walking disorders associated with Parkinson's disease. According to the authors, objects placed at the feet of these patients may serve to stimulate a walking response. Since using real objects is not a particularly practical approach, virtual objects are being tried with some success. Specifically (p. 243):
Near-normal walking can be elicited, even in severely akinetic patients, by presenting collimated virtual images of objects and abstract visual cues moving vertically through the visual field at speeds that emulate normal walking. The combination of image collimation and animation speed reinforces the illusion of space-stabilized visual cues at the patient's feet.
In another example, Greenleaf Medical Systems is in the process of adapting the VPL DataGlove and DataSuit for use in measuring the functional motion of a disabled person and recording progress over time. Moreover, researchers are working on developing a gesture control system designed to enable individuals wearing the DataGlove to perform complex control activities with simple gestures and on creating a system that will recognize personalized gestures as speech signals. In the latter example, the DataGlove would receive the hand gestures and translate them into signals sent to a speech synthesizer, which then "speaks" for the individual wearing the DataGlove. All of these products are in the early development stage.
In yet another example, VE technology is being used by architects to design living and working spaces for the disabled. The technology now
enables an individual in a wheelchair wearing a head-mounted display and a DataGlove to travel through a space testing for access, maneuverability, counter heights, and reach distances for doors and cabinets.
Finally, the use of VE is being explored for behavioral therapy. Hodges et al. (1993) report on a project at the Graphics, Visualization, and Usability Center at the Georgia Institute of Technology that makes use of VE technology to provide acrophobic patients with fear-producing experiences of heights in a safe situation. Advantages of this approach are that VE provides the therapist with control over the height stimulus parameters and with the ability to isolate those parameters that generate a phobic response in the individual. In another project, Kijima et al. (1994) have been exploring the use of VE technology to create a virtual sand box to be used for virtual sand play. Sand play, a technique used in diagnosing an individual's mental state and providing psychotherapy, involves patients creating landscapes and populating them with bridges, buildings, people, animals, and vegetation. An important advantage of using VE for sand play is that the patient's actions are recorded and can be viewed several times by several trained observers.
Issues to be Addressed
Although the long-term potential of VR for medical applications is suggested by an extrapolation from current demonstrations, a number of research problems must be addressed to fulfill this potential.
Simulations of higher fidelity. The simulated organs and body structures seen by the simulator user are far from realistic, with graphics that are primitive and cartoon-like. Tactile and force feedback, an important consideration in simulating the actual "feel" of a surgical procedure, is mostly absent. Changes in visual perspective are not seen by the user in real time.
Realistic models of organs and other body structures. In current simulations, organs and body structures do not morph as real tissue does; for example, they do not deform with gravity or change shape with manipulation. Blood vessels should bleed; bile ducts should ooze; hearts should pump.
Better image registration techniques for augmented reality. In many cases, a VR surgical or diagnostic aid will require the superposition of images acquired from a number of different modalities (e.g., a CAT scan coupled with an ultrasound image). See-through displays must superpose artificially generated images on the real image through the user's eyes. Devising techniques for aligning these images so that the right parts correspond to each other remains a considerable intellectual problem.
Appropriate data representation schemes. How can patient-specific data best be combined with generic models of humans in a computationally efficient manner?
Reduction in wide-area network delays. Delays must be significantly reduced to provide for coordinated long distance work.
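As one illustration of the registration problem listed above: when corresponding landmark points can be identified in two modalities, a rigid alignment can be computed in closed form (the classical Kabsch/Procrustes construction). The landmarks below are synthetic; real registration must also contend with distortion, noise, and missing correspondences.

```python
import numpy as np

def rigid_register(src, dst):
    """Find rotation R and translation t minimizing ||R @ src_i + t - dst_i||."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(U @ Vt))               # guard against reflections
    D = np.diag([1.0] * (src.shape[1] - 1) + [d])
    R = (U @ D @ Vt).T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Synthetic landmarks in one modality, and the same landmarks after a known
# rotation and translation (standing in for the second modality).
rng = np.random.default_rng(0)
src = rng.random((6, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
dst = src @ R_true.T + np.array([0.1, -0.2, 0.05])
R, t = rigid_register(src, dst)
assert np.allclose(src @ R.T + t, dst, atol=1e-8)    # alignment recovered
```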
There are several social issues that will also need to be addressed if VE technology is to achieve wide acceptance in medicine:
Acceptability to care providers. Physicians generally practice from a perspective of conservatism, refraining from the use of unproven techniques, a category that for now includes the surgical training and performance opportunities offered by VE technology. Accepted practice, to be encoded in the future as practice guidelines, is widely regarded as a way to ensure that the well-being of patients is not placed at undue risk.
Public opinion. For understandable reasons, the public may feel uncomfortable with the perception that a robot is undertaking surgical operations. Negative public reactions to the recently tested robot used in hip replacement surgery are a case in point. A cultural shift that acknowledges that automation can help care providers do their jobs more effectively will be necessary. Moreover, the necessary shift is bidirectional: patients will have to change the way they regard their care providers, and care providers will have to change the way they present themselves to patients.
To gain physician and public acceptance, convincing demonstrations will be necessary. These systems will have to demonstrate that their use will result in better outcomes, fewer complications, and ultimately even less invasive, less debilitating procedures.
In the current health care environment, a third social consideration must be addressed: cost. Explicit cost-benefit analyses for various technologies are likely to become increasingly common, and VE will be no exception.
TELEOPERATION FOR HAZARDOUS OPERATIONS
Any activity that involves performing tasks in environments that are unsafe for humans, or in which safety is too costly, is a potential application of a teleoperated system. Chapter 9 provides a review of the technology underlying teleoperation; in this section we discuss ways in which this technology has been applied in hazardous environments.
Survey of Major Applications
There are many examples of teleoperation applied to hazardous environments. We provide a short survey of some major applications. In the first two, toxic environments and space operations, detailed scenarios are presented to illustrate critical tasks. These scenarios contain examples that can be considered prototypical of all hazardous environments.
To avoid human exposure to radioactivity or toxic chemicals, teleoperators have been used in handling radioactive and chemically toxic materials, maintaining nuclear power plants, and disposing of hazardous wastes. Handling radioactive materials was the first application for which teleoperators were developed, in the 1940s, by Raymond Goertz at Argonne National Laboratory near Chicago. Glove boxes are used when more dexterity is required than current teleoperators afford, but advances in telerobotic dexterity should eventually make them unnecessary.
Maintenance and cleanup of nuclear power plants typically requires mobile telerobots as well as large manipulators. A variety of wheeled or tracked mobile robots have been designed that carry sensors and manipulators through power plants (Fogle, 1992; Trivedi and Chen, 1993). Large manipulators are required to reach inside reactor vessels for maintenance (Munakata et al., 1993; Rolfe, 1992). More generally, telerobots are required to identify and handle toxic materials at accident sites (Stone and Edmonds, 1992).
The disposal of hazardous wastes is a major public concern. For example, the U.S. Department of Energy has an extensive program to retrieve low- and medium-level nuclear and chemically toxic wastes from nuclear weapon fabrication at desert disposal sites and to place these wastes into improved containers and sites (Harrigan, 1993). At these desert disposal sites, most of the waste is stored in large concrete tanks or buried in 100-gallon drums whose locations are known only approximately. Removing these wastes (and any soil that may have been contaminated through leaks) without causing further environmental damage and without endangering human life is a challenging task.
Another challenging task is disposing of the high-level nuclear waste from commercial nuclear reactors and government facilities; high-level waste is currently stored under water in storage pools. As these facilities fill up, current plans call for placing the wastes in deep underground storage facilities. Moving the material from the storage pools, to a processing facility where it can be more easily handled (e.g., a glass vitrification plant), and on to the final storage facility will require the assistance of remote handling operations.
In order to discuss the application of hazardous waste disposal in more detail, we take as a scenario the removal of hazardous waste from an underground storage tank, such as at Hanford, Washington (Harrigan, 1993); this task has been demonstrated in part in laboratory settings. Sensors and manipulators are needed that are immune to the chemical, radiation, and thermal stresses that would destroy ordinary video camera lenses and sensing elements, and manipulator lubricants and mechanisms. Moreover, the hazardous nature of the materials being handled places a premium on high reliability. Manipulators for gaining access to storage vats must be large yet able to work in confined spaces. Such systems will be extremely expensive. Since there are many waste sites to handle, advances in supervisory control are required to speed up operations.
The first problem is that the contents of each tank are not known. Hence vision and proximity sensors must be inserted by a manipulator to map the tank's contents. In this mapping phase, the manipulator is manually controlled; proximity sensors under local control are used to approach surfaces without collisions. In this operation, it is critical that the tank sides not be contacted, to avoid rupture. From these data, a representation of the tank's contents is constructed with operator assistance. For recognizable objects such as pipes, an object model can be specified. For waste surfaces and nonrecognizable objects, a three-dimensional surface representation is employed.
The model of the tank's contents is then used to generate a graphical image and to perform simulations of operations. Tasks and robot motions are planned through real-time graphical simulations: sequences are developed, operations are dynamically simulated, and collisions and joint limits are checked. Just prior to a real operation, a simulation can be run to verify the expected result.
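A schematic of such a pre-operation check might look like the following. The joint limits, tank geometry, and clearance margin are invented for illustration; a real planner would check full manipulator geometry against the mapped surface model.

```python
import math

# Invented limits and geometry for a planar sketch of the verification step.
JOINT_LIMITS = [(-2.0, 2.0), (-1.5, 1.5), (0.0, 3.0)]   # per-joint bounds
TANK_WALL_RADIUS = 1.8                                   # keep-out boundary (m)
MIN_CLEARANCE = 0.10                                     # required margin (m)

def within_limits(joints):
    return all(lo <= q <= hi for q, (lo, hi) in zip(joints, JOINT_LIMITS))

def clearance_ok(tip_xy):
    # The end effector must stay MIN_CLEARANCE inside the tank wall.
    return math.hypot(*tip_xy) <= TANK_WALL_RADIUS - MIN_CLEARANCE

def verify_plan(waypoints):
    """Simulate each (joints, tip position) waypoint; return first violation or None."""
    for i, (joints, tip_xy) in enumerate(waypoints):
        if not within_limits(joints):
            return (i, "joint limit")
        if not clearance_ok(tip_xy):
            return (i, "collision risk")
    return None

plan = [([0.0, 0.5, 1.0], (0.5, 0.5)),
        ([0.5, 1.0, 1.5], (1.3, 1.3))]   # second tip strays too near the wall
```

Running `verify_plan(plan)` before commanding the real manipulator flags the second waypoint as a collision risk, which is exactly the kind of check described above.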
For the actual task, the operator relies heavily on the graphical display and simulation to control the telerobot because of poor visibility in the tank. Sensing during the operation is employed to update and modify the graphical model. As a motion develops, the simulation is run simultaneously to check for emerging problems such as collisions or other unsafe operations. Approaches to surfaces and objects will be conducted in local mode, using proximity and contact sensors in the end effector. Wherever possible, autonomous control will be employed to speed up such operations as grasping objects, cutting pieces of the structure loose, and conveying waste from the tank.
Achievement of remote manipulation from the earth to the moon dates from the 1967 United States Surveyor and the Russian Lunokhod, which followed shortly thereafter (both were unmanned lunar vehicles). Canadian-built remote manipulators have served the space shuttle program on many separate missions, loading and unloading the cargo bay in space and providing a stable base for astronauts performing extravehicular tasks. The deep-space probes have been the most impressive teleoperators: Voyager, for example, was controlled from earth even at the very edge of the solar system, several light-hours away.
The planned space station construction is motivating a great deal of space telerobotics work, particularly in the United States, Germany, Japan, and Canada. These considerable efforts have resulted in some of the most advanced telerobotic systems, which are developing the kinds of generic capabilities that will be required in applications other than space. Uses for space telerobots include construction and maintenance of the space station, satellite servicing and repair, and space laboratory work.
For the present scenario we concentrate on space laboratory robotics, motivated in part by the success of the German ROTEX effort (Brunner et al., 1993; Hirzinger et al., 1993). There are two main reasons for conducting laboratory and manufacturing operations in space with robots. First, humans are not aseptic enough for clean-room applications such as crystal growing. Second, control of the robot from the ground allows full utility of the space laboratory, because operations do not require the active participation of astronauts.
A main challenge, therefore, is teleoperation under variable time delays on the order of several seconds. The approach that has been taken combines predictive displays with local autonomous control (Sheridan, 1993b). Space station environments, unlike those of hazardous waste disposal, are likely to be well known in advance, so a detailed simulation of the workspace can be fashioned. The main difficulty is to model the dynamics of the manipulator and objects in the contact tasks; the goal is to simulate the contact forces accurately for the predictive display.
The predictive display is used in the first instance for operator training. An operator interfaces to the real-time simulation via a stereo graphical display and force-reflecting hand controllers. Display enhancements are added to aid the operator (see discussion in Chapter 9).
In addition to training, the predictive display is employed for task planning and preview (Kim and Bejczy, 1993; Kim et al., 1993). Prior to sending a motion command to the remote robot, the operator can rehearse a task to verify its outcome. The resultant motion can then be replayed and relayed to the slave robot, or the operator may control the robot directly once confident of the outcome.
During an actual space laboratory task, the operator may view not only the stereo graphical display but also a time-delayed real image superimposed with a phantom display. The phantom display is an outline
of the graphical image of the robot superimposed on the real image. This phantom display responds immediately to an operator's commands, indicating where the robot will go from its current position.
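The relation between the phantom display and the delayed real image can be captured in a toy model: commands move a locally predicted pose immediately, while confirmation of the real pose arrives only after a round-trip delay. The one-dimensional pose and fixed tick delay are deliberate simplifications.

```python
from collections import deque

DELAY_TICKS = 3                          # simulated round-trip delay, in ticks

phantom_pose = 0.0                       # predicted pose, updated instantly
real_pose = 0.0                          # pose known only from delayed feedback
in_flight = deque([0.0] * DELAY_TICKS)   # commands still traveling to the robot

def tick(command):
    """Apply a command locally at once; it takes effect remotely DELAY_TICKS later."""
    global phantom_pose, real_pose
    phantom_pose += command                # phantom responds immediately
    in_flight.append(command)
    real_pose += in_flight.popleft()       # delayed command finally takes effect
    return phantom_pose, real_pose

history = [tick(0.1) for _ in range(6)]
```

After six equal commands the phantom leads the confirmed pose by exactly the three commands still in flight, which is what the operator sees as the outline running ahead of the time-delayed real image.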
Enhancements on the real image can be performed to improve visibility, such as removing bright specular reflections and increasing the contrast in dimly lit areas. Remote cameras automatically track the end effector to keep the manipulator in view or to focus on the task. Voice commands are employed to switch a camera onto a monitor or to zoom in on a known location. A separate graphical display provides a more global view for operator orientation.
Commands to the slave robot include not only position specifications, but also nominal sensory patterns such as for force. Differences between the real and simulated worlds are accommodated by local sensing and intelligence of the robot; the degree of autonomy is controllable and restricted, because any adjustments are small. The local sensing in the robot's end effector includes six-axis wrist force sensing, optical proximity sensing (both long and short range), and tactile sensing (Hirzinger et al., 1993). This local sensing can also be employed in dynamic tasks, such as catching a free-floating object, which might not be possible with long time delays.
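The controllable and restricted autonomy described here amounts to letting local sensing perturb the operator's command only within a small bound. A sketch, with invented stiffness and limits (not ROTEX's actual parameters):

```python
# Illustrative values only.
STIFFNESS = 0.002      # meters of compliance per newton of force error
MAX_ADJUST = 0.005     # local autonomy may deviate at most 5 mm from the command

def adjusted_position(commanded_pos, sensed_force, nominal_force):
    """Comply with the force error, but never stray far from the operator's command."""
    correction = -STIFFNESS * (sensed_force - nominal_force)
    correction = max(-MAX_ADJUST, min(MAX_ADJUST, correction))
    return commanded_pos + correction
```

A 2 N excess contact force backs the tool off by 4 mm, while a gross 30 N mismatch between the simulated and real worlds is still limited to the 5 mm bound, so the operator's command always dominates.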
Many elements of the scenario described above were successfully completed by the German Space Agency's ROTEX telerobot (Brunner et al., 1993; Hirzinger et al., 1993), during a flight of the space shuttle Columbia in April and May 1993.
Teleoperated Heavy Machinery
Tractors, diggers, cranes, and dump trucks have been controlled remotely for construction, demolition, earth moving, forestry, farming, and mining. Excavators and cranes are now being fitted with hand controls and sensors so that the end-point may be directly controlled instead of the arm joints (Wallensteiner et al., 1988). Force feedback is being supplied to the operator to accommodate the hardness of the ground. Teleoperated bulldozers and diggers have been employed to remove contaminated soil (Fogle, 1992; Potemkin et al., 1992). Grapple yarders and log loaders have been teleoperated in the forestry industry (Sauder et al., 1992). In mining, teleoperation is being applied to haulage and drilling (Kwitowski et al., 1992); low-bandwidth communication through mine shafts poses major difficulties.
Manipulators mounted on teleoperated cherry pickers have been applied in remote power line maintenance, tree trimming (Goldenberg et al.,
1992), and firefighting. In the last five years, several companies in Japan and the United States have developed teleoperator systems capable of changing insulators and performing other repairs on active high-tension power lines. Current methods require the use of rubber gloves (on low-voltage lines) or several-foot insulated "hot sticks" for high-voltage lines. The human line workers are boosted into close proximity to the active wires in buckets on the ends of long hydraulic arms. However, owing to the instability of these devices or to human error, workers have inadvertently been burned and electrocuted. Newer manipulator devices allow workers to stay much farther away in the buckets or to operate from the ground using remote video.
Windows and outside surfaces of buildings must be cleaned, painted, and repaired, but the costs of doing so are high and the dangers significant. Many modern buildings have platforms that can be raised and lowered by motor drives and moved laterally around the building. Because the geometry of such buildings is known in detail, it should be possible to perform all the necessary tasks by teleoperation. For inspection of structures, experimental vehicles have been demonstrated that scale walls (Hirose, 1987; Nishi et al., 1986).
A more unusual application of teleoperation is the inspection and cleaning of the insides of pipes. Pipes are essential elements of most buildings and plants (especially in nuclear and chemical plants with high pressure pipes), yet they are often difficult to inspect or clean. Experimental prototypes have been developed, but no large-scale effort has yet occurred (Fujiwara et al., 1993; Fukuda et al., 1987; Okada and Kanade, 1987). Eventually such devices could also be used in water, gas, and even sewer pipes, many of which are aging and leaking. Some teleoperators for this application drag cables behind them, although developers have envisioned autonomous self-powered devices that make surveillance journeys for significant distances before returning with their findings.
Firefighting and Security
Although teleoperated fire rescue and firefighting operations are not currently undertaken, it is clear that firefighters could use a remote vehicle capable of entering a burning building. Development of such a fire rescue teleoperator would make an excellent national demonstration project; none of its requirements is technologically infeasible. A remote firefighting vehicle would be capable of crawling up one or more flights of stairs under remote human control; entering various rooms; and allowing the human operator to look around (with better vision than a human eye and most likely with camera pan and tilt controlled by a head-mounted display) to find persons and give them instructions. It would include fresh air breathing masks, an insulated compartment into which a person could crawl and be brought out to safety, and foam dispensers and other fire-extinguishing gear. Furthermore, the vehicle would probably be battery-operated so as not to have to drag a power cord and risk getting snagged, although conceivably it could unroll a cord behind it to avoid entanglement.
Several firms are already marketing simple wheeled sentry robots capable of guiding themselves along paths laid out with magnetic or optical markers, listening for unexpected sounds, or looking for unexpected people or objects. Future police teleoperator capability should include radio-controlled stair-climbing teleoperators equipped with low-light-level cameras capable of seeing in the dark, which are able to dispense tear gas or even to fire weapons.
Explosive ordnance disposal is an area that is very well matched to teleoperations, and prototype teleoperators have been developed for this application. A teleoperator could approach a suspected bomb with the intent of first inspecting it, then disarming it if possible, covering it with something to limit damage from detonation, or carrying it to another location for detonation.
Finally, a number of teleoperated systems have been developed for military surveillance, sabotage, and warfare. Some take the form of aircraft, such as cruise missiles, and remotely piloted vehicles and helicopters. Some are land vehicles, capable of laying down kilometers of optical fiber for high-bandwidth surveillance and control. Still others are undersea vehicles, capable of performing surveillance, sabotage, and weapons delivery. Some are manually controlled; others are preprogrammed telerobots.
Bulk Transportation Environments
Loading and unloading operations—whether within plants or to or from shipping docks, ships, trucks, trains, or aircraft—tend to be unique with respect to the relative positioning movement required (the location of the receiving vehicle relative to the initial location of the container or product). Some bulk materials, such as liquids, coal, and grain, can be handled by pouring or pumping, so that remote manipulation is needed primarily to position the transfer pipe or duct relative to the receiving container or vehicle. Sometimes this manipulation is dynamically tricky, as in air-to-air refueling and cargo transfer between two ships on rolling seas. By sensing the (continuously varying) position of the receiving vehicle relative to the sending vehicle, the teleoperator can be commanded to null out the difference in platform positions, while the human operator controls the cargo transfer as though the two platforms were fixed relative to one another. When massive objects are being transferred, manual control must be executed slowly because of the large inertias involved. Computer-aided control can make use of predictor displays to anticipate the effects of control actions before they are felt, thus adding lead stabilization.
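The two computer aids just described can be sketched in a few lines. The following is a hypothetical one-dimensional illustration; the function names, gain, and lead time are assumptions for the sketch, not taken from any fielded system.

```python
# Hypothetical sketch of relative-motion nulling and a predictor display
# for cargo transfer between two moving platforms (1-D, illustrative only).

def null_relative_motion(sender_pos, receiver_pos, gain=0.5):
    """Command a corrective motion that cancels the (continuously
    varying) offset between the sending and receiving platforms, so the
    operator can work as though the platforms were fixed."""
    return gain * (receiver_pos - sender_pos)

def predict_position(pos, vel, lead_time=2.0):
    """Predictor display: extrapolate where a high-inertia load will be
    after lead_time seconds, letting the operator anticipate the effect
    of a control action before it is felt."""
    return pos + vel * lead_time

# Platforms drift 1.2 m apart; the aid commands a 0.6 m correction this
# cycle, while the predictor shows where a load moving at 0.3 m/s will
# be in 2 s.
print(null_relative_motion(0.0, 1.2))   # 0.6
print(predict_position(0.0, 0.3))       # 0.6
```

The essential design point is the division of labor: the automatic loop absorbs the platform motion, while the predictor compensates for the slow dynamics that would otherwise force the human operator into tedious move-and-wait control.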
Two widely publicized deep-ocean teleoperations were the recovery of the accidentally air-dropped hydrogen bomb off Palomares, Spain, in 1966 and the discovery and exploration of the ship Titanic. In the former, the Navy vehicle CURV (cable-controlled underwater recovery vehicle), equipped with a pressure-sealed camera, lights, and a manipulator, was dragged until the parachute attached to the bomb could be found and grasped. In the second case, the Woods Hole Oceanographic Institution towed vehicle Argo was passively towed at the end of a several-mile-long cable, and its cameras, lights, and powerful sonar were used to discover the Titanic (Whitcomb and Yoerger, 1993). The small ROV (remotely operated vehicle) Jason Jr. then explored inside the Titanic at the end of its power and signal umbilical. Deep-ocean teleoperators have also found a number of other sunken ships, historic artifacts, and buried treasure, as well as discovering the existence of hydrothermal vents deep on the ocean floor.
Deep-ocean vehicles, sensors, and manipulators have been used to survey the ocean bottom topographically, geologically, and biologically. It has been estimated by some biologists that more than 90 percent (volumetrically) of the earth's ecosphere has yet to be explored—namely, the oceans below the surface—and it is already clear that the oceans are full of creatures at all depths. Teleoperators appear to be an ideal way to perform this exploration.
A more commercial application of teleoperation in ocean environments has been offshore oil exploration. Many petroleum wellhead preparations have been performed by teleoperators. Still other operations that might have been done by teleoperation have been neglected, such as inspection and repair of the legs of oil platforms (some of which have broken up in heavy seas because their welds cracked) and inspection of outflow pipelines (some of which have burst for similar reasons). Many associated underwater robotic technologies, such as cleaning, weld inspection, and rewelding, still need much development.
Infrastructure and Research Needs
The usefulness of teleoperation in a particular hazardous domain may not be at issue, but its costs may well be. In many cases, it is simply more
economical to use tried-and-true practices (and accept, but attempt to minimize, their attendant human losses) than to invest in risky new technologies. A good example is firefighting, an area in which there is little national infrastructure to encourage the application of advanced technology research and development, and in which local communities buy apparatus from a myriad of suppliers of conventional trucks and equipment, with funds derived from local taxes and bond issues.
The technology needs for teleoperations for hazardous operations can be divided into a number of categories:
Manipulators must today be developed on an application-by-application basis, each well matched to the task in question. Such manipulators must be made sufficiently precise and sensitive, producing enough feedback that the human operator can control the manipulations appropriately. In addition, the range of tasks that can be performed by manipulators should be expanded so that unanticipated contingencies can be handled with greater ease.
The survivability of manipulators and sensors is obviously important in hostile environments. As an example, radioactive environments can lead to the breakdown of lubricants, the deterioration of electronic components, and the darkening of glass lenses. The dirt and grime of mining environments are hard on optics and other sensors and on actuator joints and the bearings of robotic arms.
Cost continues to be a major issue. As noted above, the cost-effectiveness of teleoperated systems remains to be demonstrated, and until it is, the cost of building and testing teleoperated systems will remain high and demand will remain low.
Using sensors to construct a model of a disposal tank's content is challenging, particularly when dynamics as well as geometry are to be simulated.
During actual operations, the incoming sensory information has to be fused with the current model of the tank.
Higher-level supervisory control and partial autonomy are required to speed up the operations and ensure safety.
Requirements for training are ubiquitous in today's world. They range from general childhood education to highly specialized training of military special forces in preparation for specific critical missions. Our purpose here is to consider the application of VEs for training, that is, for the acquisition of special skills for specific purposes.
The natural precursor to VE, technology-based task simulation, came
into being in the late 1930s. The archetype for such training devices was the Link Trainer, essentially a plywood box on a gimballed stand. The inside of the box was furnished like the cockpit of an airplane, complete with a functioning instrument panel. When a trainee occupied the box, a large hood was dropped over the canopy area so that the trainee's view of the external world was cut off. Activation of the instruments was controlled electronically by an instructor, and each trainee action was recorded in the same manner. By operating the simulator, the trainee could practice instrument navigation skills in ways that could not or would not be duplicated in the real world because of their hazardous qualities. Although it had relatively modest fidelity compared with present-day flight simulators, the Link Trainer incorporated nearly all of the training concepts that are featured in the most advanced training devices (Williams and Flexman, 1949). After World War II, the ideas of dynamic task simulation were extended into other task environments: air defense (Chapman et al., 1959), air traffic control (Fitts et al., 1958), submarine combat (McLane and Wolf, 1965), and the operation of surface vehicles such as tanks (Denenberg, 1954).
Meanwhile, computer programs were devised that could represent contexts as varied as multifirm commercial markets (Kibbee et al., 1961) and the processes of municipal governments (Guetzkow, 1962). Developers formulated interactive video representations of such complex situations as the emergency room in a large hospital in a way that the trainee would see the patient, see and hear the actions and questions of other members of the health care team, and, most dramatically, experience the consequences of his or her own decisions in the form of medical outcomes.
The advantages of relatively low cost, low hazard, speed, repeatability, and good transferability of acquired skills were the driving force for progress in simulation-based training. These qualities are likely to be even more prominent in mature VE-based training facilities. Indeed, the qualities that VE might bring to the whole enterprise of dynamic task simulation are those explicitly called for in the most comprehensive historical review of the use of simulation for pilot training (Orlansky and String, 1977).
The idea of using VE technologies for training is a very natural extension of the use of simulation for training. Given the committee's view of VE as simply an extension of the concepts of simulation to include closer coupling between the participant and the technology supporting the creation of the artificial world, it is only natural that training, a subject that has benefited greatly from advances in simulation, should be a prime candidate for exploration of VE technology.
Virtual environments have the potential to extend the scope of circumstances that can be satisfactorily simulated and to extend the advantages of using simulation versus real-world training. When the visual world is presented to each individual on a personally mounted visual display, when the auditory environment is re-created with perceptually compatible virtual sound sources, when sensorimotor interaction is accomplished through an effector interface that simulates the touch and feel of live interaction without requiring a physical hardware mock-up, then the definition and scope of what can be represented is limited not by what can be presented in a physical mock-up, but only by the quality of the sensors and effectors, the speed and processing power of the supporting computer, and the bandwidth of the transmission channels.
The issue of what can be trained with simulation has therefore moved from the physical world to the potential worlds that can be created with software and general-purpose hardware. At this point we are not only uncertain of exactly where those boundaries lie, but also confident that they are expanding with each new technological development in the field. When VE technology is mature, a single suite of hardware will be convertible from a nuclear physics training simulation to a mission rehearsal application or a surgical simulator simply by changing the software.
VE-based training also has the potential for being better than conventional training. For example, realistic training for hazardous, risky, and dangerous emergency situations, in which the trainee feels that he or she is present in the simulated environment, can be undertaken in ways not possible with conventional training. Artificial cues, not realizable dynamically in the physical world, may be utilized to augment the training effectiveness of VE worlds.
Lastly, VE-based training may be more cost-effective. Current experience with simulators is instructive. Simulators are very expensive because they require the fabrication or acquisition of detailed mock-ups of the participants' actual operating environments. Nevertheless, they are extremely cost-effective when compared with equivalent training in the corresponding real environment. (For example, the cost of flying a 747 approaches $10,000 per hour, whereas today's 747 simulator can be run for less than a tenth of that cost.) If research demonstrates that VE-based training is as good as other methods, then the economics of running simulators versus mock-ups, test equipment, and vehicles may make the former much cheaper than the latter.
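The hourly figures above make the arithmetic easy to check. In the following sketch, the hourly costs are the ones just cited; the 1,000-hour training load is an assumed figure for illustration only.

```python
# Back-of-the-envelope check of the simulator cost comparison above.
flight_cost_per_hour = 10_000     # approximate cost of flying a 747
simulator_cost_per_hour = 1_000   # "less than a tenth of that cost"
training_hours = 1_000            # assumed annual training load (illustrative)

savings = training_hours * (flight_cost_per_hour - simulator_cost_per_hour)
print(savings)  # 9000000: roughly $9 million saved per 1,000 training hours
```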
It is presumptuous to assume that serious applications of VE to training will await clear experimental demonstration of cost-effectiveness. Most training techniques in place today did not have the benefit of such evaluation before they were adopted. There is little doubt that VE training will be advocated, and perhaps even adopted, on the basis of its face validity alone. However, the scientific community should not be satisfied simply with proof from the marketplace.
The committee anticipates that training will be a powerful and useful early application of VE. The first areas of success will be limited applications, such as passive simulation scenarios involving three-dimensional demonstrations that are difficult to create in a two-dimensional medium. Then, as haptic interfaces emerge, at first with physical manipulators and later with general-purpose earth-referenced devices or with exoskeletons, the full potential for cost-effective applications to training will emerge. At that point we should be able to forecast the applications for which VE technology will be successful and those for which it will not. In the long range, we envision a full range of interactive portable training simulators, both for operational activities and for maintenance activities.
Commercial aircraft simulators substitute for real aircraft in training flight crews in standard and emergency procedures for their current aircraft and in qualifying them to transfer to a new aircraft type. Every nuclear power plant in the United States is required to have a simulator on site for regular training of the control-room crews. Aspiring doctors practice surgery on cadavers, in effect a simulation of the living patient.
Although there is a massive literature on the use of simulation for training, the extension to VE technology has only begun. For now let us address the experimental literature on VE training. We are aware of one as yet unpublished and two published experiments; however, we can expect to see many more in the near future.
Wes Regian has conducted a series of experiments on the use of VE training for two applications: (1) learning to navigate an unfamiliar building and (2) learning to operate a simple control panel (Regian and Shebilske, 1992; W. E. Regian, personal communication, 1993). He has found conclusively that VE training is as good as training in the actual environment when the performance measures are success in finding one's way in the building, unaided by encumbrances or artificial cues, or in operating a physical realization of the control panel. In further experiments in this series, as yet unpublished, Regian is investigating what can be accomplished with a similar training paradigm when a two-dimensional display is used as a control condition.
Peter Hancock and his students at the University of Minnesota (Kozak et al., 1993) conducted an experiment intended to evaluate the usefulness of virtual environments for training simple pick-and-place movements in the laboratory. The training task was to move five soda cans arranged in a row, one by one, to a position about 6 inches to the rear of their prior position and then to return them to their original position. Subjects in one control condition practiced moving real cans; a second control condition involved no training. Subjects in the experimental condition practiced moving virtual cans using a VPL Eyephone system and a data glove. The transfer condition required moving the real cans as rapidly and accurately as possible.
Not surprisingly, early in the transfer trials they found that virtual training was worse than training with real cans when real movement of the cans was the criterion task. The resolution and viewing angle of the virtual display were inferior to those available with the real cans. The data glove representation of the hand projected in the virtual world provided degraded positioning accuracy compared with a real hand and real cans. And the visual-motor coordination in the virtual world lagged behind the actual hand movements, a serious handicap when the criterion was the time for speeded movement. The task required during virtual training was simply very different from the criterion task.
The conditions of this experiment were a poor choice for demonstrating successful VE training. The task of picking up and moving soda cans is so highly overlearned, even before the experiment begins, that all that is left to learn is the very context-specific features of the situation. Nevertheless, the experiment illustrates one of the biggest barriers to the application of VE to training: the fidelity of the visual-motor representation was so poor that the training condition did not correspond closely enough to the conditions required for transfer for there to be any hope of positive transfer.
Lochlan Magee and his colleagues at the Canadian Defence and Civil Institute of Environmental Medicine have designed a VE system for training naval officers in the piloting tasks of maneuvering ships in formation on the open sea. The officers currently practice on real ships at considerable operating cost. Magee conducted an experiment, as yet unpublished, that compared land-based VE training (using a head-mounted display of the visual scene and Polhemus head sensing to stabilize the imagery) with training at sea. Both groups, each of approximately 13 junior officers, then transferred to performance at sea. He allowed the instructors to
utilize the VE system in the same way that they used the real ships. They made no attempt to augment the training with artificial cues or special conditions that could be accomplished only in the VE system. His data analysis is still incomplete, but preliminary results show that the instructor ratings during the transfer condition for the simplest maneuvers studied were significantly better with training using the VE system than with training at sea, and the two training conditions were not significantly different for the two more difficult classes of maneuvers. Because of the very great cost differences between the two training methods, showing equivalence is more than good enough to support use of the VE system.
The major concern in designing any training experience is how well the knowledge and skills acquired in the training environment transfer to the working or operational environment. The two theories that have dominated the thinking about transfer since the turn of the century grew out of the behaviorist tradition. The earliest of these was proposed by Thorndike and Woodworth in 1903 (Thorndike, 1903). This theory, known as the theory of identical elements, is based on the notion that the degree of transfer is a function of the identity of stimulus-response pairs between the original (training) task and the transfer task. That is, the task similarity is determined by the number of shared elements. This model can be used to give qualitative predictions of positive transfer; however, it does not address negative transfer.
The second theoretical formulation, offered by Osgood in 1949, proposed that the amount of transfer is a function of the degree of similarity between the stimuli and responses in the original (training) and transfer tasks. In this theory the predictions are qualitative and continuous for both positive and negative transfer. For VE training, the qualitative predictions based on Osgood's model are: (1) maximum positive transfer will be obtained with stimulus and response identity between original and transfer learning, (2) if there is complete response identity, negative transfer cannot be generated, and (3) even with maximum identity of stimuli between original and transfer learning, negative transfer can still be obtained if the responses are antagonistic or in opposition to one another.
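The three qualitative predictions above can be encoded as a toy function. The similarity scale (1 = identical, 0 = unrelated, negative = antagonistic responses) and the product form of the sketch are our assumptions for illustration, not Osgood's exact transfer surface.

```python
def predicted_transfer(stimulus_similarity, response_similarity):
    """Sign and rough magnitude of transfer predicted for a
    training/transfer task pair (illustrative encoding only)."""
    if response_similarity == 1.0:
        # (2) Complete response identity: transfer can never be negative,
        # and (1) it is maximal when the stimuli are also identical.
        return stimulus_similarity
    # (3) Antagonistic responses (similarity < 0) yield negative transfer,
    # worst when the stimuli are identical.
    return stimulus_similarity * response_similarity

assert predicted_transfer(1.0, 1.0) == 1.0   # (1) maximum positive transfer
assert predicted_transfer(0.2, 1.0) >= 0.0   # (2) no negative transfer
assert predicted_transfer(1.0, -1.0) < 0.0   # (3) negative transfer possible
```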
A drawback of these approaches is the difficulty of identifying and defining similar elements and in determining the amount of contribution each element makes to transfer. Clearly, both theories are most appropriate for the training of discrete, concrete, and simple tasks that involve the acquisition of motor skills.
More recently, with the shift to cognitive approaches, interest has
begun to focus on the process by which learning occurs and on the development of models to explain that process. Many of these models include not only a representation of the knowledge to be acquired but also a set of rules for using the knowledge. For the past 10 years, Anderson (1993) has been using such a model to build intelligent tutors to teach content and procedural knowledge in algebra, geometry, and computer programming.
There is some indication from transfer of training research that single theories of transfer will not hold for both cognitive and motor tasks (Schmidt and Young, 1987; Ritchie and Muckler, 1954). For complex simulations of the type represented by VE technology, in which complex tasks may be learned and transferred, Hays and Singer (1989:319) suggest starting with fidelity analysis: ''the first step in the fidelity analysis should be to determine the major emphasis (either cognitive or psycho-motor) of the task. If the task is cognitively oriented, it is likely that the training systems should emphasize functional fidelity. On the other hand, if the task has strong psycho-motor requirements, physical fidelity should be emphasized." Functional fidelity refers to the accuracy of representation of the system's procedures and procedural sequences; physical fidelity refers to the accuracy of representation of the physical design and layout of the system on which the individual will receive training.
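The Hays and Singer first step can be summarized as a trivial decision sketch; the category labels and return strings here are simplifications for illustration, not part of their analysis.

```python
def fidelity_emphasis(task_emphasis):
    """First step of a fidelity analysis (after Hays and Singer, 1989):
    choose which kind of fidelity the training system should stress."""
    if task_emphasis == "cognitive":
        # Stress accurate procedures and procedural sequences.
        return "functional fidelity"
    if task_emphasis == "psychomotor":
        # Stress accurate physical design and layout.
        return "physical fidelity"
    raise ValueError("classify the task's major emphasis first")

print(fidelity_emphasis("cognitive"))   # functional fidelity
```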
VE may be particularly suited to increasing the probability of transfer because of its flexibility, its feedback capabilities, and its potential for motivating the learner. Its flexibility may make it possible to design individually tailored training experiences that take into account individual differences in the skill and knowledge levels of trainees as they enter and proceed through training. For example, a particular training program could be made more compatible with the specific motor skills of each trainee; as proficiency was gained, the training scenarios could be modified accordingly. Further, VE technology is better suited than previous technology to augment feedback to the trainee by adding special cues or by providing multimodal stimulation (e.g., haptic, visual, and auditory). This may be most useful as a reinforcement strategy for training lower-skilled students. Finally, VE technology has the potential to furnish an intrinsically interesting and motivating training environment through the presentation of special sound effects and interesting visual patterns.
Currently there appears to be no way to predict qualitatively or quantitatively the kinds of transfer that will result from VE. As a result, evaluation will be needed during all stages of system design, development, and implementation. The following section discusses the problems associated with quantitative demonstrations of transfer of training effectiveness in the laboratory and the field.
Issues to be Addressed
There are two limitations to demonstrating success in VE training. First are the limitations inherent in any experimental demonstration of training success. That is not to say that simulation has not been a successful training medium. It has, as measured by the number of simulators sold, the number of disciplines that have committed to simulation as an alternative to training in the real environment, and the number of organizations that have enthusiastically adopted it as the training medium of choice. Commercial pilots routinely use simulators to maintain their currency in selected aircraft. NASA used virtual reality to train astronauts and other personnel in preparation for the Hubble telescope repair mission, which involved extensive extravehicular activity. What has been difficult to provide are quantitative demonstrations of transfer-of-training effectiveness in either the laboratory or the field.
Let us examine some of the issues that must be considered when conducting meaningful training evaluations. Since in this section we are concerned with the use of VE to support training rather than performance, we presume that, after training is complete, the trainees will begin work on the same or similar tasks in a different, non-VE environment. The issue therefore is transfer of training from the VE conditions to the non-VE conditions.
For purposes of illustration, let us assume that we are training firefighters. We place a VE helmet-mounted display of the visual scene of a fire on the trainee, together with a treadmill that allows us to simulate walking into a fire. The heat of the fire is represented by adjustable infrared lamps. The trainee then practices the procedures associated with fighting fires. After a period of training, we observe the trainee's performance fighting real fires. We hope that, for most of the required skills, there is positive transfer, that is, that the trainee is better at fighting real fires after practicing in the VE than before training. We would expect that for some skills there would be no transfer, perhaps because they were skills for which the VE system could not provide practice, such as climbing ladders or manipulating hoses. However, there is always the risk of negative transfer, that is, that selected firefighting skills will be performed more poorly after practice in the VE. We would expect negative transfer under conditions in which students learn the wrong response associations to the stimuli to which they are exposed. For example, suppose the trainee approaches closer to the flames in the VE than he or she should, because the VE fire leaves out the risk of getting burned. To the extent that the trainee then gets closer than he or she should to the flames in a real fire for the same heat stimulus, we would say that the training exhibited negative transfer: a bad response was learned.
It is not enough simply to show that the training improved performance without asking, "Compared with what?" Firefighters are trained today in the classroom, using controlled fires in mock-ups or real buildings, or both. Unless we can show that the VE training is cost-effective compared with the current methods, we have not accomplished a useful result. This is usually accomplished by training a control group using the standard method and comparing the transfer of training to the real environment to that obtained with the experimental method.
It is likely in this example that there are some firefighting skills for which it is cost-effective to train in a VE, such as communication and coordination among personnel in firefighting teams, whereas others, such as the physical skills of using hoses and climbing ladders, are best trained in mock-ups.
It is very expensive and time-consuming to conduct the kind of transfer-of-training experiment described here. Efficient experiments would require that each trainee-subject perform the task twice, once using the experimental method and once using the control method. But there is a dilemma: it is not sensible to compare performance under the experimental method with performance when the same subject is trained again on the same task. Human variability being what it is, large numbers of different subjects must therefore be tested in each condition in order to obtain statistically reliable results. Furthermore, because we would expect some skills to show transfer and others not, it is important to develop detailed performance measurement at the level of individual skills, so that those that produce positive transfer are detectable.
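The demand for large numbers of subjects can be made concrete with a standard normal-approximation power calculation for a two-group comparison. The assumed effect size (d = 0.4, a moderate standardized difference between VE-trained and conventionally trained groups) is illustrative only.

```python
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Subjects needed in each group to detect a standardized mean
    difference of effect_size between two independent groups, using the
    usual normal approximation: n = 2 * ((z_alpha + z_beta) / d) ** 2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance criterion
    z_beta = z.inv_cdf(power)            # desired statistical power
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

# A moderate effect already calls for roughly 100 subjects per group.
print(round(n_per_group(0.4)))  # 98
```

The calculation illustrates why such studies are expensive: halving the detectable effect size quadruples the required sample in each condition.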
Because of the cost and difficulty of conducting such studies, government-sponsored experiments of this kind are needed to evaluate scientifically the benefits of VE training.
A second limitation of the current state of virtual environment technology is the problem of representing the real world with sufficient fidelity to achieve a training goal. There is a trade-off between the field of view and the resolution of helmet-mounted displays. There are limitations in the technology supporting the computation and display of virtual images, and in the acceptability of the computational delays that result from dynamically generated, rapidly changing, complex scenes. The hardware and software supporting haptic interfaces, which allow the user to touch and feel the objects with which he or she is interacting, have until now been limited to data gloves that can sense only approximately the position of the hands and fingers and provide no sensory feedback about the objects with which the user is interacting.
In thinking about other barriers, it is possible to ask why existing
simulation techniques are not more widely used in training. Despite the number of successful applications cited above, it is surprising that the use of simulation is not more general. Although our review does not support a definitive answer, we can speculate that, until the last few years, the cost of computers and of the special hardware required to support realistic displays, together with the inflexibility embodied in special-purpose hardware, made them not cost-effective for many training applications. For example, until recently, driving simulators may have been too expensive to be cost-effective when compared with the cost of training in a real vehicle. This is now changing, and there is growing interest in this and related applications.
At this stage in the development of VE, the lack of demonstration of cost-effectiveness is a clear barrier, but this stems as much from the lack of demonstrated effectiveness as from high cost.
There is a need to mature the state of VE technology in order to broaden the range of training applications for which it is cost-effective. We run a serious risk of testing potential training applications and failing, not because an application is inherently bad or would never prove useful, but because the state of the technology is not yet mature enough to support it effectively.
We also need improved methodologies for evaluating cost-effectiveness. We can just as easily fail because an experiment was not sensitive enough to reliably detect the potential gains presumed to be there. This can happen because of inadequate experimental control or an injudicious choice of performance measures.
Finally, since research of this kind has just begun, there is a need to develop a taxonomy of application areas that promise high leverage in VE training effectiveness. It seems likely that tasks requiring spatial learning or awareness are good candidates. For example, in isolating faults in the avionics of highly complex aircraft, the technician must conduct the logical fault-isolation task and also map the location of the fault on a schematic diagram to the spatial location of the box containing the appropriate circuit on the actual aircraft. For today's aircraft, that is a very difficult task. It also seems likely that tasks in which artificial and spatially distributed cues could enhance learning will be important applications. The oft-cited demonstration illustrating the molecular forces associated with the positions of atoms in a carbon-chain molecule could provide a compelling lesson in molecular structure. Such a taxonomy of application areas should not be based on speculation, however;
it should emerge from a review of a large number of empirical tests of the success of VE training, tests that have yet to be accomplished.
The problems of American education are too well known to need elaboration here. Our children are not sufficiently engaged in learning the skills and concepts presented to them by our schools, and the learning opportunities afforded are not always well matched with our society's future needs (Secretary's Commission on Achieving Necessary Skills, 1991). Synthetic environments may offer opportunities to address both concerns, within the public education system as well as in other arenas, such as home-based entertainment and communications. For students these opportunities include:
visiting simulations of ancient India or Greece, the Paleozoic era or the inner ear, to gather data for presentations, plays, and virtual world-building of their own;
observing ("job-shadowing") adults working in information space, such as CAD artists and researchers using databases (e.g., medical librarians and historians);
remotely experiencing such phenomena as the eruption of an underwater volcano or the live birth of an elephant at the zoo in Beijing;
experimenting with simulations constructed by experts (e.g., physicists, ecologists, social and econometric model builders);
working with other students of different ages and cultures, at different sites internationally, on a daily basis, to improve one another's language skills, or on tasks like those described above;
building improved tools for their own use and use by others, such as libraries of images, elemental simulations, stories, local history, and demographic and geographic data.
In essence, students using virtual reality would be able to do what we would like students to be doing today—but with vastly expanded ability to access information in the larger world, to experiment, visualize, and understand, and to interpret the information to their own ends.
Education is an application area that cuts across subject-specific domains; to the extent that a person can learn something using a VE system developed for any specific area, VE is being used for educational purposes. For example, a scientific visualization of a computer simulation that teaches a researcher something new about nature is arguably an
educational application. However, for purposes of this discussion, educational applications of VE will be focused on the potential use of VE in grades K-12, in which improvements are of demonstrable and pressing national concern.
Most commentators on the goals of K-12 education would agree that it should develop a student's capacity to think independently; increase a student's desire and motivation to learn; and increase the extent to which a student learns and retains specific skills and knowledge. In contrast, there is no unanimity about how to create an environment in which these things happen. For specificity, the discussion that follows is guided by the philosophy that people learn best when they can integrate what they are learning into the broader context of other things they know and care about; that they are more highly motivated when they can and have reason to influence the course of their own learning; and that they learn to think independently when they are given substantial opportunities for doing so. Much of this educational philosophy has been characterized as constructionism.
Constructionism is a theory of instructional design, based on constructivist theory. Constructivism is a school of thought among developmental psychologists (Carey, 1987; Piaget and Inhelder, 1967) that concerns the way in which children develop models of the world. The idea is that the essential steps toward a mature understanding of a particular subject include a series of differentiations and reintegrations of experiences involving the dissection and reconstruction of internal models. Within this philosophical framework, computer and other information technologies, specifically VE, may have important roles to play in improving education. In particular:
VE is a potential vehicle through which the range of experience to which students are exposed could be vastly increased.
VE can provide immersive and interactive environments that provide macro contexts in which interesting intellectual problems naturally arise.3
VE potentially provides micro worlds in which students can exercise the skills and use the knowledge they learn.
VE potentially expands the peer group among which collaborative learning experiences are possible.4
The technology of VE, augmented reality, and telepresence is too new to have any real educational applications, if "real" is taken to mean an application fully integrated into the intellectual substance of a nonexperimental curriculum. The following discussion focuses on a number of applications of VE technology to education that are suggested by preliminary experiments and field trials to date: field trips and telepresence, spatial relations (real space or phase space), playrooms to build things, micro worlds, simulations of things too complex or expensive to experience or experiment with, and new conceptualization tools for traditional subjects.
Each section below addresses a long-range vision of how VE might assist in these applications and a description of possible near-term demonstrations.
Simulated Field Trips and Telepresence
Schools often use field trips to expose students to unfamiliar physical environments (e.g., an inner-city class may visit a farm). However, cost, convenience, and safety can limit such opportunities for students. VE technology could provide immersive display systems that would enable students to experience exotic environments—in museum dioramas, in microscopic worlds (see Taylor et al., 1993), and in remote and hazardous surroundings.
One example, the Jason Project (see Tyre, 1989; Ullman, 1993), a Massachusetts nonprofit enterprise developed by ocean explorer and geologist Robert Ballard, is built around a remotely controlled submarine (Jason). Live video images of grey whales in the ocean off Mexico and of hydrothermal vents in the ocean floor were broadcast from Jason to sites in schools and universities, where nearly 1 million students could watch underwater exploration as it occurred and experience a sense of immediacy and involvement. The Jason Project also permitted 23 students to participate directly and interactively. Specifically, these participants
were directly involved in the control of the robot's motion in real time, thus creating an experience of telepresence.
In the summers of 1991 and 1992, the Pacific Science Center in Seattle, Washington, sponsored a Technology Academy Program in which groups of students ages 9 to 15 were introduced to some aspects of SE. Six one-week day camps were held each summer. Participation consisted of seeing videotapes and demonstrations and working with CAD software (Swivel 3D) on Macintosh computers to construct graphical models for use in a VE. These models were taken to the Human Interface Technology Laboratory at the University of Washington for installation by graduate assistants into the laboratory's SE system, which incorporated gloves, helmet, and sound.
Project managers Meredith Bricken and Chris Byrne (1992) reported that students showed a high degree of comprehension and rapid learning of computer graphics concepts such as Cartesian coordinates and three-dimensional modeling. Students indicated that they would much rather work and play with SE than with conventional video games or TV. However, the workshops were only one week long, and the students' experience of immersive SE consisted of parts of one day in which they saw their three-dimensional models in the SE system.
In spring 1992, Intel Corporation sponsored the creation of an exhibit at the Boston Computer Museum, which featured a two-participant virtual world. This world was based on Sense8's WorldToolKit software and Intel's DVI graphics cards running in 486 PCs; the demonstration itself was constructed by Sense8's Ken Pimentel and Brian Blau at the University of Central Florida (Pimentel and Blau, 1994). Hundreds of people experienced the rather simple virtual world, in which it was possible to grasp building blocks using a three-dimensional wandlike pointing device and slide the blocks around to assemble one's own toy house.
The Boston Computer Museum invented a kind of video swivel chair (for which a patent application has been submitted). The chair carries a 13-inch TV monitor and allows the participant to look around in the virtual world. The imagery rotates to correspond to the chair's rotation. This approach avoids the sanitary problems and expense of providing stereo head-mounted displays for visitors. This kind of technology will probably also be useful in school settings.
The AutoDesk Cyberspace project has supported several experiments concerning multimedia tools in the public schools of Novato, Calif. In 1990, Mark Merickel of Oregon State University conducted experiments in the Olive Elementary School. One team used AutoSketch and
AutoCAD to develop three-dimensional models; another group explored Cyberspace—AutoDesk's immersive virtual world (Merickel, 1990).
Playrooms to Build Things
Some education researchers feel that learning is enhanced when students can build their own simulations (Harel and Papert, 1991), but the construction of virtual worlds with today's tools is technically challenging. Bowen Loftin at NASA/Johnson and the University of Houston has constructed a virtual physics laboratory, built on NASA's SE tools (Yam, 1993). Loftin is in the process of extending his previous work on intelligent tutoring (Loftin et al., 1991) into a richer virtual environment for science education.
Interactive, text-based, role-playing environments and games have been developing for several years, both on the Internet and on bulletin board systems (Bruckman, 1992-1993). Known as MUDs, MOOs, and MUSEs, these environments have gained substantial popularity with a certain segment of the population. MUD stands for multi-user dungeon, reflecting the genre's origins in role-playing games; MOO stands for MUD, object-oriented, an environment containing objects that can be manipulated; MUSE stands for multi-user simulation environment.
In a typical MUD or MOO, a participant types LOOK and receives a textual description of the room or place currently occupied. Any other players at that place can type messages, which are immediately echoed on the screens of all the others at that place. Objects can be created, picked up, dropped, and used. New places can be added to the universe.
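The interaction loop described above can be sketched in a few lines of code. The sketch below is purely illustrative: the class and method names are hypothetical, and real MUD servers are networked, persistent, multi-user programs rather than in-memory objects like these.

```python
# Minimal sketch of MUD-style interaction: a shared place, the LOOK command,
# and messages echoed to everyone present. All names here are illustrative.

class Room:
    def __init__(self, description):
        self.description = description
        self.players = []      # participants currently at this place

class Player:
    def __init__(self, name, room):
        self.name = name
        self.room = room
        self.screen = []       # lines "echoed" to this player's display
        room.players.append(self)

    def look(self):
        # LOOK returns a textual description of the place currently occupied
        self.screen.append(self.room.description)

    def say(self, message):
        # a typed message is immediately echoed to all players at this place
        for p in self.room.players:
            p.screen.append(f"{self.name} says: {message}")

library = Room("A dusty library lined with tall shelves.")
alice = Player("Alice", library)
bob = Player("Bob", library)

alice.look()          # Alice sees the room description
bob.say("Hello!")     # both Alice and Bob see Bob's message
```

Objects that can be created, picked up, and dropped, and places that can be added to the universe, would extend the same pattern with additional classes.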
MUDs provide a crude sense of space and a lively interaction with other participants. They thus prefigure some of the kinds of interactions that can be expected in fully immersive VEs, and some proponents of MUDs regard them as instances of virtual reality. The popularity of MUDs as educational tools is rapidly growing; they require only a PC, a modem, and access to the Internet. Pantelidis (1993) has provided a substantial bibliography on the educational uses of MUDs and other VE systems.
Interactive modeling and simulation are being pursued as educational tools on many fronts (Feurzeig, 1992). Simulations of physical, biological, and social phenomena can have substantial pedagogic value, especially when the systems being simulated are otherwise inaccessible to students. Many simulations have been successfully implemented on PCs and used in both K-12 and university environments (see White, 1984; Maxis, 1991; Glenn and Ehman, 1987; Schug and Kepner, 1984). As a research example, Tom DeFanti and collaborators at the University of Illinois at Chicago and the National Center for Supercomputing Applications are designing educational applications of the CAVE display system (Cruz-Neira et al., 1992). This system projects images on three walls and the floor and tracks a principal viewer's head position to determine the view direction and content. Other viewers in the CAVE are "along for the ride," which can sometimes be disconcerting. The curricular topics under consideration at present, scientific visualization and non-Euclidean geometry, are of more interest to university than to K-12 educators.
Issues to be Addressed
The discussion above is admittedly speculative. Educators and their constituents will have to address issues of serious concern in three areas.
Perhaps the most fundamental question is that of desirability. Under what circumstances does it make sense—even given a technically perfect VE educational application—to use that application in the classroom? The question arises because it is all too easy to imagine a classroom in which the amount of interaction that students have with real blocks, real people, and real situations is reduced in favor of simulated experiences.
Put differently, concrete, manipulable objects enhance children's ability to handle abstractions, as repeatedly demonstrated in Montessori schools, and field trips enhance the realism and relevance of lessons. Why should children be deprived of these tools in favor of virtual building blocks and virtual field trips? Moreover, unlike the rules governing real blocks and environments, the rules that govern simulated systems are limited only by the system developer's imagination and as such are essentially arbitrary. Why is it necessarily desirable for students to have more experience and interaction with such systems?
Ultimately, the intellectual challenge is likely to be learning to see VE as one tool among many in the responsible educator's tool chest. VE can obviously provide some experiences that students could not otherwise have, but when a "real" alternative is available, it may make more sense for the real experience to have priority. Some judicious mix of hands-on learning, VE experience, and book learning, varying from classroom to classroom and subject to subject, is likely to show that these tools complement one another. Such an approach is suggested by the Vanderbilt group (Cognition and Technology Group at Vanderbilt, 1993), who argue that hands-on and anchored instruction techniques are complementary: they use videotape and videodisk segments to establish a story line and then use hands-on projects through which students construct their own understanding. Such a paradigm can also be explored in VE, with or without virtual replacement of the hands-on activities.
Effectiveness and Feasibility
The VE hypothesis for education is that, for certain purposes, well-integrated VE systems will achieve results superior to the use of conventional capabilities. The hallmarks of such success would be (1) significant improvement in students' learning and retention of specific skills and concepts, compared with their response to similar content presented without VEs and (2) significant increases in students' voluntary use of such systems, compared with their response to similar content presented in other ways.
Some reports suggest that the successful introduction of education technology results in a sustainable increase in the enthusiasm of students, which increases the overall chances for educational success (Office of Technology Assessment, 1988). Indeed, the popularity of video games and other such technologies among K-12 students outside the educational context strongly indicates that engagement with technology is not likely to be a significant problem.
However, the eagerness with which students embrace technology should not be taken as an unqualified endorsement of immersive graphics for education; some part of the enthusiasm may come from the novelty of the medium. It is hard to assess the kinds of learning achieved from one-shot experiences such as field trips to exotic VE laboratories to view one's own geometric models or to take virtual submarine voyages. The genuinely problematic issue is this: To what extent can educators design engaging VE systems (a relatively easy task) that also result in what reasonable people would regard as learning (a much harder one)? To answer this question, much more empirical study is necessary.
The concern about practicality is dominated by cost. A number of studies have established in specific contexts that computers can be cost-effective compared with other means of delivering instruction (Levin et al., 1984; Office of Technology Assessment, 1988). However, a number of other factors intervene to complicate their wider adoption by schools.
First, the introduction of technology seldom decreases costs—at least in public education. Thus, even if the provable incremental performance provided by the technology is cheaper per unit than other possible improvements, political will and funding may not be forthcoming.
Second, cost-benefit analysis for technology is necessarily confined to the teaching of skills and concepts that can be taught by other means. Certain skills (e.g., accessing remote on-line databases) are meaningless without the technology, and most parents want their children to have up-to-date skills.
Schools are not yet able to afford well-supported, state-of-the-art personal computers in meaningful numbers, and only recently has there been broad acceptance of the idea that computers are useful in education (Office of Technology Assessment, 1988). In 1993 a typical computer purchased for school use cost, with software, about $3,000, essentially the cost of a well-equipped Apple II in 1979. Increased performance expectations have offset the decrease in component prices, so the cost of low-end personal computers has remained about constant even as their capabilities have increased.
An entry-level PC-based VE system can be purchased in 1993 for approximately $24,000. If current trends continue, such systems should be available for around $3,000 in three to five years. Current developments in entertainment electronics may advance that time frame somewhat, but display technology is not likely to advance as fast as image generation and simulation technology. We believe that VE will not begin to penetrate schools until its utility for specific purposes generates sufficient interest and desire, and when an individual station of acceptable performance costs $3,000 or less.
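The arithmetic behind the three-to-five-year projection can be made explicit. Falling from $24,000 to $3,000 is a factor of 8, so the three-year end of the range corresponds to roughly a halving of price each year, an assumed rate used only for illustration here:

```python
# Sketch of the price-decline projection in the text: $24,000 falling to the
# $3,000 threshold. The 50%-per-year decline rate is an assumption chosen to
# match the three-year end of the projected range, not a figure from the text.

price = 24_000.0
threshold = 3_000.0
years = 0
while price > threshold:
    price /= 2          # assumed halving of price each year
    years += 1
print(years)            # prints 3  (24,000 -> 12,000 -> 6,000 -> 3,000)
```

A slower decline, closer to 34 percent per year, would stretch the same factor-of-8 drop over the five-year end of the range.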
Assuming that costs have been adequately addressed, it appears that public opinion would be generally supportive of educational VE applications. But long-time educators have seen education fads come and go; they will need to be convinced that VE gives them usable capabilities that can enhance education. Thus, costs necessarily include those of training teachers to use new technologies effectively and to judge when their use is appropriate. These costs are high, perhaps comparable to those of deploying the hardware itself.
Infrastructure and Research Needs
The development of high-quality software followed the arrival of the PC by more than a decade. Only recently has commercial software become available that meets real education needs beyond drill-and-practice. This is due both to the maturation of a generation of lesson-builders and their firms and to the arrival of mature, well-reasoned national agendas
such as those provided by the National Council of Teachers of Mathematics (1989). The generational shift from emphasizing knowledge to emphasizing competence, represented by the National Council of Teachers of Mathematics curriculum and its siblings, has finally provided educational computing with appropriate subject matter and philosophical focus.
A similar evolution will be necessary for VE in education. A critical mass of competent VE programmers must develop, and educators will need to work with the hardware and software over a period of years before clear directions emerge.
On the technology side, no special issues stand out. This is partly because the specific requirements of VE systems in education cannot be articulated until we have a better understanding of the goals of those systems.
Although the technology research agenda for education is not separate from that of VE in general, a great many questions and issues for education research remain:
The identification and characterization of skill and subject matter domains for which VE-based immersion can be demonstrated to provide clear didactic advantages over equivalent nonimmersive presentations.
The relationship between immersive and nonimmersive representations within a given educational environment, as tools to help students understand their own learning process. Immersive presentations may be more engrossing and may lead to intuitive understandings; two-dimensional or schematic presentations may lead to better abstract understandings.
The development of a variety of means (user interfaces, languages, CAD tools) whereby VE environments can be easily expressed and constructed by lesson designers.
The development of concepts and tools (e.g., telepresence) that can be used by students to facilitate their own model-building within those environments and the educational significance of their use (e.g., as indicated by their ability to embody objective knowledge about the processes of science, economics, history, etc. in their models).
The educational value of role-playing adventure games, anchored learning, and other shared simulation experiences in fostering the development of analytical skills such as problem formulation.
The extent to which various features of the user interface can substitute for or enhance the interpersonal interactions among co-located team members.5
The role of third-party mentors, either explicitly present and visible in the simulation or behind the scenes, in enhancing the learning experience.
With some relatively small modifications, all of these questions and issues can and should be asked of any information technology in an education context. But the three-dimensional immersive environment affords a much richer space than a two-dimensional screen in which to ask and address such questions. Indeed, the richness may well be a significant differentiating feature if answers to these questions depend on the specific technology being considered.
With the rapid increase in the amount, types, and sources of information being produced for scientists, engineers, business executives, and the public, there is a pressing need to develop better presentation techniques and formats to support everyday tasks of exploration, understanding, and decision making. The information explosion, coupled with advances in computer technology, offers promising opportunities to develop new approaches to visualizing information. Currently, information is presented in a variety of forms, including text and images in two and three dimensions, either static or dynamic. Interaction techniques are limited to point-and-click window interfaces that can require as many as 10 to 50 steps to accomplish a single task.
Possibilities for the future include augmented reality, in which a virtual image is superimposed on the real world (e.g., viewing the pipes inside a wall), and VEs in which the user is immersed in the information and interacts with it in real time. Effectively designed environments should provide individuals with easier access to the critical elements in the information; the ability to view and explore interactions among multiple problem dimensions simultaneously; and the opportunity to examine and test relationships that cannot be presented on a two-dimensional display. As an example, well-presented information may help decision makers in a manufacturing plant to understand more readily the effects of several variables, such as current economic conditions, environmental impact, materials flow, staff training, and marketing forecasts, on the feasibility and cost of producing a particular product. In another example, VE technology could be used to visualize a proposed factory operation before it is built. Furthermore, creative use of the technology to present financial data could assist investors in making more informed decisions.
Designing useful visualizations for different types of information in support of different user tasks is an enormous undertaking. One major
area of required work is the development of computational software to manage large and diverse databases in ways that allow users to explore alternatives and make discoveries. This issue is discussed in more detail in Chapter 8, on computer generation of virtual environments, and in the section below on scientific visualization.
A second area of required work is the determination of what information should be provided, how it should be formatted, and how we expect the user to interact with it. Researchers have worked for many years to create dynamic two-dimensional information displays for such tasks as monitoring the status of a nuclear power plant, flying a jet aircraft, controlling aircraft traffic, and analyzing complex data. This work has provided some insight into how much information can be absorbed at one time, how it should be organized on the screen, and how frequently it can be updated before the limits of human information processing are exceeded (Ellis, 1993). In addition, cognitive scientists have been exploring the relationship between types of tasks and the most appropriate types of information to support those tasks (Palmiter and Elkerton, 1993).
Although some of the knowledge about human information processing, learning, and problem solving gained when using two-dimensional displays will be of value in designing information displays in three-dimensional environments, we will need to mount a substantial research effort to determine how to use the capabilities of three-dimensional environments effectively. An integral part of these research efforts will be a determination of the most user-friendly and efficient interaction techniques. Other chapters of this book provide a discussion of these issues. Of particular importance will be research into the use of sensory modalities other than vision in increasing or modifying the comprehension of information.
Currently, little progress has been made in the use of virtual or augmented reality for information visualization. However, some investigators have begun to explore various aspects of visualization for scientific purposes. A brief description of results in this area is provided below. Many of the problems raised will be pertinent to the design of information presentation for other types of activities.
Scientific visualization (McCormick et al., 1987) is the use of computer graphics to create visual images that aid in the understanding of complex, often massive, numerical representations of scientific concepts or results. Such numerical representations, or datasets, may be the output of numerical simulations as in computational fluid dynamics or molecular modeling, recorded data as in geological and astronomical applications,
or constructed shapes as in visualizing topological arguments. These simulations may contain high-dimensional data in a three-dimensional volume, and they often vary in time. Different locations in such datasets can exhibit strikingly and interestingly different features, and difficulty in specifying locations will impede exploration. Scientific insight into complex phenomena depends in part on our ability to develop meaningful three-dimensional displays.
Traditionally, scientific visualization has been based on static or animated two-dimensional images that have generally required a significant investment of time and expertise to produce.6 As a result, severe limits have been placed on the number of ways in which a dataset can be explored: an explorer does not know a priori which images will prove unimportant, but when the effort to produce a visualization is large, there will understandably be hesitation to produce a picture that is likely to be discarded.
Other problems arise with traditional scientific visualization techniques because they are not well suited to the computational datasets associated with modern engineering simulations. These datasets may be inherently complex, consisting of a time series of three-dimensional volumes with many parameters at each point. Also, scientists are often interested in behavior induced by these data (e.g., streamlines in a vector field) rather than in the data values themselves. Under these circumstances, real-time interactive visualization is likely to pay off, given the complexity of phenomena that can occur in a three-dimensional volume.
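A streamline is a simple example of such induced behavior: a path traced through the volume by following the local vector field. The sketch below illustrates the idea with forward Euler integration through a toy analytic field; the field, step size, and function names are all hypothetical, and a production system would interpolate velocities from the simulation grid and use a higher-order integrator.

```python
# Sketch: tracing a streamline through a 3-D vector field by forward Euler
# integration -- a quantity derived from the data rather than the raw values.
# The analytic field below (rotation about the z-axis plus slow upward drift)
# is a toy stand-in for interpolated simulation data.

def velocity(p):
    """Toy vector field at point p = (x, y, z)."""
    x, y, z = p
    return (-y, x, 0.1)

def streamline(start, step=0.01, n_steps=100):
    """Follow the field from 'start', taking n_steps Euler steps."""
    path = [start]
    p = start
    for _ in range(n_steps):
        v = velocity(p)
        p = tuple(pi + step * vi for pi, vi in zip(p, v))
        path.append(p)
    return path

path = streamline((1.0, 0.0, 0.0))   # spirals upward around the z-axis
```

Real-time interaction amounts to recomputing such paths fast enough that the user can drag the seed point through the volume and watch the streamline respond.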
VE technology is a natural match for the analysis of complex, time-varying datasets. Scientific visualization requires the informative display of abstract quantities and concepts, rather than the realistic representation of objects in the real world. Thus, the graphics demands of scientific visualization can be oriented toward accurate, as opposed to realistic, representations. Furthermore, as the phenomena being represented are abstract, a researcher can perform investigations in VE that are impossible or meaningless in the real world. The real-time interactive capabilities
promised by VE can be expected to make a significant difference in these investigations, with the potential to provide the ability to sample a dataset's volume quickly without cluttering the visualization; freedom from any penalty for investigating regions that turn out not to be of interest; and the ability to see the relationships among data nearby in space or time. In short, real-time interaction should encourage exploration.
Just as important, a natural, anthropomorphic three-dimensional VE-based interface can aid the unambiguous display of these structures by providing a rich set of spatial and depth cues. VE input interfaces allow the rapid and intuitive exploration of the volume containing the data, enabling the various phenomena at various places in that volume to be explored, as well as providing simple control of the visualization environment through controls integrated into the environment.
A properly constructed VE-based interface will require very little of the user's attention; it would be used naturally, through pointing and spoken commands rather than command-line text input. Someone using such an interface would see an unambiguous three-dimensional display. This contrasts with the current interaction paradigm in scientific visualization, which is based on text or two-dimensional input via graphical user interfaces and on two-dimensional projections of three-dimensional scenes.
VE systems for scientific visualization are in many ways like software packages for graphing: tools for displaying and facilitating the interpretation of large datasets. It is too early to describe a single general-purpose VE system for scientific visualization, but a number of projects have demonstrated that VE does have significant application potential.
Aeronautical Engineering: The virtual wind tunnel (Bryson and Levit, 1992; Bryson and Gerald-Yamasaki, 1992) uses virtual reality to facilitate the understanding of precomputed simulated flow fields resulting from computational fluid dynamics calculations. The visualization of these computations may be useful to the designers of modern high-performance aircraft. The virtual wind tunnel is expected to be used by aircraft researchers in 1994 and provides a variety of visualization techniques in both single-user and remotely located multiple-user environments.
General Relativity: Virtual Spacetime (Bryson, 1992) is an extension of the virtual wind tunnel in which curved space-times, which are solutions
to Einstein's field equations of gravitation, are visualized using particle paths in virtual reality.
Molecular modeling: Molecular docking studies using a VE that included a force-reflecting manipulation device have been performed with the GROPE system at the University of North Carolina at Chapel Hill (Brooks et al., 1990). Investigators employed a head-tracked stereo display in conjunction with the force-feedback arm to investigate how various molecules dock together. These studies have implications for the design of pharmaceuticals.
Scanning Tunneling Microscopy: A VE coupled with a telerobot for the control and display of results from a scanning tunneling microscope called the Nanomanipulator has been developed at the University of North Carolina at Chapel Hill (Taylor et al., 1993). This system uses a head-tracked stereo display in conjunction with a force-feedback arm to display a surface with molecular resolution via graphics and force reflection based on data obtained in near-real time from a scanning tunneling microscope. In addition, there is the ability to deposit very small amounts of material on the surface via direct manipulation by the user.
Medical Visualization: Medical visualization systems using augmented reality (e.g., Bajura et al., 1992) have been developed at several sites. The primary difficulty with medical visualization at this time is the very large amount of graphic data to be displayed. Bajura et al. are designing a system that will map ultrasound imagery in real time onto the physician's view of the real patient, allowing features shown in the ultrasound imagery to be quickly and intuitively located in the patient.
Astrophysics: A system to investigate cosmic structure formation has been implemented at the National Center for Supercomputing Applications (Song and Norman, 1993). This system visualizes structure arising from simulations of the formation of galaxies in the early universe.
Circuit Design: The Electronic Visualization Laboratory at the University of Illinois, Chicago, has implemented several scientific visualization applications in a virtual environment setting. For descriptions of the individual projects, see Cruz-Neira et al. (1993a, 1993b).
Issues to be Addressed
Experience with VE-based scientific visualizations has shown that in order to sustain usable interaction and to make the user feel that a series of pictures integrates into an insightful animation, a number of criteria must be met. First, the system must provide interactive response times to the user of approximately 0.1 s or less. Interactive response time is a measure of the speed with which the user sees the results of actions; if the
interactive response time is too slow, the user will experience difficulty in precisely placing visualization tools (Sheridan and Ferrell, 1974). Second, effective systems for scientific visualization must have animation rates of at least 10 frames/s. Animation rate is a measure of how fast images are presented to the user; this rate is particularly relevant with respect to viewing control and for time-varying datasets. If the rate is too slow, the images will be perceived as a series of still pictures rather than a continuous evolution or movement. These two parameters are psychologically and perceptually related, albeit computationally distinct. Some VE systems may separate the computation and visualization processes, so that they run asynchronously. We are at the beginning of understanding the potential of this technology for scientists. Research is needed to answer such questions as when continuous images are more useful than discrete images for scientific insight.
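The asynchronous separation of computation and visualization mentioned above can be sketched as two loops that share only the most recent result: the display loop runs at its own rate (the 0.1 s frame period is the figure cited in the text) while a slower simulation publishes updates as they complete. The names and timing constants below are purely illustrative.

```python
import threading
import time

class SharedState:
    """Holds the most recent simulation frame; the display loop always
    reads the latest one rather than waiting for the next to finish."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = 0

    def publish(self, frame):
        with self._lock:
            self._frame = frame

    def latest(self):
        with self._lock:
            return self._frame

def simulate(state, steps):
    # Slow computation loop: each step stands in for a costly
    # flow-field update that may take longer than a display frame.
    for step in range(1, steps + 1):
        time.sleep(0.05)
        state.publish(step)

def render(state, duration, frame_period=0.1):
    # Display loop at roughly 10 frames/s (0.1 s per frame),
    # independent of how fast the simulation advances.
    frames = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        frames.append(state.latest())  # draw whatever is newest
        time.sleep(frame_period)
    return frames

state = SharedState()
sim = threading.Thread(target=simulate, args=(state, 5))
sim.start()
shown = render(state, duration=0.5)
sim.join()
```

Because the renderer never blocks on the simulation, a slow computation degrades only the freshness of the data shown, not the animation rate itself.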
Scientific visualization also makes particular demands on virtual reality displays. The phenomenon to be displayed in a scientific visualization application often involves a delicate and detailed structure, requiring high-quality, high-resolution, and full-color displays. Experience has shown that displays with 1,000 × 1,000 pixel resolution are sufficient for many applications. In addition, a wide field of view is often desirable, as it allows the researcher to view how detailed structures are related to larger, more global phenomena.
Finally, user acceptance criteria suggest that few researchers would be willing to invest the time required to don and doff the head-mounted displays available at the time of this writing; indeed, many researchers have expressed distaste for wearing helmets or strapping displays onto their heads.
TELECOMMUNICATIONS AND TELETRAVEL
As facilitators of distributed collaboration, the applications of telecommunications and teletravel cut across all of the other applications discussed in this chapter. For manufacturing activities of the future, it is anticipated that virtual images of products will be simultaneously shared by geographically dispersed design engineers, sales personnel, and potential customers, thus providing the means for joint discussion and product modification. In health care, there are several examples of distributed collaboration, including remote surgical practice and remote diagnostic consultation among patients, primary care physicians, and specialists who may all be viewing common data or three-dimensional images. An example of the latter is the development of a telemedicine system in Georgia,
in which medical center expertise is shared over a network with rural doctors and their patients.
For education and training, there are many instances in which distributed collaboration may be useful. One example is the use of a shared virtual battlefield for mission planning, rehearsal, and training. Another potential use is offering students from several schools around the country the opportunity to come together through network technology to share a common virtual world—such as a reconstruction of a historic site that no longer exists. Finally, in hazardous operations, distributed collaboration is a central feature of humans and telerobots working together in the same remote environment.
The discussion in this section focuses on the increasingly collaborative nature of modern business and the potential contribution of SE technology to facilitating this collaboration in a cost-effective manner. Specifically, we discuss telecommunication and teletravel. Both of these processes use technology to reduce unproductive travel time to and from meeting sites or regular work sites.
Today, many of those who are knowledge workers already use technology to avoid travel. The nature of a knowledge worker's job is such that by working at home or in satellite locations using personal computers, modems, and the telephone network, these telecommuting workers can perform their jobs reasonably well. Greater understanding and acceptance of this phenomenon in the workplace is illustrated by the response of the Los Angeles work force to the earthquake of January 1994. In the aftermath of the disaster and the consequent disruption of customary commuter routes, telecommuting increased dramatically. However, most workers in the United States do not have jobs that can be performed using only a screen, a keyboard, and a mouse. Most sales people must interact face-to-face, and others, such as craftspeople, work on solid objects with their eyes and hands. Even telecommuting knowledge workers need face-to-face meetings for discussions involving more than two people, job interviews, and many other work situations in which gestures, facial expression, and eye contact are critical components of the interchange. These added task requirements open the door to the next step in the use of VE technology.
The historical evolution of distributed collaboration provides a useful context for this discussion. Examining the paths followed in earlier research and system development reveals both the current robust status of telecommunication facilities and the potential consequences of expanding such facilities to include VE capabilities.
Distributed collaboration first emerged as a product of advanced technology in the form of multiperson telephone conversations. Such services were provided by AT&T during the 1950s with economic benefits for both users and providers. Routine use of this technological capability came in the 1960s, along with expansion of the concept to include secure conferences between remote participants. A separate and special technology, teleconferencing, came to be institutionalized as a consequence. Several major systems were developed at this time to serve the needs of the federal government. Among such systems were the first models of AUTOSEVOCOM, a secure network that could support remote conferences between top-level military commanders from their stations around the world (Sinaiko and Belden, 1965). This system also cemented the integration of computers into the multistation communication network. In these early instances, the computer was used only as a switching device and a tool for signal encryption. However, its presence in the system was a definite harbinger of things to come.
In addition, the design effort for military systems set off programs of research intended to explore the potentials, both positive and negative, of computer-mediated communication. For example, studies were initiated to determine the feasibility of using a computer-mediated network to link North Atlantic Treaty Organization member heads of state for purposes of joint crisis resolution. Problems ranging from how to implement rules of diplomatic protocol to overcoming language barriers were explored. The outcomes of these research programs revealed some of the limitations on communication effectiveness imposed by an absence of visual information to augment the direct voice transmissions.
Digitization, packet switching, and optical fibers, among other technological advances, began to open new vistas in the 1970s. It was then that the first instances of teletravel began to appear (Fordyce, 1974; Craig, 1980), after having been forecast several years previously. Entire new areas of economic advantage began to be apparent, such as possible savings in gasoline use and reductions in pollution. Negatives, such as lowered productivity due to a lack of supervisory presence, were played down by early enthusiasts but have come to be treated as significant matters in present-day applications. In any case, telecommuting, as a form of distributed collaboration, has become an accepted option for some workers in some organizations (Shirazi, 1991).
The other critical ingredient in the evolution of distributed collaboration was the rapid adoption of small but powerful computers by workers in many different occupations. Computerized networks began to become widespread in the 1980s. In a sense, the computer became an actual participant in multiperson collaborations that were performed on the network. The computer provided an information storage and retrieval capability
that far exceeded what the humans could contribute. Also, the computer could provide a dynamic color graphics capability that is not available by any other means. Object-oriented collaborations, such as designing electronic circuitry, have been quite successful when these technologies have been employed (Sheridan, 1993a; Fanning and Raphael, 1986).
In summary, distributed collaboration is not a particularly new idea. Working in this manner has gradually expanded over the past four decades, as people have accustomed themselves to the concept and its ramifications and as the technology has progressed to a point at which it supports new modes of activity at affordable costs. Now the question becomes: What can or will VE add to the process? Will VE provide the means to take a few more incremental steps in the further expansion of distributed collaboration—or will VE provide the basis for major change in how the concept is actualized?
Teletravel and Virtual Environments
VE offers the possibility of participating in a meeting in which all the other attendees are present in the form of virtual images. Each participant in the virtual meeting would see and hear the other participants through lightweight, see-through VE goggles that resemble eyeglasses, while his or her own appearance is captured by a video camera for broadcast to all the others. Different communications channels would support both group communications and communications to selected individuals—the equivalent of whispering to someone.
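The distinction between group communications and whispering amounts to a simple message-routing rule, which the following minimal sketch makes concrete. The class and method names are invented for illustration and are not drawn from any actual conferencing system.

```python
class VirtualMeeting:
    """Toy routing model for a multiperson virtual meeting:
    'say' is delivered to every other participant, while 'whisper'
    reaches only the named recipient."""

    def __init__(self, participants):
        # One inbox of (sender, text) messages per participant.
        self.inboxes = {name: [] for name in participants}

    def say(self, sender, text):
        # Group channel: broadcast to everyone except the speaker.
        for name, inbox in self.inboxes.items():
            if name != sender:
                inbox.append((sender, text))

    def whisper(self, sender, recipient, text):
        # Private channel: delivered to a single participant.
        self.inboxes[recipient].append((sender, text))

meeting = VirtualMeeting(["ann", "bob", "carol"])
meeting.say("ann", "shall we review the design?")
meeting.whisper("bob", "carol", "I have concerns about the schedule")
```

In a real system each delivery would also carry the spatial information (speaker position, facial image) needed to render the message's source in the shared world; here only the routing logic is shown.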
The social feasibility of virtual meetings goes beyond the technology. The enormous difficulty of scheduling conference phone calls for more than four busy people suggests that a new set of social norms would be needed before virtual meetings could be called routinely. For example, people would have to feel that it was unacceptable to remain in a virtual meeting and attend to other business simultaneously.
Shared workspaces refer to the real or virtual gathering of people at a specific location for the purpose of interacting with an artifact or an object. With the appropriate technology for automatic model generation, any physical space—and the relevant artifacts and objects—could be turned into a virtual meeting place. Thus a group could travel to any location where sensors existed and meet while observing events taking place in that real-world location.
VE also offers the possibility that one could be part of collaborative communities at a distance. For example, Xerox PARC is currently experimenting with technologies that virtually bring together people working in different offices. Although the current effort is limited to small-screen
video and audio, VE-based collaborative communities could offer illusions of physical presence so real that offices of collaborators and colleagues could be geographically dispersed much more than they are today.
To facilitate the illusion of shared office spaces, office workers might wear glasses that could give them the feeling that their offices were part of a much larger common area in which many other participants were present. Each participant would see the other participants sitting at desks in their own offices.
In situations in which a traveler is unable to go physically to the target site, virtual travel may be useful. An example would be to enable inmates of correctional facilities to hold jobs or receive training in the outside world while still being physically controlled and monitored. Police might travel virtually to the middle of a jail riot to gather further intelligence. Researchers, regulators, and site planners might meet virtually in a hazardous environment, such as a nuclear dumping site, to examine conditions and plan future operations. It is conceivable that such close observation and review of site conditions could provide superior input to planning compared with video or still pictures.
Teleoperation and Remote Access
Teleoperated systems controlled through a VE interface would enable an individual or a group to go beyond mere passive observation. For example, a group on a virtual visit to a nuclear power plant could be authorized to open and shut valves and make other changes to the physical status of the plant. Since the valves and other actuators in such a plant are electronically controlled from a control panel, it is not too far-fetched to imagine the sensor data and control of the operation of the plant being part of the virtual world inhabited by the plant's human operators. The operators might be more aware of the status of the plant in emergencies if they could virtually travel through its radioactive corridors.
Any device that was connected to the communications grid could be controlled by a virtually present human. A home security system, for example, could summon home its owner when an alarm went off. (For an example based in telecommunications prior to VE technology, see Taylor, 1980.) With appropriate cameras and sensors, the owner could travel through the house (virtually) to see if intruders were present. With certain actuators present in the home and linked to the network, the owner might either drive off to escape confrontation or try to capture the intruders in the home.
A very general means of making changes to the world would be for a distant person to occupy a telerobot to perform some task. Discussions of
telerobotics tend to treat the link between human operator and telerobot as semipermanent. But if telerobots were common, they might be treated more like telephones—that is, known locations into which one could project one's eyes and hands.
A key aspect of a virtual meeting is the ability to see body movements and facial expressions, a feature difficult to achieve with current video conferencing systems. In real meetings in real places, the participants perceive themselves to be in a place, surrounded by its walls. They are able to observe the positions of other participants within that space and hear their voices coming from specific directions. These perceptual possibilities are not available from video telecommunication systems. Therefore, the use of directional sound in virtual meetings will be especially important. With as few as two pairs of people having simultaneous independent conversations, the conversations will be disrupted without the ability of the hearers to filter out the unwanted sounds. This is done in real life (the "cocktail party effect") by the human ability to selectively filter sounds based on their directionality, and the fact that sounds from more distant sources are less loud. Both of these properties of real sounds can be supported in shared virtual worlds.
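The two auditory cues named above, directionality and distance attenuation, can be expressed as per-source stereo gains. The sketch below uses a constant-power pan and a 1/distance falloff; this is a deliberate simplification (real spatial sound in VE systems relies on head-related transfer functions), and all names and values are illustrative.

```python
import math

def stereo_gains(azimuth_deg, distance):
    """Compute (left, right) amplitude gains for one talker.
    azimuth_deg: -90 = hard left, 0 = straight ahead, +90 = hard right.
    Amplitude falls off as 1/distance (clamped at 1 m)."""
    attenuation = 1.0 / max(distance, 1.0)
    # Map azimuth onto a pan angle in [0, pi/2] for a constant-power pan:
    # left^2 + right^2 stays constant as the source moves across the field.
    pan = (azimuth_deg + 90.0) / 180.0 * (math.pi / 2.0)
    left = attenuation * math.cos(pan)
    right = attenuation * math.sin(pan)
    return left, right

# A nearby talker to the left should be louder overall, and louder in
# the left ear, than a distant talker to the right.
near_left = stereo_gains(-45.0, 1.0)
far_right = stereo_gains(45.0, 4.0)
```

Applying such gains independently to each talker's audio stream gives listeners the directional and distance cues they need to separate simultaneous conversations, as in the cocktail party effect described above.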
To capture an image of a user's facial expression while allowing the user to view the shared virtual world, several display methods are available. One way to do it is for the user to wear a see-through HMD, resembling eyeglasses. A video camera could be trained on the user's face. The see-through capability of the HMD in this case is primarily useful for allowing the camera to see through to the user's face, rather than for allowing the user to see the real surroundings.
Another viewing method is to put the user in an immersive viewing station, similar to the CAVE, a room whose walls, ceiling, and floor surround a viewer with projected images (Cruz-Neira et al., 1992). Since the CAVE uses polarized glasses for stereo, a camera is needed that can see through to collect images of both eyes of the user through the polarization. Both of these viewing setups are adequate to allow a group of people, each at a viewing station, to see and be seen by the other members of a group of people occupying a virtual meeting space.
The two-dimensional image of face and body that would be collected by a normal video camera is adequate but not optimal. The two-dimensional face texture could be mapped onto a virtual mannequin representing the person in the virtual world, and the same could be done for the two-dimensional body image. This would provide a very flat person to the virtual world, but it would still have advantages over video telecommunication,
which does not show the locations of the different participants very effectively.
A more elaborate body-image capture method would be to use range-imaging techniques to acquire a three-dimensional model of the body and face. Such automatic three-dimensional model acquisition is needed by other branches of the SE field, and various prototype range-imaging systems exist. With a three-dimensional image of the body of each participant, a virtual meeting could begin to approach the perceptual feel of being physically present at a real meeting.
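The geometry underlying range imaging is the standard pinhole camera model: each depth sample back-projects to a three-dimensional point via x = (u - cx)z/fx, y = (v - cy)z/fy. A minimal sketch follows; the intrinsic parameters (fx, fy, cx, cy) are illustrative values, not those of any particular sensor.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (rows of z values in meters) into a
    list of (x, y, z) points using the pinhole camera model.
    fx, fy: focal lengths in pixels; cx, cy: principal point."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:  # a zero depth value means no range return
                points.append(((u - cx) * z / fx,
                               (v - cy) * z / fy,
                               z))
    return points

# A 2 x 2 depth map at 1 m depth, viewed by a unit-focal-length camera
# whose principal point sits at the center of the image.
pts = depth_to_points([[1.0, 1.0], [1.0, 0.0]],
                      fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

A point set acquired this way could then carry the camera's color texture, giving each meeting participant a three-dimensional body image rather than the flat mannequin described above.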
These technologies, though still immature, offer the possibility of electronically projecting oneself, as easily as one currently makes telephone calls, into virtual worlds inhabited by other distant human users, with whom one can have face-to-face interactions both one-on-one and in groups. These shared multiperson virtual worlds create a shared space, in which each human participant has a position, a body image resembling his or her own real appearance, and a viewpoint from which to observe the behaviors and facial expressions of the other people engaged in the transaction.