Setting a Common Research Agenda
The entertainment industry and the U.S. Department of Defense (DOD) are both interested in a number of research areas relevant to modeling and simulation technology. Technologies such as those for immersive simulated environments, networked simulation, standards for interoperability, computer-generated characters, and tools for creating simulated environments are used in both entertainment and defense applications. Each of these areas presents a number of research challenges that members of the entertainment and defense research communities will need to address over the next several years. Some of these areas may be amenable to collaborative or complementary efforts.
This chapter discusses some of the broad technical areas that the defense and entertainment research communities might begin to explore more fully to improve the scientific and technological base for modeling and simulation. Its purpose is not to provide answers to the research questions posed in these areas but to help elucidate the types of problems the entertainment industry and DOD will address in the coming years.
Technologies for Immersive Simulated Environments1
Immersive simulated environments are central to the goals and needs of both the DOD and the entertainment industry. Such environments use a variety of virtual reality (VR) technologies to enable users to interact directly with modeling and simulation systems in an experiential fashion, sensing a range of visual, auditory, and tactile cues and manipulating objects directly with their hands or voice. In such experiential computing systems, the user interacts with a computer or a network of computers through an interface that is experiential rather than cognitive. If a user has to think about the user interface, it is already in the way. Traditional military training systems are experiential computing systems applied to a training problem.
VR technologies can allow people to directly perform tasks and experiments much as they would in the real world. As Jack Thorpe of SAIC pointed out at the workshop, people often learn more by doing and understand more by experiencing than by simple nonparticipatory viewing or hearing information. This is why VR is so appealing to user interface researchers: it provides experience without forcing users to travel through time or space, face physical risks, or violate the laws of physics or rules of engagement. Unfortunately, creating effective experiences with virtual environments is difficult and often expensive. It requires advanced image generators and displays, trackers, input devices, and software.
Experiential Computing in DOD
The most prominent use of experiential computing technology in DOD is in personnel training systems for aircraft and ground vehicles. DOD also has a series of initiatives under way to develop advanced training systems for dismounted infantry that rely on experiential computing. Such programs are gaining increased attention in DOD and will become a primary driver behind the military's efforts to develop and deploy technologies for immersion in synthetic environments. They are being undertaken in coordination with attempts to develop computing, communications, and sensor systems to provide individual soldiers with relevant intelligence information.2 Experiential computing, as applied to flight and tank simulation, is a mature science at DOD. A number of organizations have extensive historical reference information they can draw on in specifying the requirements for new immersive training systems, including the U.S. Army's Simulation, Training, and Instrumentation Command (STRICOM) and the Naval Air Warfare Center's Training Systems Division. Experiential computing has been essential to military training organizations for decades.
For traditional training and mission rehearsal functions, the current need is to reduce the cost of immersive systems. Existing mission rehearsal systems based on image generators like the Evans and Sutherland ESIG-4000 serve the Army's Special Operations Forces well, allowing them to fly at low altitudes above high-resolution geo-specific terrain for hundreds of miles and enabling them to identify specific landmarks along their planned flight path to guide them on their actual mission. Unfortunately, these dome-oriented trainers have cost upward of $30 million, making it impractical to procure many simulators or to train many pilots. Cost reductions would allow more widespread deployment of such systems.
Experiential computing technologies are being used by the U.S. Navy for both training and enhanced visualization. Aboard battleships, an advanced battle damage reporting system allows a seaman on the battle bridge to navigate a three-dimensional (3D) model of the ship to identify where damage has occurred, to locate the best escape routes for trapped seamen, and to determine which routes the rescue and repair crews should take. In another Navy application, developed at the Naval Command Control and Ocean Surveillance Center's (NCCOSC's) Research, Development, Test, and Evaluation Division (referred to as NRaD), submarines are fitted with an immersive system that generates a view of the outside world for the commander while the boat is submerged. Since submarine crews cannot normally look outside the boat except when it is on the surface, this virtual window provides not only a view of the seafloor (created from digital bathymetric data) but of the tactical environment as well, with other ships, submarines, sonobuoys, and sea life represented clearly and spatially so that the commander can gain a better understanding of the tactical and navigational situation.
In the nonimmersive domain, experiential computing technology is being leveraged by both the Naval Research Lab (NRL) and the Army Research Lab (ARL) in the form of a stereoscopic table-based display. This display is known at NRL as the Responsive Workbench and at ARL as the Virtual Sandtable. The Responsive Workbench was invented in 1992 at the German National Computer Science and Mathematics Institute outside Bonn. NRL duplicated the bench and started exploring how it could be used in a variety of applications. The concept of the workbench is simple. The bench itself is a table 6 feet long, 4 feet wide, and standing 4 feet off the floor. The tabletop is translucent, and a mirror sits underneath at a 45 degree angle. A projector behind the table shines on the mirror and up onto the table surface from below, creating a stereoscopic image on the tabletop. Users wear stereoscopic glasses and a head tracker. As they move their heads, the image changes to reflect that motion and objects appear to be sitting, like a physical model, on the table.
An Army application of this technology is a re-creation of the traditional sand table, on which forces are laid out and moved around to plan strategies and tactics or to review a training exercise. Coryphaeus Software of Los Gatos, California, is commercializing a similar product, the Advanced Tactical Visualization System, which operates with the commercial version of the Responsive Workbench, the Immersive Workbench by Fakespace Inc. Since commanders are used to working with scale models of battlefields and maps, they can easily accommodate this type of display.
Experiential Computing in the Entertainment Industry
Creating effective experiential computing systems is challenging because they demand real-time graphics, and in the entertainment industry return on investment must be considered. The high cost of immersive technologies has slowed their expansion into entertainment settings. Nevertheless, an increasing number of location-based entertainment attractions and home systems are emerging. The majority of the systems in operation fall into one of three categories: (1) arcade systems, (2) location-based entertainment centers, and (3) VR attractions at theme parks. Location-based entertainment centers and arcades offer stand-alone systems that allow participants to drive down a race course, ski down a mountain, or play virtual golf; others have networked together flight simulators that allow players to interactively fly through a virtual environment and engage targets (including each other). Disney has developed a VR attraction based on its film Aladdin, and Universal Studios has developed a ride based on Back to the Future.
Now that the costs of real-time graphics systems are dropping, it is likely that the list of VR experiences for entertainment will expand and that home applications will become more prevalent. Three-dimensional graphics are becoming more widely available on home computers, and the number and variety of peripheral devices, such as throttle-like joysticks and mock-ups of fighter cockpits, are expanding. Continued reductions in cost coupled with increases in capability will likely stimulate further expansion of the home market.
Several areas of experiential computing would benefit from additional research. Much of this work would be applicable to both defense and entertainment applications of experiential computing technology. Technologies for image generation, tracking, perambulation, and virtual presence are of interest to both communities, but research priorities tend to be very different. As an example, the factors guiding development of the microprocessors that form the heart of the new Nintendo 64 game machine are very different from those DOD would have set had it been specifying a deployable, low-cost, real-time simulation and training device. The Nintendo system, for instance, was designed for operation in conjunction with a television and uses an interlaced scanning technique and low-resolution graphics. Most training systems would require higher resolution, to enable participants to identify specific features of the environment more easily and to avoid eye strain during periods of extended use, and would likely use a progressive scan system similar to most computer monitors. Thus, for military purposes it might be possible to leverage a variant of the Nintendo 64 processor, but the actual processor would probably not do the job.
Visual simulations in defense and entertainment applications share a common need for image generators with a range of capabilities and costs. On the entertainment side, low-cost platforms such as personal computers (PCs) and game boxes, such as those manufactured by Sega or Nintendo, underlie the video games industry. PCs also serve as the primary point of entry to the Internet and therefore are critical to companies providing on-line entertainment, whether through so-called chat rooms or multiplayer games. Larger location-based entertainment centers, such as the flight simulator centers operated by Virtual World Entertainment and the Magic Edge, also are interested in moving away from workstation-based simulators to PC-based simulators as a means of reducing operating costs.
Image generation has long benefited from close linkages between the commercial and defense industries. From its early roots at Evans and Sutherland (E&S) and GE Aerospace, the image generator industry responded largely to defense needs because volumes were low and prices high, typically in the millions of dollars. The high cost limited the use of such simulators outside DOD. Nevertheless, the E&S CT5 (circa 1983) and the GE Compuscene 4 Computer Image generators were benchmarks by which all interactive computer graphics systems were measured for years.
At about the same time, interactive 3D graphics began to migrate into commercial applications. Stanford University Professor James Clark and seven of his graduate students founded Silicon Graphics Inc. to bring real-time graphics to a broad range of applications. Other companies soon followed, creating the now-pervasive commercial market for real-time 3D graphics. As a result, image generation capabilities that cost over $1 million in 1990 are now available on the desktop for one-one thousandth (1/1,000) of that price: a drop of three orders of magnitude in less than a decade. This improvement in price/performance ratios results from both technological advances and a related growth in demand for 3D graphics. By driving up production volumes, increased demand has lowered costs significantly, and the entrance of new competitors into the market has accelerated the pace of innovation and resulted in further declines in cost. As real-time 3D becomes a commodity, the true cost of image generation is shifting to software: the time and resources required to model virtual worlds.
As commercial systems become more capable, more opportunities will exist for DOD and the entertainment industry to work together on image generation capabilities, coupling fidelity with the lower costs that stem from producing larger volumes. A number of existing and emerging technologies could potentially be used for DOD training applications. Low-cost 3D image generators exist that can support robust dynamic 3D environments. These range from game machines such as Nintendo 64 to low-cost graphics boards for PCs manufactured by companies such as 3Dfx and Lockheed Martin.
Improvements in low-cost image generators depend on advances in six underlying technologies: processors, 3D graphics boards, communications bandwidth, storage, operating systems, and graphics software. The commercial computer industry will play the leading role in bringing such technologies to market but will continue to draw from a larger national technology base created by both public and private research programs. Advances in high-end DOD systems may create capabilities that can be used in less expensive systems. Processing power continues to increase with each new generation of microprocessors. Current microprocessors operate at speeds of 200 megahertz or more, and many include multiprocessor logic that allows several (typically four to eight) processors to work together on a common problem. In the area of 3D graphics boards, some 30 to 40 companies currently offer boards for PCs. As a result, David Clark of Intel Corporation predicts that the performance of graphics chips (the number of polygons generated per second) may double every nine months, twice as fast as processors are improving. Inexpensive chips will soon be able to generate upward of 50 million textured pixels per second. New communications architectures for PC graphics, such as Intel's accelerated graphics port architecture, will enable over 500 megabytes per second of sustained bandwidth, allowing designers to transfer texture maps rapidly from main memory and thus keeping the cost of 3D graphics low. Because of such advances, producers of PC hardware and software see 3D graphics as a growing application area and are moving quickly to commercialize 3D graphics technology. Both Windows NT and UNIX operating systems support PC-based graphics, and a number of software vendors are porting their applications from the workstation to the PC environment.
Multigen Inc. has announced that it is making products available for Windows NT systems; Gemini Corporation has ported the Gemini Visualization System. Microsoft Corporation's purchase of Softimage, manufacturer of high-end graphics creation software used by both DOD and the entertainment industry, promises to accelerate the graphics capabilities of PCs.
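The growth rates cited above can be made concrete with a quick projection. The sketch below takes the figures quoted in the text (graphics chips doubling in performance every nine months, processors roughly every eighteen) and indexes both at 1.0; the starting values and the five-year horizon are purely illustrative.

```python
def projected_performance(start: float, months: int, doubling_months: float) -> float:
    """Project performance forward assuming a fixed doubling period."""
    return start * 2 ** (months / doubling_months)

# Hypothetical starting points: graphics chips and processors, both indexed at 1.0.
years = 5
months = years * 12
gpu_growth = projected_performance(1.0, months, 9)    # doubles every 9 months
cpu_growth = projected_performance(1.0, months, 18)   # doubles every 18 months

print(f"Graphics chips after {years} years: {gpu_growth:.0f}x")
print(f"Processors after {years} years: {cpu_growth:.0f}x")
```

At these rates, graphics chips would improve roughly a hundredfold over five years while processors improve about tenfold, which is why the report expects 3D image generation to become a commodity.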
One area that has seen insufficient innovation in the past decade, position and orientation tracking, continues to hamper advanced development in experiential computing. Today's tracking systems include optical, magnetic, and acoustic systems. The most popular trackers are AC and DC magnetic systems from, respectively, Polhemus Corporation and Ascension Technologies. These systems have fairly high latency, marginal accuracy, moderate noise levels, and limited range. New untethered tracking systems from Ascension reduce the intrusiveness of being wired up but still require the user to wear a large magnet.
Tracking remains a barrier to free-roaming experiences in virtual environments. To meet the goals of the U.S. Army's STRICOM for training dismounted infantry, long tracker range, resistance to environmental effects from light and sound, and minimal intrusion are key to assuring that the tracking does not get in the way of effective training (see position paper by Traci Jones in Appendix D). Similar requirements were expressed at the workshop by Scott Watson of Walt Disney Imagineering. Magnetic tracking is currently used for detecting head position and orientation in Disney's Aladdin experience and other attractions, despite the fact that the latency of such systems is roughly 100 milliseconds, long enough to contribute to symptoms of simulator sickness.3
As the performance of graphics engines rendering virtual environments increases, the proportional effect of tracker lag is increased. Some optical-based trackers are currently yielding good results but have some problems with excessive weight and directional and environmental sensitivity. Experiments with novel tracking technologies based on tiny lasers are showing promise, but much more work needs to be done before untethered long-range trackers with six degrees of freedom are broadly available in the commercial domain.
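One common way to mitigate tracker lag is to predict the user's pose forward by the measured latency, so the rendered view matches where the head will be rather than where it was. The sketch below uses simple linear extrapolation from the last two tracker samples; the sample values are hypothetical, and the 100-millisecond figure echoes the magnetic-tracker latency cited above. Production systems typically use more sophisticated filters.

```python
def predict_position(samples, latency):
    """Linearly extrapolate the newest tracker sample forward by `latency`
    seconds, using the velocity estimated from the last two samples.
    `samples` is a list of (timestamp_s, position) pairs, oldest first."""
    (t0, p0), (t1, p1) = samples[-2], samples[-1]
    velocity = (p1 - p0) / (t1 - t0)
    return p1 + velocity * latency

# Hypothetical head positions along one axis, sampled at roughly 60 Hz.
samples = [(0.000, 10.0), (0.0167, 10.5)]
# Compensate for ~100 ms of tracker latency.
predicted = predict_position(samples, 0.100)
print(f"predicted position: {predicted:.2f}")
```

Linear prediction reduces perceived lag during smooth motion but overshoots on sudden reversals, which is one reason lower raw latency remains preferable to prediction alone.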
While untethering the tracker is the current next-step goal, the ideal tracker would be not only untethered but also unobtrusive. Any device that must be worn or held intrudes on the personal space of the individual. All current tracking systems suffer from this problem except for some limited-functionality video tracking systems.
Video recognition systems are typical examples of unobtrusive trackers, allowing users to be tracked without requiring them to wear anything (except for the University of North Carolina video tracker, which actually had users wear cameras!). While this approach is ideal, it is difficult to implement effectively and thus has seen only limited application. Some examples include Myron Krueger's VideoPlace and Vincent John Vincent's Mandala system.
Improved technologies are also necessary for supporting perambulation in virtual environments. The U.S. Army's STRICOM has funded the development of an omni-directional treadmill to explore issues associated with implementing perambulation in virtual environments, a topic that is applicable to entertainment applications of VR as well. Allowing participants in a virtual environment to wander around, explore, and become part of a story would greatly enhance the entertainment value of the attraction. It would also enable residents of a particular neighborhood to wander around synthetic re-creations of their neighborhoods to see how a proposed development nearby would affect their area, from a natural perspective and with a natural user interface. Research is needed to improve current designs and to create perambulatory interfaces that allow users to fully explore a virtual environment with floors of different textures, lumps, hills, obstructions, and other elements that cannot easily be simulated using a treadmill.
Technologies for Virtual Presence4
Virtual presence is the subjective sense of being physically present in one environment when actually present in another environment.5 Researchers in VR have hypothesized the importance of inducing a feeling of presence in individuals experiencing virtual environments if they are to perform their intended tasks effectively. Creating this sense of presence is not well understood at this time, but among its potential benefits may be (1) providing the specific cues required for task performance, (2) motivating participants to perform to the best of their abilities, and (3) providing an overall experience similar enough to the real world that it elicits the conditioned or desired response while in the real world. Several technologies may contribute to virtual presence.
• Visual stimulus. This is the primary means to foster presence in most of today's simulators. However, because of insufficient consideration of the impact of granularity, texture, and style in graphics
rendering, the inherent capability of the available hardware is not utilized to the greatest effect. One potential area of collaboration could be to investigate the concepts of visual stimulus requirements and the various design approaches to improve graphics-rendering devices to satisfy these requirements.
• Hearing and 3D sound. DOD has initiated numerous efforts to improve 3D sound techniques, but 3D sound has not yet been used effectively in military simulations. Providing more realistic sound in a synthetic environment can improve the fidelity of the sensory cues perceived by participants in a simulation and help them forget they are in a virtual simulated environment.
• Olfactory stimulus. Smell can contribute to task performance in certain situations and to a full sense of presence in a synthetic environment. Certain distinctive smells serve as cues for task initiation: the smell of a smoldering electrical fire, for example, can trigger appropriate concerns in individuals participating in a training simulator. In addition, smells such as that of hydraulic fluid can enhance a synthetic environment to the extent of creating a sense of danger.
• Vibrotactile and electrotactile displays. Another sense that can be involved to create an enhanced synthetic environment is touch and feel. Current simulator design has concentrated on moving the entire training platform while often ignoring the importance of surface temperature and vibration in creating a realistic environment.
• Coherent stimuli. One area that has not received much research is the coherent application of the above-listed stimuli to create an enhanced synthetic environment. Although each stimulus may be valid in isolation, the real challenge is combining stimuli at the correct level and intensity.
Part of making a simulated experience engaging and realistic has nothing to do with the fidelity of the simulation or the technological feats involved in producing high-resolution graphics and science-based modeling of objects and their interactions. These qualities are certainly important, but they must be accompanied by skilled storytelling techniques that help participants in a virtual environment sense that they are in a real environment and behave accordingly. "The problem we are trying to solve here is not exactly a problem of simulation," stated Danny Hillis at the workshop. "It is a problem of stimulation." The problem is to use the simulation experience to help participants learn to make the right decisions and take the right actions.
The entertainment industry has considerable experience in creating simulated experiences, such as films and games, that engage participants and enable them to suspend their disbelief about the reality of the scenario. These techniques involve methods of storytelling: developing an engaging story and using technical and nontechnical mechanisms to reinforce its emotional aspects. As Danny Hillis observed:
If you want to make somebody frightened, it is not sufficient to show them a frightening picture. You have to spend a lot of time setting them up with the right music, with cues, with camera angles, things like that, so that you are emotionally preparing them, cueing them, getting them ready to be frightened so that when you put that frightening picture up, they are startled.
Understanding such techniques will become increasingly important in applications of modeling and simulation in both DOD and the entertainment industry. Alex Seiden of Industrial Light and Magic observed at the workshop that "any art, particularly film, succeeds when the audience forgets itself and is transported into another world." The technology used to create the simulation (such as special effects for films) must serve the story and be driven by it.
DOD recognizes the importance of storytelling in its large-scale simulations. Judith Dahmann of DMSO noted that DOD prepares participants for simulations by laying out the scenario in terms of the starting conditions: Who is the enemy? What is the situation? What resources are available? However, DOD may be able to learn additional lessons from the entertainment industry regarding the types of sensory cues that can help engender the desired emotional response.
One of the primary issues that must be considered in both entertainment and defense applications of modeling and simulation technology is achieving the desired level of fidelity. How closely must simulators mimic the behavior of real systems in order to make them useful training devices? Designing systems that provide high levels of fidelity can be prohibitively costly, and, as discussed above, the additional levels of fidelity may not greatly improve the simulated experience. As a result, simulation designers often employ a technique called selective fidelity in which they concentrate resources on improving the fidelity of those parts of a simulation that will have the greatest effect on a participant's experience and accept lower levels of fidelity in other parts of the simulation.
Developers of DOD's Simulator Networking (SIMNET) system, a distributed system for real-time simulation of battle engagements and war games, recognized that they could not fool trainees into actually believing they were in tanks in battle and put their resources where they thought they would do the most good.6 They adopted an approach of selective fidelity in which only the details that proved to be important in shaping behavior would be replicated. Success was measured as the degree to which trainees' behavior resembled that of actual tank crews. As a result, the inside of the SIMNET simulator has only a minimal number of dials and gauges; emphasis was placed on providing sound and the low-frequency rumble of the tank, delivered directly to the driver's seat to create the sense of driving over uneven terrain. Though users initially reported dismay at the apparent lack of fidelity, they accepted the simulator and found it highly realistic after interacting with it.7
The entertainment industry has considerable experience in developing systems that use selective fidelity to create believable experiences that minimize costs. Game developers constantly strive to produce realistic games at prices appropriate for the consumer market. They do so by concentrating resources on those parts of their games most important to the simulation. After realizing that game players spent little time looking at the controls in a flight simulator, for example, Spectrum HoloByte shifted resources to improving the fidelity of the view out the window.8 Experiments have shown that even in higher-fidelity systems the experience can be improved by telling a preimmersion background story and by giving participants concrete goals to perform in virtual environments.9
Selective fidelity is important in both defense and entertainment simulations, though it can be applied somewhat differently in each domain to reflect the importance given to different elements of the simulation. For DOD, selective fidelity is typically used to ensure realistic interactions between and performance of simulated entities, sometimes at the expense of visual fidelity. Hence a DOD simulation might have a radar system with performance that degrades in clouds and rain or an antitank round that inflicts damage consistent with the kind of armor on the target, but it might use relatively primitive images of tanks and airplanes if they are not central to the simulation. The entertainment industry tends to place greater emphasis on visual realism, attempting to make simulated objects look real, while relaxing the fidelity of motions and interactions. An entertainment simulation is more likely to use tanks that look real, but that do not behave exactly like real tanks: their motion may not slow when they travel through mud, or their armor may not be thinner in certain places than in others.
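Selective fidelity can be caricatured as a budget-allocation problem: given limited resources, fund the simulation components with the greatest payoff per unit cost for the intended audience. The sketch below is one plausible way to formalize that trade-off; the component names, payoff scores, and costs are entirely hypothetical, with a DOD-style weighting that favors interaction fidelity over visuals.

```python
def allocate_fidelity(components, budget):
    """Greedy sketch of selective fidelity: fund the components with the
    highest payoff per unit cost first, until the budget runs out.
    `components` maps name -> (payoff, cost)."""
    ranked = sorted(components,
                    key=lambda n: components[n][0] / components[n][1],
                    reverse=True)
    funded, spent = [], 0.0
    for name in ranked:
        cost = components[name][1]
        if spent + cost <= budget:
            funded.append(name)
            spent += cost
    return funded

# Hypothetical scores: a defense simulation values realistic interactions;
# an entertainment title would weight the visual components higher instead.
components = {
    "weapon effects model": (9.0, 3.0),
    "radar degradation in weather": (8.0, 2.0),
    "photorealistic tank models": (5.0, 5.0),
    "cockpit dial detail": (2.0, 4.0),
}
print(allocate_fidelity(components, budget=6.0))
```

Under this weighting, the budget goes to the interaction models and the visual components are left at lower fidelity, mirroring the SIMNET choices described above; reversing the payoff scores would reproduce the entertainment industry's emphasis instead.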
Such differences limit the ability of defense and entertainment systems to be used in both communities. For example, while many modern video games create seemingly realistic simulations, they do not necessarily model the real world accurately enough to meet defense requirements. Granted, there is a genre of video games that strive to be as realistic as
possible. Games like Back to Baghdad and EF2000 are popular in large part because they strive for high degrees of accuracy. In addition, because of the long lifetimes of some trainers at DOD, several modern video games far exceed the accuracy of some older operational simulators. But game designers must often break with reality in order to meet budgetary and technological constraints. As Scott Randolph of Spectrum HoloByte noted at the workshop,
There is always the tendency to do things like take an [infrared] sensor that is good for 10 miles and works best at night. But because you don't want to keep track of [the] time of day, you make it so it always works the same. It can't really see through a cloud or dust, but since there aren't very many clouds in the game and you don't want to keep track of dust, you very quickly end up with a sensor model that doesn't model the real sensor.
While the software in these applications may have the technical underpinnings to produce training devices, their primary goal is entertainment, not accuracy. Given the verification, validation, and accreditation requirements that must be met for DOD training applications and the profit expectations of the entertainment industry, it appears unlikely that a common software application could be written to meet the needs of both communities. This observation does not suggest that DOD and the entertainment industry cannot develop a common architecture or framework (such as network protocols and database formats) for simulation that both communities could use, as is described in the "Standards for Interoperability" section of this chapter.
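The sensor shortcut Randolph describes can be illustrated with two toy models of an infrared sensor nominally good for 10 miles: a game-style model that always behaves the same, and a training-style model that accounts for daylight and cloud cover. All function names, ranges, and degradation factors here are hypothetical, chosen only to echo the quoted example.

```python
def game_ir_range(time_of_day, cloud_cover):
    """Game-style shortcut: the sensor always works the same,
    regardless of time of day or obscurants."""
    return 10.0  # miles, always

def training_ir_range(time_of_day, cloud_cover):
    """Condition-aware sketch: an IR sensor nominally good for 10 miles
    that works best at night and is degraded by cloud or dust."""
    base = 10.0
    if 6 <= time_of_day < 18:       # daytime hours reduce thermal contrast
        base *= 0.6
    base *= (1.0 - cloud_cover)     # cloud_cover in [0, 1] attenuates range
    return base

# Clear night (2 a.m., no cloud): the models agree.
# Half-overcast afternoon: only the training-style model degrades.
print(game_ir_range(2, 0.0), training_ir_range(2, 0.0))
print(game_ir_range(13, 0.5), training_ir_range(13, 0.5))
```

The divergence between the two models under adverse conditions is precisely the gap that verification, validation, and accreditation requirements are meant to catch in a training device, and that a game can safely ignore.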
DOD and the entertainment industry may be able to benefit from their complementary approaches to selective fidelity. The entertainment industry, with support from other industries, will continue to pursue techniques for enhancing visual fidelity. Much of the basic research to support such efforts is being conducted in universities. The National Science Foundation, for example, is funding a Science and Technology Center for Computer Graphics and Scientific Visualization that includes participants from the computer graphics programs at Brown University, the California Institute of Technology, Cornell University, the University of North Carolina at Chapel Hill, and the University of Utah. The center has a long-term research mission (11 years) to help improve the scientific bases for the next generation of computer graphics environments (both hardware and software). Its research focuses on modeling, rendering, interaction, and performance. In contrast, DOD may have a greater incentive to explore ways of incorporating scientific and engineering principles into its simulations to enable entities to behave and interact more realistically. Once developed, techniques for fidelity may be able to be
shared between the two communities. One committee member stated that many game developers visit his lab seeking physics-based models for vehicles in a simulated environment.
Technologies for Networked Simulation
Networked simulation, which allows multiple participants connected to a common network (whether a local-area network or the Internet) to interact simultaneously with one another, is becoming increasingly important to both DOD and the entertainment industry. Both share a common need for adequate network infrastructure to support growing numbers of participants. DOD's goal is to develop a networked training environment in which military operations can be rehearsed with large numbers of participants while avoiding expenditures on fuel, machines, and travel. Participants can range into the thousands or tens of thousands and include soldiers at workstations with weapons-system-specific interfaces, soldiers at keyboards, or computer-generated forces that mimic human interaction. Such large-scale networked simulations are carefully planned and set up, just like actual military maneuvers with real equipment; they are coordinated with radios and include the full range of participants needed to support military operations.
For the games industry the goal of networked games is to provide a shared compelling entertainment experience for participants. Players in such networked systems are most often at the consoles of home computers or at location-based entertainment centers and are connected via local-area networks or the Internet. Internet-based games are an area of strong growth. Currently, Internet gaming supports multiplayer versions of existing computer games that have been modified to allow "Internetworking." Most use proprietary protocols to exchange information and can support interoperability among players using the same game. Though they are still at a simple stage, connecting only tens of players, such games are moving toward larger-scale connectivity. If the number of participants in networked games grows as large as DOD simulations (the targeted size of military simulations has increased by nearly two orders of magnitude over the past decade), new architectures may be required to keep the games from running so slowly that delays (or latency) become perceptible to players.
Both DOD and the entertainment industry anticipate large growth in the number of participants who engage simultaneously in networked
simulations. DOD has already demonstrated systems linking thousands of players and would like to link hundreds of thousands. Most networked games currently allow 8 to 32 players, but as the offerings expand, games could see hundreds or even thousands of networked participants.10 Such increases in scale pose a number of challenges that will need to be resolved. A larger number of participants implies an increase in the size and complexity of virtual worlds. This requirement, combined with the increased amount of information that must be exchanged among participants, places additional demands on available bandwidth. Greater network traffic means greater delays in delivering messages across the network unless improved ways of designing the network or distributing processing can be found.
Overcoming Bandwidth Limitations
The growth in the number of participants in networked simulations and the desire to share greater amounts of information place increasing demands on bandwidth and computational power of simulation and game systems. Attempts to overcome bandwidth limitations have tended to concentrate on one of two areas: (1) increasing the bandwidth available for networked simulations and (2) minimizing the demand for bandwidth made by networked simulations. Workshop participants agreed that there would be some value in bringing these two communities together to exchange implementation ideas and techniques.
To overcome bandwidth limitations, both the defense modeling and simulation community and providers of Internet-based games have attempted to develop or acquire greater bandwidth for their systems. DOD has constructed its own network, the Defense Simulation Internet (DSI), to allow simulation systems at distant sites to engage each other. The system in effect provides DOD users with a dedicated wide-area network that is logically separated from the Internet and keeps DOD messages free from other traffic on the Internet. Local-area networks tied to DSI are connected via a T-1 line, which allows high-speed transfers of data. As such, DSI joins the participating sites into what amounts to one large local-area network.
Networked game companies have also attempted to separate their message traffic from that of the Internet to improve reliability and expand available bandwidth. Such companies have typically negotiated with Internet service providers to pay for premium service with certain guarantees of available bandwidth in order to reduce network latency.
Improvements in the bandwidth available for networked simulation will continue to derive from advances outside both the defense modeling and simulation community and the entertainment industry. Telecommunications companies and Internet service providers continue to upgrade the capacity of their networks and network connections to provide higher-speed access and greater bandwidth. DOD will continue to expand the capacity of its networks for command, control, communications, computing, and intelligence (C4I) and the Defense Simulation Internet. To the extent that DOD's C4I community becomes more closely linked with the commercial communications industry, some of the existing constraints on bandwidth may be relaxed.11
Other attempts at overcoming bandwidth limitations have focused on using existing bandwidth more efficiently and reducing the amount of bandwidth demanded by networked simulations. Networked game companies, cognizant that most players access the Internet via 14.4- or 28.8-kilobits-per-second modem connections, are striving to customize their network data to reduce data transmission requirements while maintaining the entertainment value of their applications. Military simulation designers have paid relatively little attention to determining which data transmissions can be dispensed with while retaining acceptable reality at the application level. The military could leverage game developers' expertise in determining what data reduction might be achievable through techniques developed by the entertainment industry. DOD and the entertainment industry could also become more involved in work that is under way internationally in the computer and consumer electronics industries to develop image compression technologies and standards. To date, neither DOD nor the entertainment community has been heavily involved in MPEG-4.
DOD and the Internet community have pursued another possible solution to the bandwidth problem: multicast routing systems that incorporate software-based area-of-interest managers (AOIMs) to direct packets of information across a network to particular groups of listeners.12 Such systems allow any member of a group to transmit messages (containing text, voice, video, and imagery) to all other members of the group via a single transmission. 13 This approach prevents the sender from having to transmit individual copies of the message to all intended recipients, freeing resources for other purposes. Machines that are not part of the group ignore the packet at the network interface, eliminating any need for the central processing unit to read the packet. Proposed partitioning schemes are based on spatial (geographic groupings based on locality), temporal (e.g., real-time versus nonreal-time), and functional (e.g., voice communications, aircraft) characteristics. AOIMs distribute partitioning algorithms among hosts rather than rely on a central AOIM server. Used in conjunction with multicast routing, they can help minimize the amount of bandwidth needed to support networked simulations.
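The grouping idea behind spatially partitioned AOIMs can be illustrated with a small sketch. The grid scheme, class names, and radius parameter below are illustrative assumptions, not any fielded AOIM design; the point is that a sender transmits once per multicast group rather than once per recipient.

```python
# Illustrative sketch of spatial area-of-interest management (AOIM):
# the simulated terrain is partitioned into grid cells, each mapped to
# a multicast group, so a state update is sent once per group rather
# than once per recipient. Names and the grid scheme are hypothetical.

CELL_SIZE = 1000.0  # meters per grid cell (assumed)

def cell_for(x, y):
    """Map a position to the grid cell that names its multicast group."""
    return (int(x // CELL_SIZE), int(y // CELL_SIZE))

class AoimRouter:
    def __init__(self):
        self.groups = {}  # cell -> set of subscribed hosts

    def subscribe(self, host, x, y, radius_cells=1):
        """Join the multicast groups covering the host's area of interest."""
        cx, cy = cell_for(x, y)
        for dx in range(-radius_cells, radius_cells + 1):
            for dy in range(-radius_cells, radius_cells + 1):
                self.groups.setdefault((cx + dx, cy + dy), set()).add(host)

    def send(self, sender, x, y, payload):
        """One logical transmission reaches every interested host."""
        listeners = self.groups.get(cell_for(x, y), set()) - {sender}
        return [(host, payload) for host in listeners]

router = AoimRouter()
router.subscribe("alpha", 500.0, 500.0)    # interested around cell (0, 0)
router.subscribe("bravo", 9500.0, 9500.0)  # far away; different groups
# Only alpha receives the update; bravo's network interface never sees it.
assert router.send("charlie", 600.0, 400.0, "state update") == [("alpha", "state update")]
```

Hosts outside the sender's cell never receive the packet, which is what frees bandwidth and processing for other purposes.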
Work on multicast routing and AOIMs has been ongoing for several years. The Naval Postgraduate School has incorporated multicast into its NPSNET for experiments across the Internet. DMSO has also invested in the development of AOIM filters as part of its High-level Architecture (HLA), and the HLA's Run-Time Infrastructure (RTI) will soon support a comprehensive AOIM capability.14 The Internet Engineering Task Force (IETF) has also initiated an effort to implement multicast mechanisms to support distributed simulation. Its Large Scale Multicast Applications working group has developed documentation to describe how IETF multicast protocols, conference management protocols, transport protocols, and multicast routing protocols can support large-scale distributed simulations, such as DOD simulations containing 10,000 simultaneous groups and upward of 100,000 virtual entities.15
Nevertheless, additional research is needed to expand the capabilities of AOIMs beyond those of simple filters and to make them generalizable across problem domains. The IETF, for example, has identified seven areas in which existing Internet protocols are insufficient to support large-scale distributed simulation networks (see Table 2.1). Among these is the need to develop multicast protocols that can provide the quality of service needed for distributed simulation: different types of messages must be transmitted with different degrees of reliability and latency. In addition, research is needed to (1) help define a network software architecture that properly uses AOIMs; (2) determine how best to program AOIMs and how to generalize the concept of AOIMs so that as applications change new AOIMs can be downloaded to match the application; and (3) implement and test AOIMs on a wide-area network basis.
Latency is a major barrier to fast-action Internet games and to large-scale simulations generally. To make participants feel as though the system is responding in real time, designers of fast-action games typically try to keep the delay time between the moment players instruct their simulators to take certain actions (e.g., fire a gun, change direction) and the time the system generates the appropriate response to 33 milliseconds or less. Maintaining such latencies across large distributed networks, such as the Internet, is difficult. Electrical and optical signals transmitted across such networks can experience several types of delays. The fundamental limitation is the speed of light: signals cannot travel roundtrip in fiber optic cable between New York and San Francisco, for example, in less than about 54 milliseconds. But signals encounter additional delays as they travel through large networks: modems must format messages for transmission over the network; routers must determine
TABLE 2.1 Additional Capabilities Identified by the Internet Engineering Task Force That Are Needed to Support Multicast in Distributed Interactive Simulations
Resource reservation in production systems
The capability to reserve a specified amount of network bandwidth for a given simulation to help ensure that certain messages can be transmitted with higher levels of service. The proposed Resource Reservation Protocol is one candidate for this need.
Quality-of-service routing
A routing protocol that determines the paths of packets through the network based on the relative congestion of different pathways and the quality-of-service demands of different message types.
Multicast over wide-area networks
Multicast capabilities extended to different types of wide-area networks, such as those that use asynchronous transfer mode communications.
Transport protocols for varied service levels
A set of transmission protocols that can provide the range of quality-of-service and latency requirements of distributed interactive simulations, such as best-effort multicast of most data, reliable multicast of critical reference data, and low-latency reliable unicast of data among arbitrary members of a multicast group.
Network management for distributed systems
A protocol for managing network resources, such as the Simple Network Management Protocol used on the Internet.
Session protocols to start, pause, and stop
Procedures and protocols that facilitate coordinated starting, stopping, and pausing of large distributed exercises.
Security
Mechanisms to ensure the integrity, authenticity, and confidentiality of communications across the network.
SOURCE: Pullen, J.M., M. Myjak, and C. Bouwens. 1997. "Limitations of Internet Protocol Suite for Distributed Simulation in the Large Multicast Environment," a draft report of the Internet Engineering Task Force dated March 24; available on-line at ftp.ietf.org/internetdrafts/draft-ietf-lsma-limitations-01.txt.
how to send the messages through the network. Queuing delays that are due to congestion on large networks and packet losses (which require messages to be re-sent) can add significantly to the latency of distributed networks, especially public networks like the Internet that handle large volumes of traffic.
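The speed-of-light floor cited above can be checked with a back-of-envelope calculation. The route length used here is an assumption chosen to reflect deployed fiber paths, which are longer than the roughly 4,100 km great-circle distance between New York and San Francisco.

```python
# Back-of-envelope check of the propagation floor cited in the text.
# Figures are assumptions: an effective ~5,400 km one-way fiber route
# (real routes are longer than the great-circle distance) and light
# traveling through glass at roughly two-thirds its vacuum speed.

C_VACUUM = 299_792.458          # km/s
C_FIBER = C_VACUUM * 2 / 3      # ~200,000 km/s in fiber
ROUTE_KM = 5_400                # assumed one-way fiber path length

one_way_ms = ROUTE_KM / C_FIBER * 1000
round_trip_ms = 2 * one_way_ms
print(round(round_trip_ms))     # prints 54: the floor before any modem,
                                # router, or queuing delay is added
```

Everything a real network adds (modem formatting, route computation, queuing under congestion, retransmission of lost packets) sits on top of this irreducible propagation delay.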
As a result, even premium services over the Internet cannot generally meet the latency requirements for fast-action simulations conducted across a widely distributed network. According to Will Harvey of Sandcastle Inc., network latency for roundtrip coast-to-coast transmissions across the Internet will not drop below 100 to 130 milliseconds by the year 2000 (Figure 2.1).17 Fast-action simulations do not operate realistically with such latencies. Participants trying to dodge bullets may become frustrated because the response time is too slow for them to dodge, or they may feel cheated when the program displays their character as having dodged a bullet that in fact hit.
Latencies are highly dependent on the architecture of distributed networks. In a lockstep architecture (Figure 2.2) each individual machine controls all objects locally and broadcasts changes to the other machines via a central server. If a player pushes a button to make his or her character jump, for example, the player's own machine will update the position of the character and send a message to the server indicating that the jump button was pressed. The server will broadcast a message to other players' machines that the first player pressed the jump button, and those players' machines will update the first player's position accordingly. The simulation advances one cycle when each machine has received a complete set of user input from all participating machines. Since advancing a cycle requires complete exchange of user input, the responsiveness of the system is limited by the latency of the slowest communication link and is contingent on the reliability of all nodes.
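The gating behavior of the lockstep scheme can be sketched as follows; the class and method names are illustrative, not drawn from any particular system.

```python
# Minimal sketch of a lockstep architecture: the simulation advances
# one tick only after input has arrived from every machine, so the
# slowest communication link gates everyone. All names are illustrative.

class LockstepSimulation:
    def __init__(self, machine_ids):
        self.machines = set(machine_ids)
        self.pending = {}   # machine -> input for the current tick
        self.tick = 0

    def submit_input(self, machine, user_input):
        self.pending[machine] = user_input
        if set(self.pending) == self.machines:
            frame = dict(self.pending)   # complete input set: advance
            self.pending.clear()
            self.tick += 1
            return frame                 # broadcast to all machines
        return None                      # still waiting on someone

sim = LockstepSimulation(["a", "b", "c"])
assert sim.submit_input("a", "jump") is None   # blocked on b and c
assert sim.submit_input("b", "idle") is None   # blocked on c
frame = sim.submit_input("c", "fire")          # last input unblocks the tick
assert frame == {"a": "jump", "b": "idle", "c": "fire"} and sim.tick == 1
```

The `None` returns make the architecture's weakness concrete: until the slowest or least reliable machine reports in, no one's screen advances.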
In a client/server architecture each machine independently sends its user input or action request to a central server, which then relays the information to each player's client machine. For example, if a player pushes the jump button, the client machine will send a message to the server indicating that the button has been pushed. The server will then update the position of the player's icon and send the updated position to all client machines participating in the game. Controlling an object from a client machine still entails a roundtrip delay (from client to server and back), but the responsiveness of any individual client machine is not affected by the communication speed, or reliability problems, of the other machines.
In a fully distributed architecture, machines control objects locally and broadcast the results of actions to other machines, which receive the information with some time delay. If a player presses the jump button, his or her machine will update the position of his or her character and send the updated information, via the central server, to each of the other players engaged in the game. Each machine has immediate responsiveness controlling its own objects but must synchronize interactions between its own objects and other objects controlled by remote machines.
In the lockstep and client/server architectures, responsiveness is limited by the roundtrip communication latency to the server. In the distributed architecture, responsiveness is not limited by latencies because objects are controlled locally and the players' own objects respond to commands with little delay; however, this architecture creates problems
of synchronization. Players' own objects are displayed in near real time, but other players' objects are displayed with a delay equal to the latency of the system. While such synchronization problems are not of concern if interactions between objects are minimal, they can create problems with shared objects.
Attempts to resolve latency and synchronization problems can take several approaches. The first is to improve the speed of the underlying network. Work is under way to develop and deploy a new algorithm for queue management called Random Early Detection that will help minimize queuing delays across the Internet and other networks.18 Other research is investigating ways to speed the delivery of time-sensitive packets across the Internet by establishing different levels of service quality.19 For a price premium, users will be able to designate that their packets need to be delivered with minimal delay. Similar performance is available today through the use of dedicated networks that some Internet gaming companies and the DOD have built to support distributed simulations. Such networks avoid delays caused by congestion in public networks. Nevertheless, no effort in this area can reduce latencies below the limit imposed by the speed of light itself.
Other attempts at improving responsiveness recognize the latencies inherent in distributed systems and attempt to compensate for them by predicting the future location of objects, a technique called dead reckoning. This technique accommodates the delay with which information is received from other participants by predicting their actions using information such as position, velocity, and acceleration and by bringing all objects displayed on each simulator into the same time frame. Such techniques are effective only insofar as the motions or actions of objects are predictable and continuous; they cannot yet anticipate future changes in the course of objects (although future research may allow the development of more sophisticated algorithms that can anticipate deviations from continuous motions, perhaps by incorporating information about terrain features or past flight trajectories). Participants in simulated tank engagements have found ways to outsmart such dead-reckoning techniques: before passing in front of an enemy tank, they will accelerate quickly and then stop abruptly so that the enemy tank will incorrectly predict and display their position.
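A minimal version of dead reckoning can be sketched as follows. The second-order extrapolation is standard kinematics; the function name and example values are illustrative, not the DIS dead-reckoning algorithms themselves.

```python
# Sketch of dead reckoning: between state updates, a remote entity is
# drawn at its last reported position extrapolated forward using its
# reported velocity and acceleration. Names and values are illustrative.

def dead_reckon(pos, vel, acc, dt):
    """Predict position dt seconds after the last state update."""
    return tuple(p + v * dt + 0.5 * a * dt * dt
                 for p, v, a in zip(pos, vel, acc))

# A tank last reported at x = 100 m moving at 10 m/s with no
# acceleration is drawn at x = 105 m half a second after the update.
predicted = dead_reckon((100.0, 0.0), (10.0, 0.0), (0.0, 0.0), 0.5)
assert predicted == (105.0, 0.0)
```

In practice such extrapolation is paired with a threshold: a simulator sends a fresh update only when its true position diverges from the dead-reckoned one by more than some tolerance, which is also how the technique saves bandwidth, and why the abrupt accelerate-and-stop maneuver described above defeats it.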
Alternatively, some researchers are developing techniques for synchronizing events across distributed simulations.20 Such approaches assume that information from remote simulators will always be received with a time delay and that many actions cannot be predicted accurately. Thus, they show objects from remote machines with an inherent time delay. If users have no interactions with remote objects, the time delay
does not interfere with the simulation and the system need not compensate for the time differential. But if objects from local and remote machines do interact, synchronization technologies compensate for the time difference. These technologies give users the impression that the network has zero latency, or immediate responsiveness, but they do so by sacrificing some degree of fidelity. For example, a 100-meter car race may actually be a 90-meter race with 10 meters of compensation interjected at the appropriate places to synchronize the outcome; similarly, a ball thrown from one player to another may travel faster when moving away from than toward the local player (Box 2.1). While having demonstrated some efficacy in game applications, additional analysis will be needed to determine the suitability of such techniques for defense simulations that require high levels of fidelity.
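The outcome-preserving time adjustment described above can be reduced to a sketch. The stretch-by-one-latency rule, the function name, and the numbers are illustrative assumptions, not any shipped game's algorithm.

```python
# Sketch of latency-hiding synchronization: each player's view stretches
# or compresses an object's drawn travel time so local actions stay
# immediately responsive while both views agree on the outcome. The
# linear one-latency adjustment is an illustrative assumption.

def apparent_travel_time(true_time, latency, outbound):
    """Travel time as drawn on one player's screen (seconds).

    An outbound object (leaving the local player) is drawn slightly
    slow, absorbing one network latency; an inbound object is drawn
    slightly fast, so the agreed outcome is preserved on both screens.
    """
    return true_time + latency if outbound else true_time - latency

TRUE_FLIGHT = 1.0   # seconds a thrown ball "really" takes (assumed)
LATENCY = 0.25      # one-way network delay in seconds (assumed)

# The thrower's screen shows a 1.25 s flight. The catcher learns of the
# throw 0.25 s late and draws a 0.75 s flight, catching at 1.0 s; word
# of the catch reaches the thrower at 1.25 s, exactly when the
# thrower's screen shows it. Neither player perceives a stall.
assert apparent_travel_time(TRUE_FLIGHT, LATENCY, outbound=True) == 1.25
assert apparent_travel_time(TRUE_FLIGHT, LATENCY, outbound=False) == 0.75
```

The fidelity cost named in the text is visible in the code: the two screens draw different flight times, which is acceptable for a game but would need analysis before use in high-fidelity defense simulations.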
Standards for Interoperability
A related area of interest to DOD and the Internet games community is standards for interoperability. Interoperability is the ability of various simulation systems to work with each other in a meaningful and coherent fashion.21 It is often defined as a matter of degree: a simulation is considered compliant if it can send and receive messages to and from other simulators in accordance with an agreed-upon specification. Two or more simulators are considered compatible if they are compliant and their models and data transmissions support the realization of a common operational environment. They are interoperable if their performance characteristics support the fidelity required for the exercise and allow a fair contest between participants in which differences between individual simulators are overwhelmed by user actions. For example, a flight simulator with no damage assessment capability would not be interoperable with other simulators having this capability, even if they could communicate data effectively, because it could not detect that it had been hit by an enemy missile and destroyed.
Achieving interoperability between simulation systems requires (1) a common network software architecture with standard protocols that govern the exchange of information about the state of each of the participants in the simulation; (2) a common underlying architecture for maintaining information about the state of the environment related to a particular simulator; and (3) a common representation of the virtual environment.22 As the size and scale of defense simulations grow, participants will need a consistent view of the battlefield so that they can agree on the location of objects there and on the timing of events. Given that different players will have different ways of gathering information (e.g., radars and other sensors as well as information relayed from command
Consider a football game played by multiple players in a distributed network. It is important that the football be controlled locally on the machine of the player currently holding the ball so that latencies are minimized; otherwise, the ball will not move at the same time as the player's hand, and a lag may be apparent. If the football is passed to another player, it must then be controlled by the other player's machine, but passing the football is complicated by the fact that the positions of the players on the two computers may not be synchronized because of latencies in the network.
The solution to this problem rests on two observations. First, by migrating the football from one machine to another, the latency problem can be corrected. Second, two players do not need to see exactly the same thing on their two screens, as long as they agree on the outcome and no one feels cheated. So if one player throws another a football, the football can travel slightly slower from the first to the second player on one screen than on the other, giving control of the football a chance to migrate from one computer to the other midway through the path. Neither player can tell the difference.
To demonstrate the viability of this approach, Sandcastle Inc. developed an experimental Internet ping pong game in which two players hit a ball back and forth to each other, controlling individual paddles. Each player sees his or her own paddle in real time and the opponent's paddle with some time delay. Each player sees the ball travel away from his or her paddle at a slower speed than the ball travels toward his or her paddle, so that at the point at which the ball reaches the opponent's paddle, the player sees the ball and the opponent's paddle at a point in the past equal to the latency of the system. In experiments, participants could not detect a delay in the system, even with network latencies of two-thirds of a second (670 milliseconds).
SOURCE: Will Harvey, Sandcastle Inc.
and communications centers), they do not need to have the same information, but the simulation itself should not alter the information they receive. Meeting these requirements allows tank simulators, for example, to be designed and interconnected so that their operators can share information and train jointly in a common virtual battlefield.
Related to this capability is composability, the ability to build simulations using components designed for other simulations. Composability is a significant concern for DOD, which cannot construct a single integrated simulation that serves all possible purposes and would like to minimize the development costs of new simulations. Using composable simulations, for example, a simulation designed to train aircrews and ground forces in conducting close air support operations could be built using simulated aircraft and simulated soldiers that were designed for other simulations. Ensuring this type of interoperability requires a common architecture for the design of simulations and a common understanding of the types of tasks conducted by the individual simulators and those conducted by the integrating system. While most existing simulations were designed for a particular purpose and cannot readily be combined into larger simulations, future systems may be designed in ways that will allow greater interoperability and composability.
DOD Efforts in Interoperability
Over the past several years DOD has attempted to develop standards to promote interoperability among its simulation systems. Such efforts have needed to accommodate the heterogeneity and scale of entities modeled in the virtual battlefield (combat aircraft, tanks, ships, and refueling vehicles) and support the full generality of participant interactions. To this end the defense simulation community developed standards, such as Distributed Interactive Simulation (DIS) standards, that aim primarily at achieving "plug-and-play" interoperability among simulators developed by independent manufacturers.
DIS is a group of standards developed by members of the defense modeling and simulation community (both industry and university researchers) to facilitate distributed interactive simulations. They can be used for hosting peer-to-peer multiuser simulations in which objects (typically vehicles) move independently, shoot weapons at each other, and perform standard logistics operations such as resupply and refueling. The DIS protocols include a variety of industry and military standards for computer equipment and software, as well as the Transmission Control Protocol/Internet Protocol (TCP/IP) networking protocols used over the Internet. Specific protocols have had to be devised to define the communications architecture for distributed simulations as well as the format and content of information exchanges, the types of information relevant to entities (such as tanks, aircraft, and command posts) and the types of interactions possible between them, simulation management, performance measures, radio communications, emissions, field instrumentation, security, database formats, fidelity, exercise control, and feedback.23 DIS has a well-developed simulation management subprotocol for setting up and controlling individual players in an exercise. Consequently, DIS can achieve certain levels of data interchange
for one-on-one and unit-level interactions but cannot support aggregation and disaggregation of units.
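The flavor of the state-exchange messages such standards define can be suggested with a toy serializer. The field set and packing below are simplified assumptions for illustration, not the actual IEEE 1278 entity-state PDU layout.

```python
# Toy mock-up of an entity-state message of the kind DIS simulators
# broadcast. The field set and byte layout here are simplified
# assumptions, not the real DIS protocol data unit format.

import struct

# Network byte order: one 32-bit entity ID, then six doubles
# (x, y, z position and vx, vy, vz velocity).
_FORMAT = "!I6d"

def pack_entity_state(entity_id, x, y, z, vx, vy, vz):
    """Serialize one entity's kinematic state for broadcast."""
    return struct.pack(_FORMAT, entity_id, x, y, z, vx, vy, vz)

def unpack_entity_state(data):
    """Recover (entity_id, position, velocity) from a received message."""
    entity_id, *state = struct.unpack(_FORMAT, data)
    return entity_id, tuple(state[:3]), tuple(state[3:])

pdu = pack_entity_state(42, 100.0, 200.0, 0.0, 5.0, 0.0, 0.0)
eid, pos, vel = unpack_entity_state(pdu)
assert (eid, pos, vel) == (42, (100.0, 200.0, 0.0), (5.0, 0.0, 0.0))
```

The point of a standard like DIS is that every manufacturer's simulator agrees on exactly such a layout, so any compliant receiver can decode any compliant sender's state updates.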
The DIS standard is derived from protocols developed for DOD's Simulation Network (SIMNET) system, adopting its general principles, terminology, and protocol data unit (PDU) formats for transmitting information between simulators.24 The initial set of protocols (DIS 1.0) was accepted by the standards board of the Institute of Electrical and Electronics Engineers (IEEE) in 1993 and is now codified under IEEE standard 1278-1993. These protocols were subsequently recognized by the American National Standards Institute.
In addition, DOD has been pursuing development of the High-level Architecture (HLA) to facilitate both interoperability and composability. HLA is a software architecture that defines the division of labor between simulators and a layer of support software, called the Run-time Infrastructure (RTI), that facilitates interoperability. It consists of specifications, interfaces, and standards for a broad range of simulations, from combat simulations to engineering analyses. Groups of people wanting to establish interoperability among their simulators via the RTI do so by creating a federation. Federation members make their own decisions about the types of entities that will be included in simulations and the types of information they will exchange. Individual simulators post state-change information to the RTI and receive state-change information from other simulators via the RTI. The RTI makes sure all parties to a federation receive state-change data and ensures that the federation's data are time synchronized and routed efficiently to the other simulations in the federation.
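The RTI's publish/subscribe role in a federation can be reduced to a sketch like the following; the class and method names loosely echo HLA terminology but are not the actual HLA interface specification.

```python
# Sketch of the publish/subscribe role the RTI plays in an HLA
# federation: federates post state changes to the RTI, which delivers
# them to every other federate subscribed to that class of object.
# The API shown is an illustrative reduction, not the HLA interface.

class RunTimeInfrastructure:
    def __init__(self):
        self.subscriptions = {}   # object class -> set of federates

    def subscribe(self, federate, object_class):
        """Register a federate's interest in one class of object."""
        self.subscriptions.setdefault(object_class, set()).add(federate)

    def update_attribute_values(self, sender, object_class, state):
        """Fan a state change out to every other interested federate."""
        receivers = self.subscriptions.get(object_class, set()) - {sender}
        return {federate: state for federate in receivers}

rti = RunTimeInfrastructure()
rti.subscribe("tank_sim", "Aircraft")
rti.subscribe("awacs_sim", "Aircraft")
delivered = rti.update_attribute_values("flight_sim", "Aircraft",
                                        {"altitude_m": 9000})
assert delivered == {"tank_sim": {"altitude_m": 9000},
                     "awacs_sim": {"altitude_m": 9000}}
```

Because simulators talk only to the RTI layer, federation members can make their own choices about entities and exchanged information, which is what makes the architecture composable.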
Development of the HLA is being managed by the Architecture Management Group, which is chaired by DMSO and has representatives from all military services and DOD agencies developing advanced modeling and simulation systems. Members represent a wide range of military applications, from training and military operations to analysis, test and evaluation, and engineering-level models for system acquisition and production. Development of the HLA was initiated in 1994, when DARPA awarded three contracts for the definition of a high-level architecture for advanced distributed simulations. Contractors analyzed the needs of prototype federations in four areas: (1) platform simulators (such as an M-1 tank simulator and the Close Combat Tactical Trainer); (2) joint training (simulation-like war games that occur in accelerated time); (3) analytical simulations of systems to support joint theater-level war-fighting and support activities; and (4) engineering federations to design, test, and evaluate new military systems. The final briefings from these contractor teams were received in January 1995, and a core team of individuals synthesized the inputs, with additional insight from other ongoing DOD modeling and simulation programs to arrive at the initial definition of
the HLA.25 Testing of prototype federations was completed in July 1996 to determine if the RTI was broadly enough defined to be useful across a wide range of federations. Test results informed development of the HLA Baseline Definition, which was completed in August 1996 and approved by the Under Secretary of Defense for Acquisition and Technology as the standard technical architecture for all DOD simulations in September 1996. All simulations developed after October 1, 1999, must comply with HLA; no existing simulations that are not compliant with HLA may be used after October 1, 2001, unless they are converted to the standard.26
The development of a standard for distributing 3D computer graphics and simulations over the Internet has taken the quick path from idea to reality. In 1994 Mark Pesce, Tony Parisi, and Gavin Bell combined their efforts to start the VRML effort. Their intention was to create a standard that would enable artists and designers to deliver a new kind of content to the browsable Internet.
In mid-1995 VRML version 1.0 emerged as the first attempt at this standard. After an open Internet vote, VRML 1.0 was to be based on Silicon Graphics Inc.'s (SGI) popular Open Inventor technology. VRML was widely evaluated as unique and progressive but still not usable. At this point, broad industry support for VRML was coalescing in an effort to kick-start a new industry. Complementary efforts were also under way to deliver both audio and video over the Internet. The general feeling was that broad acceptance of distributed multimedia on the Internet would soon be a real possibility and that VRML would emerge as the 3D standard.
After completion of the VRML 1.0 standard, the VRML Architecture Group (VAG) was established at the Association for Computing Machinery's Special Interest Group on Computer Graphics (SIGGRAPH) annual conference in 1995. It consisted of eight Internet and 3D simulation experts. In early 1996 VAG issued a request for proposals on the second round of VRML development. The call was answered by six industry leaders. The selection of the VRML 2.0 standard was made via open voting and occurred in a short time frame of about two weeks. SGI emerged as the winner with its "Moving Worlds" proposal. By this time over 100 companies had publicly endorsed VRML, and many of them were working on core technologies, browsers, authoring tools, and content. At SIGGRAPH '96, VAG issued the final VRML 2.0 specification and made a number of other significant announcements.
DOD hopes that the HLA will be adopted outside the defense modeling and simulation community. To facilitate this process, RTI software will be made freely available as a starter kit.27 The software will be in the public domain; initial release will be for Sun workstations, with later releases for Silicon Graphics, Hewlett-Packard, and the Windows NT platforms. Adoption of HLA beyond DOD is questionable, however; members of the entertainment industry noted at the workshop that HLA's development took place largely without their input, or that of other non-defense communities. As a result, many representatives of the games community believe that HLA will not meet their needs; at the workshop
To help maintain VRML as a standard, VAG made several concrete moves. First, it started the process of creating the VRML Consortium, a not-for-profit organization devoted to VRML standard development, conformance, and education. Second, VAG announced that the International Organization for Standardization would adopt VRML and the consensus-based standardization process as its starting place for an international 3D metafile format.
Like VRML, DIS standards were generated in an open process via the semiannual Workshop on Standards for the Interoperability of Distributed Simulations. Though the first version of DIS derived largely from standards developed by BBN Corporation for the SIMNET program, participation in the revision of the standards was greatly expanded. The first workshop was held in Orlando, Florida, in September 1989 and defined the shape of the DIS standards as they progressed from protocols for DOD's Simulation Networking (SIMNET) program to the version 2.1.4 standards that exist today. Attendance at the workshop grew from 150 people in 1989 to more than 1,500 in September 1996.
In contrast, the HLA design and prototype implementations were developed by a small group of DOD officials and contractors in a more closed fashion that did not solicit input from the modeling and simulation community at large, despite an interest in promulgating the standard broadly. Doing so sped development of the standard, but significant concerns were expressed that the requirements of the broader community were being left out in the rush to completion. Few members of the Communications Architecture Group of the DIS workshop participated in HLA's development, nor did representatives from industries outside defense.
SOURCES: Position paper prepared for this project by Brian Blau; see Appendix D. Also, Lantham, Roy. 1996. "DIS Workshop in Transition to . . . What?", Real Time
many indicated that they were not even aware that the standard was being developed.28 Further, the process used to develop HLA may have alienated those who were not involved in it.29 HLA was not developed in as open a manner as the DIS standards or standards from the Internet community, such as the Virtual Reality Modeling Language (VRML) (Box 2.2).
Interoperability in the Entertainment Industry
The entertainment industry, to date, has expressed different interests regarding interoperability standards. While DOD has a strong interest in ensuring that various simulation systems can work together, the enter-
VRML is an evolving standard used to extend the World Wide Web to three dimensions. It is a widely accepted commercial standard, one that developers across the Internet community are seriously evaluating and adopting for their software. The current version, VRML 2.0, is based on an extended subset of Silicon Graphics Inc.'s (SGI) Open Inventor scene description language. It has both a static component and an interactive component. The static component features geometry, textures, 3D sound, and animation. The interactive component provides flexible programmability that can be added to a VRML file through the use of Java code, another Internet technology. This addition of Java allows not only graphics data but also behaviors to be exchanged with VRML. VRML is both comprehensive and unfinished, with its current draft exceeding several hundred pages.
Complex issues surrounding real-time animation in VRML 2.0 include entity behaviors, user-entity interaction, and entity coordination. To scale to many simultaneous users, peer-to-peer interactions are necessary in addition to client-server query and response. An approved specification for internal and external behaviors is nearly complete. VRML 2.0 will provide local and remote hooks (i.e., an applications programming interface, or API) to graphical scene descriptions. Dynamic scene changes will be stimulated by any combination of scripted actions, message passing, user commands, or behavior protocols (such as DIS or Java). Thus, the forthcoming VRML behaviors standardization will simultaneously provide simplicity, security, scalability, generality, and open extensions.
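The event-driven model sketched above, in which scripted actions and message passing stimulate dynamic scene changes, can be illustrated with a toy model. This is not VRML syntax; it is a minimal Python sketch of VRML 2.0's sensor-ROUTE-script pattern, with all node and field names chosen for illustration only.

```python
# Toy model of VRML 2.0 event routing: a sensor fires an event,
# a ROUTE carries it to a Script node, and the script mutates a
# field on a target node. Names are illustrative, not VRML syntax.

class Node:
    def __init__(self, **fields):
        self.fields = dict(fields)

class Route:
    """Connects an outgoing event to a handler, like a VRML ROUTE."""
    def __init__(self, handler):
        self.handler = handler
    def deliver(self, value):
        self.handler(value)

def make_touch_script(target):
    """A 'Script node' that turns a touch event into a color change."""
    def on_touch(active):
        if active:                        # touch-sensor-style trigger
            target.fields["diffuseColor"] = (1.0, 0.0, 0.0)
    return on_touch

material = Node(diffuseColor=(0.8, 0.8, 0.8))
route = Route(make_touch_script(material))

route.deliver(True)                       # the user clicks the shape
print(material.fields["diffuseColor"])    # (1.0, 0.0, 0.0)
```

The same delivery mechanism could carry messages from a network protocol rather than a local user action, which is how DIS-style behavior protocols would plug into the scene.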
VRML is not yet complete; work is ongoing. Three specific areas being addressed are definition of an external API, definition of
tainment industry places strong emphasis on developing proprietary systems and standards that preclude interoperability. Mattel, for instance, encrypted data output from its Power Glove input device so that it could not be used with competitors' game devices. Nintendo and Sega game machines cannot interoperate with each other or with computer-based video games. A flight simulator game produced by Spectrum HoloByte Inc. cannot be used in a mock fight against Strategic Simulation Inc.'s Back to Baghdad game. Even within the multiplayer networked games, each player uses the same game program. Commercial standards have therefore not sought interoperability between independent systems, but have attempted to allow independently produced software titles to integrate with the same user front-end software (such as operating systems,
a default browser scripting language, and compression technology. The purpose of the external API is to provide a way for Web developers to write programs that can drive a VRML simulator or a VRML browser. A call for proposals for this external API has been issued, and the VRML community expects to finalize it in 1997. The purpose of the default scripting language is to have a standard language that can be embedded in a VRML file, a language understood by every VRML browser. A call for proposals for this language has also been issued.
The purpose of the compression technology call is to be able to compress VRML data, hence minimizing bandwidth requirements on the Internet. A particular response to this call is notable in that it is a joint proposal between IBM and Apple Computer. The proposal is for the binary compression of VRML files and is significant because IBM and Apple have decided to open up their patents on geometry compression, providing them free to the Internet. IBM and Apple are providing royalty-free licenses to VRML developers for this compression. This is a significant step in the types of collaboration the Internet environment seems to be bringing out in the commercial world.
The VRML community is taking steps to ensure widespread adoption. The major step in this direction is that a VRML consortium is being formed as a permanent fixture of the Internet community. Additionally, the International Organization for Standardization (ISO) has picked up VRML as the 3D metafile standard, a selection it has been working on for quite a few years. By selecting VRML, ISO changed the way it normally does business. For the 3D metafile standard, ISO put aside its normal process of building its own standard. Instead it opted to adopt VRML as it is today in order to finish quickly. The entire ISO standards process for VRML is expected to be completed within just 14 months, cutting literally 75 percent of the time that it normally takes to create a standard.
Web browsers, or graphics libraries) so that players with different computer systems can play each other. Standards such as VRML 2.0, OpenGL, and DirectX are aimed in this direction (Box 2.3). As a result of these standards, a user can use the same software to run a variety of game applications.
Growing interest in networked simulation suggests that research into standards and architectures for interoperability will continue to be needed. Areas of particular interest include protocols for networking virtual environments, architectures to support interoperability, and interoperability standards.
Virtual Reality Transfer Protocol
Researchers in advanced browser and networked game technologies are beginning to share concerns regarding interoperability. The desire on the browser side is for interactive 3D capabilities, but the VRML standard does not support peer-to-peer communications, the type of communications required for networked interactive gaming. Browsers allow heterogeneous software architectures to interoperate, thanks to the http standard.
Additional work is ongoing toward the development of a standard for networked virtual environments. A 1995 National Research Council report identified networking as one of the critical bottlenecks preventing the creation of large-scale virtual environments (LSVEs) as ubiquitously as home pages on the Web.30 Numerous important component technologies for LSVEs have been developed, but that work is not yet complete, and significant research and integration work remains.31 Some of the integration work is to merge the ideas from the LSVE research community with work from the Internet/World Wide Web community. The integration discussions are occurring under various titles, including the virtual reality transfer protocol and the IETF's large-scale multicast applications working group.32
Among the component technologies that have enabled the rapid exponential growth of the Web are HTML and http. HTML is a standardized page markup language that allows the placement of text, video, audio, and graphics in a platform-independent fashion. HTML is being extended into four dimensions with VRML via the addition of 3D geometry and temporal behaviors. http is the hypertext transfer protocol, an applications-layer protocol used to serve HTML pages and other information across the Internet. http binds together several dissimilar protocols, including the ftp, telnet, gopher, and mailto protocols; hence, it is an integration protocol, or metaprotocol. To support LSVEs across the Internet, it is expected that a continuum of dissimilar protocols will have to be integrated into a single metaprotocol, an applications-layer protocol called the virtual reality transfer protocol (vrtp).33 At a minimum, vrtp will combine http (for URL service), peer-to-peer communications (such as DIS and its relatives), multicast streaming (for audio/video streams), Java agents, heavyweight object server protocols (such as CORBA and ActiveX), and network monitoring.
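The metaprotocol idea can be sketched as a single entry point that dispatches to dissimilar underlying protocols by URL scheme, much as http unified access to ftp, gopher, and mailto resources. The handler names and schemes below are hypothetical; the vrtp design described here specifies no such API.

```python
# Sketch of a metaprotocol dispatcher: one applications-layer entry
# point routes each request to the appropriate underlying protocol
# based on its URL scheme. Handlers are illustrative stand-ins.

from urllib.parse import urlparse

HANDLERS = {}

def register(scheme):
    """Decorator associating a URL scheme with a protocol handler."""
    def wrap(fn):
        HANDLERS[scheme] = fn
        return fn
    return wrap

@register("http")
def fetch_page(url):
    return f"request/response fetch of {url}"

@register("dis")
def join_exercise(url):
    return f"peer-to-peer state exchange with {urlparse(url).netloc}"

@register("mcast")
def subscribe_stream(url):
    return f"multicast subscription to {urlparse(url).netloc}"

def vrtp_open(url):
    """The metaprotocol: pick the right sub-protocol for this resource."""
    return HANDLERS[urlparse(url).scheme](url)

print(vrtp_open("dis://exercise.example/tank42"))
```

Each sub-protocol keeps its own semantics; the metaprotocol layer only decides which one serves a given resource, which is the integration role the text assigns to vrtp.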
With vrtp, Web-scalable LSVEs can be constructed using the existing Internet infrastructure. vrtp will be an open architecture that uses standards-based public-domain software and reusable implementations, all taking advantage of the sustained exponential growth of "internetworked" global information. The vrtp project is ongoing and is taking a standards approach similar to that used for VRML: an open, public, Web-based forum for technical discussion and adoption.
Architectures for Interoperability
Outside the VRML/vrtp communities, there is little academic research on solving the network software architecture interoperability problem. Nevertheless, the benefits of additional research could be large, and there are many unsolved research problems. The key to solving the problem is to understand that doing so in a scalable way will require attention not just to networking issues but also to software architecture issues. This means that one cannot just design an applications-layer protocol to establish message formats. Attempts to design distributed simulations must consider the scarcity of available network bandwidth and processor cycles and must attempt to minimize latencies across the network. Brute-force methods involving large computers can do simple aggregation to help minimize bandwidth, but such methods are expensive and limited. Distributed solutions that work within both network bandwidth and processor cycle limitations are inherently more scalable.34 Real-time simulation requires high-performance systems and low communications latency. Adding more layers of abstraction and protocol, as is common for achieving interoperability, can work against the need to meet the latency and performance requirements of networked simulations. Research into application-level protocols and architectures must take these considerations into account.
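One distributed bandwidth-saving technique of the kind described above is dead reckoning, used by DIS-style protocols: the sender transmits a new state update only when its true position drifts beyond a threshold from what receivers are already extrapolating. The motion model and threshold below are simplified for illustration.

```python
# Dead reckoning sketch: sender and receivers share a motion model;
# the sender transmits an update only when the extrapolated "ghost"
# position drifts too far from the truth. Values are illustrative.

THRESHOLD = 1.0  # meters of allowed extrapolation error

def extrapolate(position, velocity, dt):
    """Linear motion model shared by sender and receivers."""
    return position + velocity * dt

def simulate(true_positions, velocity, dt=1.0):
    """Return the indices of ticks that required a network update."""
    updates = [0]                        # initial state is always sent
    ghost = true_positions[0]            # what receivers believe
    for i, true_x in enumerate(true_positions[1:], start=1):
        ghost = extrapolate(ghost, velocity, dt)
        if abs(true_x - ghost) > THRESHOLD:
            updates.append(i)            # resynchronize receivers
            ghost = true_x
    return updates

# An entity mostly moves at 5 m/s but swerves at tick 3.
path = [0, 5, 10, 18, 23, 28]
print(simulate(path, velocity=5))        # [0, 3]
```

Six position samples cost only two packets here; the saving grows with the fidelity of the shared motion model, which is why such schemes scale better than brute-force aggregation.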
From the networked games perspective, heterogeneity in software architecture is not yet possible, although some research is being done. The desire to build large-scale gaming, on the order of thousands of players in
the same virtual world, may require game companies to move toward more malleable standardized protocols. Game companies will have to decide how to define their applications-layer protocols, a problem shared by everyone building large-scale virtual worlds. One proposal being considered is for the creation of an applications-layer protocol called GameScript, a standard applications-layer protocol that would allow games designed by different manufacturers to communicate. One of the most often described uses for GameScript is to let players in one game see players in neighboring games at the boundaries of their virtual worlds, serving as a teaser for game players to deposit money and try the next game. Several telephone companies and small venture-capital-funded start-ups are working on this, but no products have been announced yet. GameScript work is similar to HLA and to the virtual reality transfer protocol work.
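A GameScript-style boundary sighting could take the form of a game-neutral message that any compliant title can parse and render. The report describes no wire format; the message fields and function names below are entirely hypothetical.

```python
# Toy "GameScript"-style applications-layer message: one game
# announces a player near a shared world boundary so a neighboring
# game can render a ghost of it. The field set is hypothetical.

import json

def announce(game_id, player, position):
    """Serialize a boundary sighting any compliant game can parse."""
    return json.dumps({
        "msg": "BOUNDARY_ENTITY",
        "game": game_id,
        "player": player,
        "pos": position,
    })

def receive(raw):
    """A neighboring game interprets the standard message."""
    msg = json.loads(raw)
    if msg["msg"] == "BOUNDARY_ENTITY":
        return f"render ghost of {msg['player']} from {msg['game']}"

wire = announce("dogfight-3d", "ace7", [120.0, 5.0])
print(receive(wire))
```

Because only the boundary message is standardized, each game keeps its proprietary internal protocol, which is why such a scheme might survive the industry's resistance to broader common standards.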
Common interoperability standards could have benefits to both DOD and the entertainment industry, enabling them to solve common problems with common solutions. At present, there is no consensus in the games industry on the desirability of a common set of interoperability standards. While some game developers see common standards as a means of facilitating attempts to move networked games onto the Internet, many do not yet consider common standards a high priority. According to Warren Katz of MäK Technologies, resistance to common interoperability standards is generally based on four factors:
• Technical considerations: Common standards tend to be designed to accommodate a wide range of potential uses and therefore are not optimal for any particular use. Given existing limitations in bandwidth for Internet-based games (most potential users connect to the Internet via modems that communicate at 14.4 or 28.8 kilobits per second), many game companies prefer to design custom protocols that maximize performance.
• Not-invented-here syndrome: Many commercial firms have a bias against technology developed outside their own organization. Engineers in many companies believe they can develop better protocols that will provide a more elegant solution to a problem or that will speed processing times considerably. Acceptance of an existing solution implies that they are incapable of doing better.
• Strategic value of proprietary solutions: Proprietary networking protocols are viewed as a strategic competitive advantage. Use of a public standard would eliminate one element of advantage by allowing competitors to use the same technology. In addition, use of a public standard could signal that a company is unable to develop a better solution.
• Control: Adoption of an industry or public standard reduces the control a company has over its protocols. Standards committees determine changes to the protocol. Companies that control their own protocols can upgrade them at their own pace, as the need arises.
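The bandwidth argument in the first factor above can be made concrete. A hand-packed update in a custom protocol can be a few bytes, while a general-purpose standard must carry fields most games never use; the field layout below is illustrative, not any shipping game's format.

```python
# Why modem bandwidth favors custom protocols: a hand-packed
# position update is a few bytes. Layout is illustrative.

import struct

def pack_update(entity_id, x, y):
    """Entity id (2 bytes) plus x, y as 16-bit fixed point (0.1 m)."""
    return struct.pack("<Hhh", entity_id, int(x * 10), int(y * 10))

packet = pack_update(7, 12.5, -3.2)
print(len(packet), "bytes")            # 6 bytes per update

# At 10 updates/s for 8 players, 6 * 8 * 10 * 8 = 3,840 bits/s,
# comfortably inside a 14.4 kbps modem. A general-purpose record
# carrying full entity state, orientation, and appearance fields
# (a DIS Entity State PDU is on the order of 144 bytes) would be
# roughly 25 times larger, saturating the same link.
```

The cost of this efficiency is exactly the interoperability problem the surrounding text describes: nothing outside the company can parse a format this specialized.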
These factors have, to date, stymied the use of DIS standards in the entertainment industry. According to Warren Katz, many video game companies have examined DIS protocols for suitability in their games, but few have implemented them. Some companies have found DIS protocols to be too big and complex, performing operations that were not relevant to video games and slowing the performance of the system.35 Others view DIS as a standard for military applications and do not consider it appropriate for nonmilitary games. As a result, the DIS protocols have not been embraced in an unmodified form by any game company. Several game companies have developed protocols derived from DIS that include only those functions needed to support their applications. Each of these implementations is proprietary to the developing company and not interoperable with other companies' protocols.
Many companies developing networked games attempt to use their proprietary protocols to their competitive advantage. Proprietary solutions serve as a means of differentiating one company's products from another's and, possibly, of generating revenues through licensing. Should one company develop a set of standards that is perceived as superior in some critical respect, whether by allowing greater numbers of participants or by offering better games, it can make its games more attractive to game players. Furthermore, other companies might want to develop games that interoperate with that standard. The standard's developer can charge a licensing fee that generates revenues in addition to whatever it makes from selling its games. Use of a common or public standard does not provide such opportunities.
Not all companies will necessarily follow this business model. Some game companies might, for example, see their competitive advantage in the content of their games, not necessarily their networking capabilities. They might attempt to develop a protocol that they license freely in the hope that it will become widely adopted and facilitate growth in the market for networked games, leading to more sales of their titles. Adoption of this model will require a shift in the current business model of most video game companies. Some impetus for this shift may follow from growth in the market for multiplayer Internet games, but none has yet been seen in the games market.
As the Internet games industry matures, it is possible that common
standards for interoperability, such as HLA, could become of interest to game developers as a means of allowing them to reuse portions of one simulation to populate another. Anita Jones, DOD's director of defense research and engineering through May 1997, suggested at the workshop that as games become more complex, incorporating more players and a greater number of possible types of interactions between and among players, game developers may shift to a new product development strategy in which they construct games from components already developed for other simulations. Gilman Louie of Spectrum HoloByte suggested that Internet-based games may encourage such change. Much like television or cable, Internet channels will need a great deal of programming, with frequent updates and new activities, so that users can experience something new every time they log on. The code base will need to be designed to allow easy upgrades so that designers can replace particular objects without completely redesigning the system.36
Others at the workshop suggested that HLA might be advantageous in allowing real-time simulations to interoperate with simulations that progress faster or slower than real time. In the video games world this capability would mean that a high-level strategy game that typically runs faster than real time could interoperate with a real-time simulator: a turn-based chess game could interoperate with a simulation of the motion of the pieces. Though these capabilities could be very useful in a game environment, it would be unreasonable to assume that any games company would adopt HLA without a strong outside influence. The more likely scenario, according to Warren Katz, is that a game company will adapt desired features of HLA within its own proprietary protocol.
Computer-generated Characters

One of the major challenges in creating a useful simulation system is populating simulated environments with intelligent characters and groups of characters. While some or even many of the entities present in simulated worlds may be controlled by human operators who are networked into the simulation, many are likely to be operated by the computer itself. Such computer-generated characters are a critical element of both defense and entertainment systems and serve a wide range of functions. Computer-generated characters may serve as an adversary against which a game player or user of a training system competes, such as an opponent in a computerized chess game or an enemy aircraft in a flight simulator. At other times they may serve as collaborators that guide participants through a virtual world or serve as a crew member. In large networked simulations, computer-generated characters may control the actions of elements for which human controllers are unavailable, whether a rival tank, a wingman, or a copilot.
Computer-generated Characters in Entertainment
Virtually all sectors of the entertainment industry are interested in computer-generated character technologies as a means of creating more believable experiences for participants and allowing greater automation of services where possible. Companies in the video games, virtual reality, and filmmaking sectors are developing or have deployed products that incorporate computer-generated characters. The creation of compelling virtual characters holds such allure that whoever can develop complex, believable characters in a rich fantasy world, available through an accessible medium at an affordable price, stands to profit handsomely. It is therefore likely that any company that devises a successful approach to solving this problem will be disinclined to share the enabling technologies in order to protect its competitive advantage.
Almost every genre of computer games, whether it be sports, action, strategy, or simulation, depends on computer-generated opponents. The ability of a game to attract and entertain players is directly linked to the quality of the computer-generated competitors in a game. All of the best-selling PC games (Chessmaster, Madden Football, Command and Conquer, Grand Prix II, Civilization, Balance of Power, Falcon 3.0, and EF2000) feature computer-generated opponents that challenge users. Gilman Louie of Spectrum HoloByte estimates that three of the four years required to produce a new video game are dedicated to developing algorithms for controlling computer-generated forces.
Increasingly capable computer-generated opponents have been incorporated into video games since the first commercial video game, Nutting Associates' Computer Space, was introduced in 1971. Computer Space allowed users to control a rocket pitted against two computer-controlled flying saucers: the player avoided the flying saucers' missiles while trying to steer his or her own missile into one of the saucers. The flying saucers were controlled by a simple random function. Within a few months, players had mastered the game and could routinely earn bonus time; soon after, players became bored with the game and stopped playing. Learning from this experience, the designers of the next major video game, Pong, replaced computer-generated agents with a second joystick so that players played against each other rather than a computer. In the early 1980s, Atari Games created what many considered to be the first credible strategic military simulation on a personal computer, Eastern Front. Players confronted a computer-based challenger that relied on a simple but effective rule-based system. The system used a series of "if-then" statements
to determine the computer's response to a player's move. Continued advances in computer technology and computer-generated forces enabled the creation of more sophisticated agents in games such as Harpoon and Aegis, as well as Panzer General. PC-based games further pushed the development of high-quality computer-generated opponents. Unlike most dedicated video games (with game boxes, such as those manufactured by Nintendo, Sega, and Sony), which are designed for simultaneous play by two to four players, PCs are generally used by one person at a time and games are designed for individual play. PC games therefore demand development of effective computer-generated forces, and PCs typically have microprocessors with sufficient power to support more complex computer-generated forces.
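The "if-then" rule-based approach described above can be sketched in a few lines: the computer scans an ordered rule list and takes the first action whose condition matches the game state. The rules below are invented for illustration, not Eastern Front's actual logic.

```python
# Sketch of a rule-based opponent: ordered "if-then" rules are
# scanned and the first matching rule's action is taken.
# The rule set is hypothetical.

RULES = [
    (lambda s: s["enemy_strength"] > s["own_strength"], "retreat"),
    (lambda s: s["supply"] < 20,                        "hold and resupply"),
    (lambda s: s["enemy_flank_open"],                   "attack flank"),
    (lambda s: True,                                    "advance"),  # default
]

def decide(state):
    """Return the action of the first rule whose condition holds."""
    for condition, action in RULES:
        if condition(state):
            return action

state = {"enemy_strength": 40, "own_strength": 60,
         "supply": 80, "enemy_flank_open": True}
print(decide(state))                   # "attack flank"
```

The weakness the chapter returns to later is visible even here: once a player learns the fixed rule order, the opponent's responses become predictable and exploitable.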
Future trends in video games will heighten the need for computer-generated forces and characters. Many multiplayer and on-line games will feature persistent universes that continue to exist and evolve even when a particular player is not engaged in the game. These games will take place over a significant span of time, at times without a definitive end. They will be similar to multiple-user domains in that users can come in to the game and exit at will. Persistent universes are critical to effective on-line games because they negate the need to coordinate large numbers of players and schedule games; in a persistent universe the game is always being played. Such games also get around the need to play an entire campaign or game in one sitting, enabling players to enter and exit as their schedules permit. At the same time, such games pose several problems: (1) the movement of players in and out of the game may be disorienting for the other players and may destroy the continuity of the game;37 (2) game masters will not be able to ensure that enough real players are available at any given time to make the game enjoyable for participants; and (3) it may be hard to generate an environment that gives players a large enough role and provides the necessary rewards to keep them coming back. The solution to some of these problems is to create an environment that is fully populated by computer-controlled forces: either automated forces that are entirely controlled by the computer or semiautomated forces that are given high-level instructions by a real player but are then controlled by the computer. When players enter the environment, they will be able to replace one of the automated elements until they log off; then the computer will regain control. Players may also give general orders before logging off to keep units on a strategic or tactical direction until they return. The entertainment industry has begun to make progress in this area.
A handful of real-time games, such as Command and Conquer, allow the player to control game pieces at a high level, specifying where units should move or what actions they should take but not the details of the process. Additional work will enable the broader use of such techniques.
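The player/computer handoff that a persistent universe needs can be sketched as follows: each unit is computer-controlled by default, a player can take it over while logged in, and control (plus any standing orders) reverts to the computer at logoff. The class and method names are hypothetical.

```python
# Sketch of persistent-universe control handoff: a unit runs under
# automation by default, a player can take over, and standing orders
# guide the automation after logoff. Names are hypothetical.

class Unit:
    def __init__(self, name):
        self.name = name
        self.controller = "computer"
        self.standing_orders = "patrol"

    def player_login(self, player):
        self.controller = player          # a human replaces the automation

    def player_logout(self, orders=None):
        if orders:                        # general orders left behind
            self.standing_orders = orders
        self.controller = "computer"      # automation resumes

    def tick(self):
        if self.controller == "computer":
            return f"{self.name}: auto-executing '{self.standing_orders}'"
        return f"{self.name}: awaiting {self.controller}'s commands"

u = Unit("3rd Squadron")
u.player_login("alice")
print(u.tick())
u.player_logout(orders="hold the bridge")
print(u.tick())                           # automation resumes with new orders
```

Because the world never depends on any particular player being present, the game masters' scheduling problem described above disappears: the automation simply fills every vacant role.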
Computer-generated character technologies also appear in immersive virtual reality attractions. Walt Disney Company's Aladdin attraction, for example, puts the participant into a virtual environment with many simulated individuals performing throughout the virtual world.38 Characters such as shopkeepers, camel herders, the sultan's guards, and others populate the simulated city of Agrabah. Participants experience this attraction through head-mounted displays and spatialized headphones. They sit on a saddle and hold a section of Persian carpet that serves as their controller and enables them to fly their magic carpet around the city. As they explore the city, they encounter characters who go about their own business but react to the presence of the participant. Behaviors are programmed into these characters and others. All characters react to the guest's presence; some of them have some intelligence designed into them, and they attempt to provide useful information to the guests, based on their current circumstances.
In other entertainment media (books, film, and television), there are extensive examples of intelligent agents in synthetic worlds. In Warner Brothers' 1994 film Disclosure, an angelic avatar assists the protagonist in searching a virtual reality library representing the computer's file system. In the television series Star Trek: Voyager, the ship's physician is a holographic projection with intelligence and a database compiled from the expert knowledge of thousands of other physicians.
One of the touchstones for persistent virtual worlds and characters is the 1992 book Snow Crash by Neal Stephenson. Snow Crash defines a persistent virtual world known as the Metaverse that is mostly populated by real people who enter it by donning an avatar that represents them in that space. There are also totally synthetic characters, of greater or lesser capability and complexity, who interact with real characters in the Metaverse as if they were simply avatars for real people. Today, such complex human-mimicking synthetic characters are simply science fiction. The challenge is to make them a reality for worlds of entertainment as well as worlds for training.
The film industry has also expressed interest in computer-generated characters and digital actors of various kinds. Digital effects studios are now creating three-dimensional digital data sets of actors, such as Tom
Cruise, Denzel Washington, and Sylvester Stallone.39 Work in digital actors is attempting to generate digitized versions of real actors that can be used in making films. While one motivation for developing digital actors is to avoid the high costs of hiring real actors, other motivations include the ability to create additional scenes, if necessary, after a real actor has finished a film and to perform actions that real actors cannot or will not do, such as dangerous stunts.
Digital actor technology can also allow directors to create new characters that combine elements of existing actors, or even to resurrect dead stars. GTE Interactive, for example, recently unveiled a digitized likeness of Marilyn Monroe, created by Nadia Magnenat-Thalmann, that can chat with visitors on the World Wide Web and respond to typed questions with speech and facial expressions.40 A short film starring the computer-generated Marilyn, Rendez-vous à Montréal, also has been created. Digital actors are seen as a key element of interactive media.41 Interest has become large enough to have spawned a conference on virtual humans in June 1996; a second conference was held in June 1997. The goal of such efforts is not just to incorporate existing two-dimensional video images into film (e.g., as was done in Forrest Gump) but to allow films and other sorts of entertainment to be based around digital characters that act and perform various roles.
DOD Applications of Computer-generated Characters
DOD has a strong interest in what it calls computer-generated forces (CGFs). CGFs fall into two categories: (1) semiautomated forces (SAFs) that require some direct human involvement to make tactical decisions and to control the activities of the aggregated force and (2) automated forces, which are completely controlled by the computer. Both kinds of computer-generated forces are under development now for military systems and will find extensive application in modeling and simulation as the technology continues to mature.
DOD systems use CGFs for a variety of applications. In systems designed to train individual war fighters, autonomous forces are used to create adversaries for trainees to engage in simulated battles. In large networked training simulations, computer-generated characters are widely used to control opposing forces since it is too expensive to have all the opposing forces controlled by experts in foreign force doctrine. For such scenarios the military has also pioneered the development of SAFs, which are aggregated forces (such as tank platoons, army brigades, or fighter wings, as opposed to individual tanks, soldiers, and aircraft) that require human control at a high level of abstraction. Rather than controlling specific actions of individual elements of the unit, the SAF operator provides strategic direction, such as moving the unit across the river or flying close air support. The computer directs the forces to carry out the command. Hence, SAFs provide a way to train high-ranking commanders who must control the actions of thousands of soldiers on the battlefield. To date, most training tools have been directed toward standalone training simulators and part-task trainers that train individuals, and systems such as SIMNET that train small groups of soldiers to work together on the battlefield.
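The SAF idea above, in which an operator issues a strategic-level order and the computer expands it into the low-level actions of each element of the aggregated unit, can be sketched as follows. The order decomposition shown is invented for illustration and does not reflect any fielded SAF system.

```python
# Sketch of semiautomated forces (SAFs): one high-level order is
# expanded by the computer into per-element action sequences.
# The decomposition table is hypothetical.

def execute_order(unit_elements, order):
    """Expand a strategic order into steps for every unit element."""
    expansions = {
        "cross river": ["move to ford", "cross in column",
                        "reform on far bank"],
        "close air support": ["ingress at low level", "attack target",
                              "egress"],
    }
    steps = expansions[order]
    return {element: list(steps) for element in unit_elements}

platoon = ["tank-1", "tank-2", "tank-3", "tank-4"]
plan = execute_order(platoon, "cross river")
print(plan["tank-2"][0])               # "move to ford"
```

One operator command thus drives many simulated entities, which is what makes SAFs suitable for training commanders who direct thousands of soldiers rather than individual vehicles.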
There is now a diverse and active interest throughout the DOD modeling and simulation community in the development of computer-generated forces. DARPA is sponsoring the development of modular semiautomated forces for the Synthetic Theater of War program, which includes both intelligent forces and command forces. This effort also involves development of a command-and-control simulation interface language, designed for communications among simulated command entities, small units, and virtual platforms. The military services, specifically the Army's Close Combat Tactical Trainer program, are now developing opposing forces and blue forces (friendly forces) to be completed in 1997. The British Ministry of Defence also is developing similar capabilities using command agent technology in a program called Command Agent Support for Unit Movement Facility. DOD has several programs to improve the realism of its automated forces, such as DARPA's Intelligent, Imaginative, Innovative, Interactive What If Simulation System for Advanced Research and Development (I4WISSARD). Academic and industrial interest in this technology led to the First International Conference on Autonomous Agents in February 1997 in Marina del Rey, California.
Common Research Challenges
The challenge for today's researchers is to develop computer-generated characters that model human behavior in activities such as flying a fighter aircraft, driving a tank, or commanding a battalion such that participants cannot tell the difference between a human-controlled force and a computer-controlled force.42 Doing so can help prevent participants from looking for and taking advantage of the gaps in a logic routine instead of developing skills that can be applied with or against other human participants (an old pilot's adage is, "if you ain't cheating, you ain't trying!"). In his 1996 chess match with the IBM computer Deep Blue, for example, Garry Kasparov learned the computer's tendencies during his first losing game and exploited those tendencies, and the computer's lack of adaptability, to come back and win the tournament. He was less successful in his 1997 match because Deep Blue had been programmed to develop strategies like a human and could conceive of and execute moves unanticipated by Kasparov.43
The key is to find ways to implement computer-generated characters that behave convincingly like human participants for extended periods of time. Doing so requires research to develop agents that (1) can adapt their behaviors in response to changes in the behavior of human participants, (2) accurately model the behavior of individual entities (as opposed to aggregated units), and (3) can be easily aggregated and disaggregated.
A significant problem facing the development of automated forces is that humans learn and adapt faster than most existing computer algorithms. To date, games have been short enough that computer-generated forces based on specified scripts or simple "if-then" rules could provide enough of a challenge for most players, but with persistent universes simple rule-based systems will not be good enough to control automated forces. SAFs used in DOD systems are typically based on aggregated behaviors of tank or aircraft crews. Operations are extremely regimented; tank elements, for instance, operate on the basis of the Army's combat instruction sets, which were relatively easy to codify. DOD is also attempting to create automated SAFs with behaviors that can adapt. Efforts in these areas are resulting in a shift from SAFs that are algorithm based to ones based on artificial intelligence (AI). The first experiment was with a system called SOAR, developed by Carnegie Mellon University, the University of Southern California, and the University of Michigan.
Most such work is concentrated on knowledge acquisition by the SAF using AI techniques such as expert systems and case-based reasoning. Expert systems are developed by interviewing experts to discover the rules of thumb used to solve problems and then putting that information into the software in a way the computer can use. Case-based reasoning represents an alternative approach in which SAFs learn from their own successes and failures in simulations. The SAF acquires knowledge from new experiences in real time and adds it to its knowledge base for later use. The knowledge base consists of many cases that the computer matches in real time against the scenario it is facing so that it can respond appropriately. This technique is the basis for machine learning in DOD's I4WISSARD program. Currently, this knowledge acquisition is done manually during an after-action review, but ideally the computer would do it automatically. For example, a pilot could train the computer by dogfighting with it. If the pilot performed a maneuver the computer had never seen, it would add
the trick to its knowledge base and respond appropriately. This could provide a richer training experience for the pilot and make the simulator training useful longer.44
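The case-based approach described above can be sketched in a few lines of Python. The case structure, the situation features, and the similarity measure are illustrative assumptions for this sketch, not any fielded DOD implementation:

```python
from dataclasses import dataclass

@dataclass
class Case:
    """A remembered engagement: the situation seen and the response that worked."""
    situation: tuple   # illustrative features, e.g. (range_km, closing_speed, altitude_delta)
    maneuver: str      # the response that succeeded in that situation
    outcome: float     # score assigned during the after-action review

def similarity(a, b):
    """Negative squared distance: larger means a closer match."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))

class CaseBasedSAF:
    def __init__(self):
        self.case_base = []

    def respond(self, situation):
        """Match the current situation against stored cases; fall back to a default."""
        if not self.case_base:
            return "default_pursuit"
        best = max(self.case_base, key=lambda c: similarity(c.situation, situation))
        return best.maneuver

    def learn(self, situation, maneuver, outcome):
        """After-action step: add a new experience to the knowledge base."""
        self.case_base.append(Case(situation, maneuver, outcome))

saf = CaseBasedSAF()
saf.learn((5.0, 200.0, 1.0), "high_yo_yo", 0.9)   # a maneuver seen to work before
print(saf.respond((5.2, 190.0, 0.8)))             # reuses the closest stored case
```

A real system would also need to generalize across cases and prune the case base, but the match-then-reuse loop is the core of the technique.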
Additional experimentation is under way using complex adaptive system techniques to generate new behaviors. As part of a DARPA program, McDonnell Douglas has developed a system that uses genetic algorithms45 to develop new behaviors and tactics for military simulations. In previous trials, tactics developed by the system were used in simulators and shown to be effective for military operations.
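A genetic algorithm of the kind mentioned can be sketched as follows. Here a "tactic" is a vector of normalized parameters scored by a stand-in fitness function; the encoding, the fitness function, and all numeric settings are invented for illustration, since the McDonnell Douglas system itself is not public:

```python
import random

random.seed(0)  # deterministic sketch

def fitness(tactic):
    # Hypothetical stand-in: reward tactics near an assumed optimum.
    target = [0.7, 0.3, 0.9]
    return -sum((t, g)[0] ** 0 * (t - g) ** 2 for t, g in zip(tactic, target))

def crossover(a, b):
    """Splice two parent tactics at a random cut point."""
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(tactic, rate=0.1):
    """Occasionally perturb a parameter, clamped to [0, 1]."""
    return [min(1.0, max(0.0, x + random.gauss(0, 0.05))) if random.random() < rate else x
            for x in tactic]

population = [[random.random() for _ in range(3)] for _ in range(20)]
for generation in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                      # selection: keep the fittest half
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(10)]
    population = parents + children

best = max(population, key=fitness)
print([round(x, 1) for x in best])  # drifts toward the assumed optimum
```

Selection, crossover, and mutation are the same three operators a tactics-evolution system would use; only the fitness function, which in a military application would score simulated engagement outcomes, carries the domain knowledge.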
Modeling Individual Behaviors
A related challenge is developing computer-generated characters that mimic individual behavior rather than group behavior.46 Group behavior can often be modeled using statistical modeling and rule-based decision processes. The goal is to develop group actions that seem reasonable, but reasonable is an easier test than human-like. Did that automated battalion move its tanks in a reasonable and believable fashion? Did the group react using appropriate doctrine? It is easier to hide errors in decision making with a large group of units than with a computer-generated individual. It is much more difficult to model and create an automated individual that users can communicate with and react to and that is believably human. Within DOD, SAF-level simulation has typically gone as low as an entity like a tank or aircraft. Individual soldiers or crew members have not generally been represented, though work is ongoing in this area. As DOD explores more opportunities for training individual soldiers (dismounted infantry), the need for realistic, intelligent simulated individuals becomes acute.
Research is also needed to develop ways of creating realistic digitized humans that look, move, and express emotion like their real counterparts. DOD has increased its efforts in simulating and modeling individual soldiers in synthetic environments through its Individual Combatant Simulation objective. A joint effort is under way between the U.S. Army's Simulation, Training, and Instrumentation Command (STRICOM) and the Army Research Laboratory. One of the key advances required for developing low-cost solutions to this problem is technology for visualizing human articulation in real-time networked environments. The entertainment industry is building on motion-capture techniques pioneered by DOD for use in developing games and creating special effects. These techniques track the movements of various joints
and extremities as real characters perform a given set of tasks. The data can then be used to create more realistic synthetic effects.
Supporting natural language voice communications also is important. During networked play, voice support is critical for coordinated activities. The computer-generated character must interpret the voice input, react, and acknowledge the user in an equally natural voice. The voice must also convey stress and the emotional state of the computer-generated character. The user must care about the automated forces and arrive at the same conclusions whether the forces under control are human or computer generated. There are many situations in which a player would move a unit into a suicidal situation with a computer-generated character but would choose not to if the force were human.
Aggregation and Disaggregation
Additional research is also needed in the area of aggregation and disaggregation. In large simulations, inactive individual units, such as tanks or soldiers, are often aggregated into higher-level units, such as tank battalions or platoons, to minimize the number of elements the system must track and to allow higher-level control of operations (such as by a field commander). In doing so, information about the individual elements is lost, as only average values of mobility or capability are retained for the aggregated unit. Thus, when the aggregated unit becomes active, the individual elements cannot be disaggregated into their original form. Each tank, for example, will be assigned the average mobility and firepower capabilities of the entire grouping rather than capabilities consistent with those it had before aggregation. Such inconsistencies not only limit the fidelity of the simulation but also generate incongruities among the representations of a simulation perceived by different participants (such as the field commander who sees aggregated levels of capability and tank commanders who see their own disaggregated capabilities). Though not crucial in all simulations or all engagements, such inaccuracies will become more significant as simulations move toward incorporating individual warriors and participants. Improved methods of aggregation and disaggregation that preserve more state information (possibly in a standardized format) could minimize the amount of information the simulation must retain while preserving greater fidelity and consistency.
Another area in which DOD and the entertainment industry have overlapping interests is in developing technology for incorporating spectators into models and simulations. As Jacquelyn Ford Morie noted during the workshop, not everyone involved in digital forms of entertainment will want to be direct participants. Some will prefer to engage as a spectator, similar to sports such as baseball, football, and tennis in which only a small percentage of the participants actually play in a match and much of the industry is built around the fans. Morie believes that "there is a potentially huge market to be developed for providing a substantial and rewarding spectator experience in the digital entertainment realm" (see position paper by Morie in Appendix D). As Morie notes, being a spectator does not necessarily mean being passive; it is about being a participant with anonymity in a crowd, providing a less threatening forum in which people can express themselves.
DOD has already expressed an interest in this type of capability. The role of "stealth vehicles" has become increasingly important in defense simulations. Such vehicles are essentially passive devices that allow observers to navigate in virtual environments, attach to objects in the environments, and view simulated events from the vantage point of the participant. As multiplayer games become more sophisticated and interesting, such a capability may evolve into a spectator facility that will allow novices to observe and learn from master practitioners. Popular games may evolve to the level of current professional sports with teams, stars, schedules, commentators, and spectators.
Tools for Creating Simulated Environments
Another area in which DOD and the entertainment industry have common interests is in the development of software and hardware tools for creating simulated environments. Such tools are used to create and manipulate databases containing information about virtual environments and the objects in them, allowing different types of objects to be placed in a virtual environment and layers of surface textures, lighting, and shading to be added. For games this may be a 3D world that is realistic (such as a flight simulator) or fantastic (like a space adventure), in which an individual interacts directly with the synthetic world and its characters. For film and television, simulated models are often used as primary or secondary elements of scenes that involve real actors, while in other cases the entire story is built around synthetic characters, be they traditional two-dimensional (2D) animations or more advanced 3D animations. For
DOD these worlds are synthetic representations of the battle space (ground, sea, and air) and virtual representations of military systems.
Sophisticated hardware and software tools for efficiently constructing large complex environments are lacking in both the defense and entertainment industries. At the workshop Jack Thorpe of SAIC stated that existing toolsets are quirky and primitive and require substantial training to master, often prohibiting the designer from including all of the attributes desired in a simulated environment (see position paper by Thorpe in Appendix D). Improved tools would help reduce the time and cost of creating simulations by automating some of the tasks that are still done by hand. Alex Seiden, of Industrial Light and Magic, claims that software tools are the single largest area in which attention should be focused. Animators and technical directors for films face daunting challenges as shots become more complicated and new real-time production techniques are developed to model, animate, and render synthetic 3D environments for film and video.
Entertainment Applications and Interests
For digital film and television, special effects and animation are performed during the preproduction and postproduction processes. Preproduction brings together many different disciplines, from visual design to story boarding, modeling to choreography, and even complete storyboard simulation using 2D and 3D animations. Postproduction takes place after all of the content has been created or captured (live or otherwise) and uses 2D and 3D computer graphics techniques for painting, compositing, and editing. Painting enables an editor to clean up frames of the film or video by removing undesirable elements (such as a microphone or prop unintentionally left in the scene, or an aircraft that flew across the sky) or by enhancing existing elements. Compositing systems enable artists to seamlessly combine multiple separate elements, such as 3D models, animations, effects, and digitized live-action images, into a single consistent world. Matched lighting and motion of computer graphics imagery (CGI) are critical if these digital effects are to be convincing.
In the games world the needs for content-creation tools are similar. Real-time 3D games demand that real-world imagery, such as photographic texture maps, be combined quickly and easily with 3D models to create the virtual worlds that players inhabit. In the highly competitive market that computer game companies face, time to market and product quality are major factors (along with quality of game play) in the success of new games. This challenge has been eased somewhat in the past few years as companies have begun offering predefined 3D models and textures that serve as the raw materials that game and production designers can incorporate into their content.
Despite the enormous cost savings that automation of these processes could deliver, entertainment companies invest little in the development of modeling and simulation tools. Most systems are purchased directly from vendors.47 Film production companies using digital techniques and technologies tend to write special-purpose software for each production and then attempt to recycle these tools and applications in their next production. Typically, little time or funding is available for exploring truly innovative technologies. The time lines for productions are short, so long-term investments are rare. The growing use of commercial modeling and animation tools from both the entertainment world (Alias|Wavefront, Softimage, etc.) and DOD simulation (Multigen, Coryphaeus, Paradigm Simulation) is starting to form a bridge between the entertainment industry and DOD.
DOD Applications and Interests
DOD faces an even greater challenge in its modeling and simulation efforts. Because of the large number of participants in defense simulations, the department requires larger virtual environments than the entertainment industry and ones in which users can wander at their own volition (as opposed to traditional filmmaking in which designers need to create only those pieces of geometry and texture that will be seen in the final film). Beyond training simulations, content-creation tools are potentially useful in creating simulations of proposed military systems to support acquisition decisions. DOD could use such models to prototype aircraft, ships, radios, and other military systems. The key would be linking conceptual designs, computer-aided engineering diagrams, analysis models, or training representations into a networked environment that would enable DOD to perform "what if?" analyses of new products. Finding some way to allow these varied types of data to fit into a common data model would greatly facilitate this process.
Like the entertainment industry, DOD lacks affordable production tools to update simulation environments and composite numerous CGI elements. While its compositing techniques are useful and efficient for developing certain types of simulation environments, they cannot handle the complexity demanded by some high-fidelity applications. Some models and simulation terrain must be built and integrated using motion, scale, and other perceptual cues. Here, DOD personnel encounter problems similar to those of entertainment companies that set up, integrate, and alter CGI environments. Human operators can be assisted by appropriate interactive software tools for accomplishing these iterative tasks.
Having better tools to integrate and create realistic environments could play a major role in the overall simulation design of training systems, exploring simulation data, and updating simulation terrain. Interactive tools could empower more individuals to participate in this process and would increase strategic military readiness.
Database Generation and Manipulation
Both the entertainment industry and DOD have a strong interest in developing better tools for the construction, manipulation, and compositing of large databases of information describing the geography, features, and textures of virtual environments. Simulations of aircraft and other vehicles, for example, require hundreds or thousands of terrain databases; filmmakers often need to combine computer-generated images with live-action film to create special effects. Most existing systems for modeling and computer-aided design cannot handle the gigabyte and terabyte data sets needed to construct large virtual worlds. As Internet games companies begin to develop persistent virtual worlds and architectural, planning, and military organizations develop more complete and accurate models of urban environments, the need for software that can create and manipulate large graphics data sets will become more acute. At DOD the data used to create these databases are typically captured in real time from a satellite and must be integrated into a completed database in less than 72 hours to allow rapid mission planning and rehearsal.
Today's modeling tools can be very powerful, allowing users to create real-time models with texture maps and multiple levels of detail using simple menus and icons. Some have higher-level tools for creating large complex features, such as roadways and bridges, using simple parameters and intelligent modeling aids. At the assembly level, new tools use virtual reality technology in the modeling stage to help assemble large complex environments more quickly and intuitively. Still, modeling tools have not gotten to the point of massive automation. There are some automated functions, but overall responsibility for feature extraction, creation, and simplification is in the hands of the modeler. More research is needed in this area.48
Bill Jepson from UCLA is exploring systems for rapidly creating and manipulating large geo-specific databases for urban planning. With a multidisciplinary research team, he has designed a system capable of modeling 4,000 square miles of the Los Angeles region. It uses a client-server architecture in which several multiterabyte databases are stored on a multiprocessor system with a server. Communications between
client and server occur via asynchronous transfer mode, at about 6 megabytes per second. Actual 3D data are sent to the client based on the location of the observer, incorporating projections of the observer's motion. Additional research is under way to link this system with data from the Global Positioning System so that the motions of particular vehicles, such as city buses, can be tracked and transmitted to interested parties. Similar systems could be useful for the Secret Service or the Federal Bureau of Investigation for security planning or for U.S. Special Forces or dismounted infantry training operations in a specific geographic locale. Other work at the University of California, Berkeley, is exploring the automatic extraction of 3D data from 2D images.49 These methods are likely to play a large role in the future in the rapid development of realistic 3D databases.
Another area of possible interest to both the entertainment industry and DOD is in the development of technologies that allow image sequence clips to be stored in a database. This would permit users in both the defense and entertainment communities to rapidly store and retrieve video footage for use in modeling and simulation. A prototype system has been developed by Cinebase, a small company working with Warner Brothers Imaging Technology. Additional development is required to make the technology more robust and widely deployable.
Additional efforts to develop more standardized formats for storing the information contained in 3D simulated environments would be beneficial to both DOD and the entertainment industry. A standard format could be developed that allows behaviors, textures, sounds, and some forms of code to be stored with an object in a persistent database. Such efforts could build on the evolving VRML standard. The goal is to devise a common method for preserving and sharing the information inherent in 3D scenes prior to rendering.50
Both DOD and the entertainment industry are interested in software tools that will facilitate the process of combining (or compositing) visual images from different sources. Such tools must support hierarchy and building at multiple levels of detail: they must allow a user to shape hills, mountains, lakes, rivers, and roads as well as place small items, such as individual mailboxes, and paint words on individual signs. They must also allow designers to develop simulated environments in pieces that can be seamlessly linked together into a single universe. This need will become more acute as the scale of distributed simulations grows. Existing computer-aided design tools do not have the ability to easily
add environmental features, such as rain, dust, wind, storm clouds, and lightning, to a simulated scene.
There are many unsolved compositing problems in pre- and postproduction work for filmmaking that are directly related to simulation and modeling challenges. For example, a need exists for postproduced light models for digital scenes and environments. To create appropriate lighting for composited realistic live-action scenes, lighting models must affect digitized images that were captured under variable lighting conditions. Such a simulation problem is encountered when realistic photographic data are composited into simulation data and the lighting must be interactively adjusted from daylight to night during persistent simulations. Here, it is necessary to develop lighting models that image-process photographic data to provide postproduced lighting adjustments after scenes have been captured. Solutions to these problems do not exist, yet the research would be applicable to both the entertainment industry and DOD.
Opportunities may exist for DOD and the entertainment industry to share some of the advances they have made in designing systems for creating models and simulation. DOD might be able to use some of the advanced compositing techniques that have been developed by the entertainment industry to integrate live-action video with computer graphics models. The entertainment industry's software techniques for matching motion and seamlessly integrating simulated scenes into a virtual environment might also be beneficial to DOD. However, most entertainment software is extremely proprietary. It will be necessary to address proprietary issues and methods of information exchange before extensive collaboration can occur between the entertainment industry and DOD. Conversely, some DOD technologies might prove to be very beneficial for entertainment applications as well. At the workshop, Dell Lunceford, of DARPA, suggested that some of the technologies developed as part of DOD's Modular Semiautonomous Forces (ModSAF) program might be useful in creating some of the line drawings used in preproduction stages of filmmaking. ModSAF cannot support the detailed graphical animation needed for facial expressions, but it could facilitate the simpler earlier stages of production in which characters are outlined and a story's flow is tested.
Interactive tools that facilitate the creation of simulations and models and that can be used for real data exploration could be valuable to both the entertainment industry and DOD. The computer mouse and keyboard are extremely limited when creating CGI scenes, and individuals
are often impaired or constrained by these traditional input devices. A recent project of the National Center for Supercomputing Applications located at the University of Illinois at Urbana-Champaign resulted in an interactive virtual reality interface to control the computer graphics camera in 3D simulation space. The project created an alternative virtual reality computer system, the Virtual Director, to enhance human operator control and to capture, edit, and record camera motion in real time through high-bandwidth simulation data for film and video recording. This interactive software was used to create the camera choreography of large astrophysical simulation data sets for special effects in the IMAX movie Cosmic Voyage. This project has proven to be valuable for film production as well as scientific visualization. Such uses of alternative input devices to explore and document very large data sets are nonexistent in commercial production because of the time line required to develop such technology, yet this type of tool is extremely important for solving many problems in the entertainment industry as well as in DOD simulation and modeling.
As this chapter illustrates, the defense modeling and simulation community and the entertainment industry have common interests in a number of underlying technologies ranging from computer-generated characters to hardware to immersive interfaces. Enabling the two communities to better leverage their comparative strengths and capabilities will require that many obstacles be overcome. Traditionally, the two communities have tended to operate independently of one another, developing their own end systems and supporting technologies. Moreover, each community has developed its own modes of operation and must respond to a different set of incentives. Finding ways to overcome these barriers will present challenges on a par with the research challenges identified in this chapter.
1. For a more comprehensive review of research requirements for virtual reality, see National Research Council. 1995. Virtual Reality: Scientific and Technological Challenges, Nathaniel I. Durlach and Anne S. Mavor, eds. National Academy Press, Washington, D.C.
2. DOD has several ongoing programs to extend the military's command, control, communications, computing, intelligence, surveillance, and reconnaissance systems to the dismounted combatant. These include the Defense Advanced Research Projects Agency's Small Unit Operations Program, Sea Dragon, Force XXI, and Army After Next.
3. Latency is not the only factor that causes simulator sickness, and even completely
eliminating latency will not eliminate simulator sickness. See position paper by Eugenia M. Kolasinski in Appendix D.
4. This subsection is derived from a position paper prepared for this project by the Defense Modeling and Simulation Office; see Appendix D.
5. Sheridan, T.B. 1992. Telerobotics, Automation, and Human Supervisory Control. MIT Press, Cambridge, Mass.
6. For a more complete description of the SIMNET program see Van Atta, Richard, et al., 1991, DARPA Technical Accomplishments, Volume II: An Historical Review of Selected DARPA Projects, Institute for Defense Analyses, Alexandria, Va., Chapter 16; and U.S. Congress, Office of Technology Assessment, 1995, Distributed Interactive Simulation of Combat, OTA-BP-ISS-151. U.S. Government Printing Office, Washington, D.C., September.
7. U.S. Congress, Office of Technology Assessment, Distributed Interactive Simulation of Combat, p. 32, note 6 above.
8. Gilman Louie, Spectrum Holobyte Inc., personal communication, June 19, 1996.
9. Pausch, Randy, et al. 1996. "Disney's Aladdin: First Steps Toward Storytelling in Virtual Reality," ACM SIGGRAPH '96 Conference Proceedings: Computer Graphics. Association for Computing Machinery, New York, August.
10. RTime Inc. introduced an Internet-based game system in April 1997 that supports 100 simultaneous players and spectators. See RTIME News, Vol. 1, February 1, 1997.
11. The National Research Council's Computer Science and Telecommunications Board has another project under way to examine the extent to which DOD may be able to make better use of commercial technologies for wireless untethered communications. A final report is expected in fall 1997. Another project to examine DOD command, control, communications, computing, and intelligence systems was initiated in spring 1997.
12. Specifications for implementing multicast protocols over the Internet are outlined by S.E. Deering in "Host Extensions for IP Multicasting," RFC 1112, August 1, 1989, available on-line at http://globecom.net/ietf/rfc1112.html. See also Braudes, R., and S. Zabele, "Requirements for Multicast Protocols," RFC 1458, May 1993.
13. As such, multicast stands in contrast to broadcast, in which one designated source sends information to all members of the receiving community, and to unicast systems in which a sender transmits a message to a single recipient.
14. This capability is called routing spaces. It will permit objects to establish publish regions to indicate areas of influence and subscription regions to indicate areas of interest. When publish and subscription regions overlap, the RTI will cause data to flow between the publishers and the subscribers. The goal of this effort, and the larger Data Distribution Management Project, of which it is part, is to reduce network communications by sending data only when and where needed. See Defense Modeling and Simulation Office, HLA Data Distribution Management: Design Document Version 0.5, Feb. 10, 1997; available on-line at http://www.dmso.mil/projects/hla/.
15. Internet Engineering Task Force, "Large Scale Multicast Applications (lsma) Charter," available on-line at http://www.ietf.org/html.charters/lsma-charter.html.
16. Much of the material in this section is derived from a position paper prepared for this project by Will Harvey of Sandcastle Inc.; see Appendix D.
17. Deployment of a new algorithm for queue management, called Random Early Detection, may help greatly reduce queuing delays across the Internet.
18. Floyd, S., and V. Jacobson. 1993. "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking 1(4):397-413; Wroclawski, J. 1996. "Specification of the Controlled-Load Network Element Service," available on-line as ftp://ftp.ietf.org/internet-drafts/draft-ietf-intserv-ctrl-load-svc-03.txt.
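The Random Early Detection scheme cited above can be summarized in a few lines: the gateway tracks an exponentially weighted moving average of queue length and drops arriving packets with a probability that rises linearly between two thresholds. The sketch below is a simplified illustration; the parameter values are arbitrary, not the settings recommended by Floyd and Jacobson.

```python
import random

# Minimal sketch of Random Early Detection (RED) drop logic.
MIN_TH, MAX_TH = 5.0, 15.0   # queue-length thresholds (packets), illustrative
MAX_P = 0.1                  # maximum drop probability at MAX_TH
WEIGHT = 0.002               # EWMA weight for the average queue size

avg = 0.0

def red_drop(current_queue_len):
    """Return True if the arriving packet should be dropped."""
    global avg
    avg = (1 - WEIGHT) * avg + WEIGHT * current_queue_len
    if avg < MIN_TH:
        return False          # queue short: accept everything
    if avg >= MAX_TH:
        return True           # queue long: drop everything
    # In between, drop probability grows linearly with the average.
    p = MAX_P * (avg - MIN_TH) / (MAX_TH - MIN_TH)
    return random.random() < p
```

Dropping probabilistically before the queue is full signals congestion to senders early, which is how the scheme reduces sustained queuing delay.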
19. Clark, D. 1996. "Adding Service Discrimination to the Internet," Telecommunications Policy 20(3):169-181.
20. Sandcastle Inc., an Internet-based game company, is one source of research on synchronization techniques.
21. DOD defines modeling and simulation interoperability as the ability of a model or simulation to provide services to and accept services from other models and simulations and to use the services so exchanged to enable them to operate effectively together. See U.S. Department of Defense Directive 5000.59, "DOD Modeling and Simulation (M&S) Management," January 4, 1994, and U.S. Department of Defense, Under Secretary of Defense for Acquisition and Technology, Modeling and Simulation (M&S) Master Plan, DOD 5000.59-P, October 1995.
22. All participants in a simulation do not need an identical representation of the environment. Individual combatants, for example, will differ from fighter pilots in the amount of terrain they can see and the sensor data (radar, infrared, etc.) available to them. The key is ensuring that their views of the environment are consistent with one another (e.g., that all players would agree that a given line of trees obstructs the line of sight between two participants in the simulation).
23. DIS conveys simulation state and event information via approximately 29 PDUs. Four of these PDUs describe interactions between entities such as tanks and personnel carriers; the remainder transmit information on supporting actions, electronic emanations, and simulation control. The entity state PDU is used to communicate information about a vehicle's current position, orientation, velocity, and appearance. The fire PDU contains data on weapons or ordnance that are fired or dropped. The detonation PDU is sent when a munition detonates or an entity crashes. The collision PDU is sent when two entities physically collide. The structure of each PDU is regimented and changed only after testing and subsequent discussion at the biannual DIS workshops convened by the Institute for Simulation and Training at the University of Central Florida.
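The packing pattern behind a PDU such as the entity state PDU can be illustrated as follows. This is a deliberately simplified sketch, not the actual IEEE 1278 field layout: the real entity state PDU carries many more fields (entity type, appearance, dead-reckoning parameters, and so on), and the format string here is an assumption for illustration only.

```python
import struct

# Simplified illustration of encoding entity state (id, position,
# orientation, velocity) into a fixed binary layout, as a DIS PDU does.
ENTITY_STATE_FMT = ">I3d3f3f"  # id; x/y/z; heading/pitch/roll; vx/vy/vz

def pack_entity_state(entity_id, position, orientation, velocity):
    return struct.pack(ENTITY_STATE_FMT, entity_id,
                       *position, *orientation, *velocity)

def unpack_entity_state(data):
    fields = struct.unpack(ENTITY_STATE_FMT, data)
    return fields[0], fields[1:4], fields[4:7], fields[7:10]

pdu = pack_entity_state(42, (10.0, 20.0, 0.0), (1.5, 0.0, 0.0),
                        (5.0, 0.0, 0.0))
```

Because every receiver knows the fixed layout, any simulator on the network can decode the broadcast state without negotiation, which is what makes the regimented PDU structure valuable for interoperability.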
24. Macedonia, Michael R. 1995. "A Network Software Architecture for Large-Scale Virtual Environments." Ph.D. dissertation, Naval Postgraduate School, June; available from the Defense Technical Information Center, Fort Belvoir, Va.
25. Defense Modeling and Simulation Office, HLA Management Plan: High-Level Architecture for Modeling and Simulation, Version 1.7, April 1, 1996.
26. The Navy alone has over 1,200 simulation systems that do not currently comply with HLA. A compliance monitoring reporting requirement and waiver process, similar to the Ada waiver process, were put into place. Each affected service is to fund retrofits of its simulation systems from its own budget.
27. Ordering information is available on the DMSO Web site at http://www.dmso.mil.
28. The Computer Science and Telecommunications Board workshop provided an opportunity for representatives from Internet game companies to learn more about HLA. Several agreed to review the specifications to see if they would be applicable to them.
29. Latham, Roy. 1996. "DIS Workshop in Transition to. . . What?," Real Time Graphics 5(4):4-5.
30. National Research Council. 1995. Virtual Reality: Scientific and Technological Challenges, Nathaniel I. Durlach and Anne S. Mavor, eds. National Academy Press, Washington, D.C.
31. Macedonia, Michael R., et al. 1995. "Exploiting Reality with Multicast Groups," IEEE Computer Graphics & Applications, September, pp. 38-45.
32. Brutzman, Don, Michael Zyda, and Michael Macedonia. 1996. "Cyberspace Backbone (CBone) Design Rationale," paper 96-15-99 in Proceedings of the 15th Workshop on Standards for DIS, Institute for Simulation and Training, Orlando, Florida; Brutzman, Don, Michael Zyda, Kent Watsen, and Michael Macedonia. 1997. "Virtual Reality Transfer Protocol (vrtp) Design Rationale," accepted for the Proceedings of the IEEE Sixth International Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE '97), held June 18-20, 1997, at the Massachusetts Institute of Technology, Cambridge, Mass.
33. Brutzman et al., 1996, "Cyberspace Backbone (CBone) Design Rationale," and Brutzman et al., 1997, "Virtual Reality Transfer Protocol (vrtp) Design Rationale," note 32 above.
34. Macedonia, "Exploiting Reality with Multicast Groups," note 31 above.
35. A standard 14.4-kilobit-per-second modem can transmit or receive a standard DIS packet in approximately 80 milliseconds, meaning that only about five players can participate in a real-time interactive game if each must send and receive messages (updating positions, velocities, etc.) to and from each other player at each stage in the game and latencies must be kept below 100 milliseconds.
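The 80-millisecond figure in the note above follows directly from the arithmetic, assuming the standard DIS entity state PDU size of roughly 144 bytes (the PDU size is this sketch's assumption; the modem rate and latency budget come from the note):

```python
# Back-of-the-envelope check of the transmission time quoted in the note.
pdu_bits = 144 * 8           # assumed DIS entity state PDU size, in bits
modem_bps = 14400            # 14.4-kilobit-per-second modem
transmit_ms = 1000 * pdu_bits / modem_bps
print(round(transmit_ms))    # prints 80
```

At 80 ms per packet, a 100-ms latency budget leaves room to serialize only a handful of per-player updates, which is why the player count is so sharply limited over dial-up links.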
36. From this perspective the code base is like the way a television studio thinks of its sets, props, and sound stages. The code base needs to be data driven so that new episodes can be created in less than a week instead of a couple of years. Programming will be developed using scripting tools that allow writers and designers to quickly develop new stories. These tools will be important to help the writers and designers not only create new environments but also direct automated units and characters to "perform" new roles for the new scenarios.
37. For example, a player may be flying an F-15 along with a wingman when a pair of enemy MiGs engages them in battle. As a player breaks into a turn, he or she may realize that the wingman has disconnected (intentionally or unintentionally) from the game.
38. The Aladdin attraction is something of an anomaly in that Walt Disney Imagineering approached it not only as a theme park attraction but also as scholarship. It published results of its research in the open literature. See Pausch, Randy, et al. 1996. "Disney's Aladdin: First Steps Towards Storytelling in Virtual Reality," ACM SIGGRAPH '96 Conference Proceedings: Computer Graphics. Association for Computing Machinery, New York, pp. 193-203.
39. Fryer, Bronwyn, "Hollywood Goes Digital," available on-line at http://zeppo.cnet.com/content/Features/Dlife/index.html.
40. Ditlea, Steve. 1996. "'Virtual Humans' Raise Legal Issues and Primal Fears," New York Times, June 19; available on-line at http://www.nytimes.com/library/cyber/week/0619humanoid.html.
41. Magnenat Thalmann, N., and D. Thalmann. 1995. "Digital Actors for Interactive Television," Proceedings of the IEEE, August.
42. An agent that could meet this requirement would satisfy the "Turing test." Alan Turing, a British mathematician and computer scientist, proposed a simple test to measure the ability of computers to display intelligent behavior. A user carries on an extended computer-based interaction (such as a discussion) with two unidentified respondents, one a human and the other a computer. If the user cannot distinguish between the human and the computer responses, the computer is declared to have passed the Turing test and to display intelligent behavior.
43. Chandrasekaran, Rajiv. 1997. "For Chess World, A Deep Blue Sunday: Computer Crushes Kasparov in Final Game," Washington Post, May 12, p. A1.
44. U.S. Congress, Office of Technology Assessment. 1995. Distributed Interactive Simulation of Combat, OTA-BP-ISS-151. U.S. Government Printing Office, Washington, D.C., September, pp. 123-125.
45. Genetic algorithms are computer programs that evolve over time in a process that mimics biological evolution. They can evolve new computer programs through processes analogous to mutation, cross-fertilization, and natural selection. See Holland, John H. 1992. "Genetic Algorithms," Scientific American, July, pp. 66-72.
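The evolutionary loop described in the note, selection of the fittest individuals, crossover between parents, and random mutation, can be illustrated with a toy example that evolves bit strings toward all 1s. Every parameter here (population size, mutation rate, fitness function) is an arbitrary illustration, not drawn from the cited work.

```python
import random

# Toy genetic algorithm: evolve 16-bit strings to maximize the count of 1s.
random.seed(0)
GENES, POP, GENERATIONS = 16, 20, 60

def fitness(ind):
    return sum(ind)                    # count of 1 bits

def crossover(a, b):
    cut = random.randrange(1, GENES)   # cross-fertilization at a random point
    return a[:cut] + b[cut:]

def mutate(ind, rate=0.05):
    return [g ^ 1 if random.random() < rate else g for g in ind]

pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
for _ in range(GENERATIONS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]           # natural selection: keep the fittest
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(POP - len(parents))]
    pop = parents + children

best = max(pop, key=fitness)
```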
46. The National Research Council is conducting another project on the representation of human behaviors in military simulations. See National Research Council. 1997. Representing Human Behavior in Military Simulations: Interim Report, Richard W. Pew and Anne S. Mavor, eds. National Academy Press, Washington, D.C.
47. Paul Lypaczewski of Alias | Wavefront estimates that the market for off-the-shelf modeling and simulation tools is about $500 million per year.
48. See National Research Council, Virtual Reality, note 30 above.
49. Debevec, P.E., C.J. Taylor, and J. Malik. 1996. "Modeling and Rendering Architecture from Photographs: A Hybrid Geometry- and Image-based Approach," Proceedings of SIGGRAPH '96: Computer Graphics. Association for Computing Machinery, New York, pp. 11-20.
50. Rendering is the process of generating image frames from descriptions of objects and their motion so that they can be displayed by the computer or image generator.