A grand challenge for science is to understand the human implications of global environmental change and to help society cope with those changes. Virtually all the scientific questions associated with this challenge depend on geospatial information (geoinformation) and on the ability of scientists, working individually and in groups, to interact with that information in flexible and increasingly complex ways. Another grand challenge is how to respond to calamities—terrorist activities, other human-induced crises, and natural disasters. Much of the information that underpins emergency preparedness, response, recovery, and mitigation is geospatial in nature. In terrorist situations, for example, origins and destinations of phone calls and e-mail messages, travel patterns of individuals, dispersal patterns of airborne chemicals, assessment of places at risk, and the allocation of resources all involve geospatial information. Much of the work addressing environment- and emergency-related concerns will depend on how productively humans are able to integrate, distill, and correlate a wide range of seemingly unrelated information. In addition to critical advances in location-aware computing, databases, and data mining methods, advances in the human-computer interface will couple new computational capabilities with human cognitive capabilities.
This report outlines an interdisciplinary research roadmap at the intersection of computer science and geospatial information science. The report was developed by a committee convened by the Computer Science and Telecommunications Board of the National Research Council, in response to requests from the National Science Foundation, the National Aeronautics and Space Administration, and the Environmental Protection Agency. The scenarios and examples in the report illustrate the exciting opportunities opening up as research enhances the accessibility and usability of geospatial information. The recommendations for research investments were derived from suggestions presented at a workshop (details are available in the Preface and in Appendix B), white papers submitted by workshop participants, and input from several outside experts.
WHY RESEARCH IS NEEDED NOW
Imagine a world in which geospatial information is available to all who need it (and who have permission to use it) in a timely fashion, with a user-friendly interface and (if wanted) integrated in a real-world context. As the volume of geospatial data (geodata) increases by several orders of magnitude over the next decade, so will the potential for corresponding advances in knowledge of our natural world and in our ability to react to the changes taking place. The information distilled from these data will enable more productive environmental and social science, better business decisions, more effective urban and regional planning and environmental management, and better-informed policy making at all levels, from the local to the global.
The evolution of location sensing, mobile computing, and wireless networking is creating a new class of computing. Location-aware computing systems behave differently according to the user’s location. They operate either spontaneously (e.g., warning of a nearby hazard) or when activated by a user request (e.g., where is the nearest police station?). Sensors that record their location and some information about the surrounding environment (e.g., temperature and humidity) are being deployed to monitor the state of roads, buildings, agricultural crops, and many other objects. For example, Smart Dust sensors (devices that combine microelectromechanical sensors with wireless communication, processing, and batteries into a package about a cubic millimeter in size) can be deployed along remote mountain roads to determine the velocity and direction of passing vehicles or can be attached to animals to record where they travel. The data transmitted wirelessly in real time from such location-sensing devices are growing not only in volume but also in complexity. Advances in location-aware computing could greatly affect how geospatial data are acquired, how and with what quality they can be delivered, and how mobile and geographically distributed systems are designed.
Our ability to generate new geospatial data already outpaces our ability to organize and analyze them. To address this situation, the technologies for geospatial databases and data mining must advance as well. Integration of geospatial data is problematic owing to the myriad formats, conventions, and semantics. Current database technologies are limited in their ability to represent spatiotemporal objects (e.g., objects that move and evolve over time, sometimes appearing or disappearing at irregular intervals). There are similar problems with the analysis and evaluation of geospatial data, because methods from data mining have so far been based largely on transactional and documentary data, not on complex, highly dimensioned data representing objects that may be undergoing continuous change.
The sheer volume and complexity of geospatial information create two paradoxes. First, the very people who could leverage this information most effectively, such as policy makers and emergency response teams, often cannot find it or use it because they are not specialists in geospatial information technology. Second, as the availability of the needed information grows, so, too, will the difficulty of using that information effectively. New technologies are needed to support human interaction with geospatial information. More specifically, technologies should be devised that can help individuals and groups access such information, visually explore and construct knowledge from it, and apply the knowledge to critical problems facing both science and society. Ways should be found of adapting those technologies to the needs of ordinary citizens of all ages, interests, and physical abilities (vision, manual dexterity, etc.) as well as all degrees of familiarity with computers and databases.
The committee translated its key findings and conclusions into a number of research themes. Some of the research would address issues raised by the intrinsic characteristics of geospatial data. Other research would have broader applicability in computer science, but applications involving geospatial data also would benefit significantly from advances in this area. All research would aim at improving the performance, accessibility, and usability of geospatial information. Recommendations for research are summarized below, in roughly the order they appear in the report. Two overarching issues are presented first.
Research at the Intersection of Information Technology and Geospatial Science
To make any significant progress in geospatial applications, the research community must adopt an integrative, multidisciplinary approach. One of the greatest hindrances to benefiting from the massive amounts of geospatial data already being collected is the fragmented nature of current research efforts. Most of the research on the accessibility, analysis, and use of geospatial data has been conducted in isolation within single disciplines (e.g., computer science, geography, statistics, environmental science) or within subdisciplines of computer science (e.g., databases, theory, algorithms, visualization, human-computer interfaces). A multidisciplinary approach would make it more likely that the right research problems are identified and that they are addressed in ways that will respond to the most pressing needs.
As geospatial information becomes more and more widely used, major ethical, legal, and sociological concerns are likely to arise. They include such concerns as the liability of providers of data and software tools, intellectual property rights, the rules that should govern information access and use, and the protection of privacy. When government agencies or programs have collected geodata, there may be additional constraints—for instance, the protection of national security, limitations on the release of data that could compete with commercially provided data, and the cost of preparing data for public release. Moreover, policies and capabilities may vary considerably from place to place or country to country. It is not clear whether policy and technical mechanisms can be coordinated so as to realize the potential benefits of geospatial information while avoiding undesirable social costs.
Coordination would also be valuable in the matter of funding, perhaps through federal government support for an open framework for geospatial data. The committee believes that location information needs to become a public commodity to motivate scenarios such as the ones outlined in Chapter 1. However, the policy and social implications of improving the accessibility of geospatial information will depend on complementary mechanisms, such as advanced technical support for reliable user identification and authentication to guarantee privacy and security. The committee believes that an in-depth analysis is needed of the policy and social implications of the increased availability of geospatial information for data acquisition, access, and retention policies and practices.
Accessible Location-Sensing Infrastructure
Advances in the technologies for data acquisition and data access are enabling more and more applications of geospatial data and location-based services. The Global Positioning System and other localization technologies, wireless communication, and mobile computing are key components. Although progress has been made in these areas, a significant amount of additional research is needed before location-aware computing can become commercially viable. The emergence of new commercial infrastructure will drive new kinds of research, which in turn will lead to new commercial opportunities.
The development of innovative applications that use location sensing will foster location-aware computing. Key research opportunities include the development of common standards for location-sensing application programming interfaces (APIs); techniques for reducing the costs of deploying and managing location-sensing infrastructure; the development of platform-independent descriptions of capabilities for mobile clients, servers, and middleware; scalability; static-mobile load balancing; adaptive resource management; and the mediation of requests. The creation of an open, widely deployed test bed could enable collaborative research on location-aware computing infrastructure. Such an effort would be similar to the early research efforts that led to the Internet and the World Wide Web.
Freeing users from desktop computers and physical connections to a network will bring geospatial information into real-world contexts and could revolutionize how humans interact with the world around them. The ability to obtain information on demand, wherever a user happens to be, cannot be realized without new technologies and methods specifically accommodating user mobility. Mobile environments typically will be resource-poor and physically constrained and will exhibit variable and unpredictable intensities of resource use. Research is required to develop adaptation techniques that will allow applications to degrade gracefully when resources such as bandwidth or battery power become scarce.
The ability to manage information about the availability of resources based on proximity is an enabling technology for many of the applications discussed in this report. It would include reliable and cost-effective techniques for discovering resources as they come in and out of service, partitioning and off-loading computation, and delivering information to caching sites near current or predicted future locations. Protocols and mechanisms to authenticate and certify the location of an individual at any given time also will be required, as will adaptation techniques for handling situations when location information becomes stale, is unavailable, or is deliberately withheld. Query language extensions will be required to allow applications to refer to future events and to support automatic triggers, such as “car navigator should inform the driver when a hospital is within 10 miles.”
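The kind of proximity-based trigger described above ("inform the driver when a hospital is within 10 miles") can be illustrated with a small sketch. The hospital name, coordinates, and `ProximityTrigger` interface below are hypothetical, chosen only to show the pattern of evaluating a standing spatial condition against a stream of position fixes:

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    # Great-circle distance in miles between two (lat, lon) points.
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

class ProximityTrigger:
    """Fires a callback once per target when the user comes within radius_miles of it."""
    def __init__(self, targets, radius_miles, callback):
        self.targets = targets          # list of (name, lat, lon) tuples
        self.radius = radius_miles
        self.callback = callback
        self.fired = set()              # targets already announced

    def update(self, lat, lon):
        # Called on each new position fix from the location sensor.
        for name, tlat, tlon in self.targets:
            if name not in self.fired and \
               haversine_miles(lat, lon, tlat, tlon) <= self.radius:
                self.fired.add(name)
                self.callback(name)

# Hypothetical hospital and route; coordinates are illustrative only.
alerts = []
trigger = ProximityTrigger([("County General", 40.75, -73.99)], 10.0, alerts.append)
trigger.update(41.20, -74.30)   # well outside 10 miles: no alert
trigger.update(40.80, -74.05)   # within 10 miles: alert fires
print(alerts)
```

A production system would, as the text notes, push this evaluation into the query language and indexing layer of the database rather than scanning targets on every fix.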
Geospatial Data Models and Algorithms
Existing database techniques do a poor job of representing the complexities of geographic objects and relationships. Discrete representations for objects that span a region in space and time are inadequate and can result in inconsistencies and uncertainties. Data models, query languages, indexes, and algorithms must be extended to handle more complex geometric objects, such as objects that move and evolve continuously over time. Integrating the temporal characteristics of a geographic object into a spatiotemporal database is challenging. Research is needed to develop query languages that can reference not only the past known locations of objects but also their predicted future locations. Novel indexing schemes must be developed that can handle properties of geospatial data such as continuous evolution and uncertainty.
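A common simplification in moving-object databases, which the query-language research above would generalize, is to store each object's last fix plus a velocity vector and answer queries about predicted future positions by extrapolation. The sketch below assumes constant velocity; the fleet names and coordinates are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class MovingPoint:
    # Last known fix plus a velocity vector; future positions are extrapolated.
    x: float
    y: float
    vx: float   # units per second
    vy: float
    t0: float   # time of the fix, in seconds

    def position_at(self, t):
        # Linear extrapolation: valid only under a constant-velocity assumption.
        dt = t - self.t0
        return (self.x + self.vx * dt, self.y + self.vy * dt)

def predicted_range_query(objects, t, xmin, xmax, ymin, ymax):
    """Return names of objects whose *predicted* position at time t lies in the box."""
    hits = []
    for name, obj in objects.items():
        px, py = obj.position_at(t)
        if xmin <= px <= xmax and ymin <= py <= ymax:
            hits.append(name)
    return hits

fleet = {
    "truck_a": MovingPoint(0.0, 0.0, 1.0, 0.0, t0=0.0),   # heading east
    "truck_b": MovingPoint(0.0, 5.0, 0.0, -1.0, t0=0.0),  # heading south
}
print(predicted_range_query(fleet, t=10.0, xmin=8, xmax=12, ymin=-6, ymax=2))
```

The research challenge is precisely that real objects violate the constant-velocity assumption, so predicted answers carry uncertainty that both the query language and the index must represent.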
Advances must be made in algorithms for geospatial data as well. Most algorithm research is conducted in a theoretical framework. Perfect data are assumed, so the algorithms may not function correctly and efficiently in real-world geodata applications. Cache-oblivious and I/O-efficient algorithms have the potential to solve complicated problems using massive geospatial data sets more efficiently. Another area for research investment is kinetic data structures, which could efficiently represent objects that move and evolve continuously.
There is no formal, comprehensive semantic model of geospatial information. The development of formalized ontological frameworks for geospatial phenomena is a critical area for multidisciplinary research investments. The integration of geospatial information would benefit (as would the many sciences that use such information to link their objectives) from an approach that focuses on defining the essential intersections (i.e., the concepts they share). The committee believes the development of domain-specific ontologies would be beneficial, as well as tools for maintaining them and for reconciling them where they overlap. Research also is needed to capture, represent, and effectively communicate the dynamic semantics of geospatial data to the users of the data.
The integration (or conflation) of geospatial information from multiple sources—often with varied formats, semantics, precision, and coordinate systems—is an important research topic. A key issue for integrating spatial data is a formal method that bridges disparate ontologies (e.g., by using spatiotemporal association properties to relate categories from different ontologies) to make such knowledge explicit in forms that would be useful to other disciplines. Handling imprecision and uncertainty is an equally important research topic: for geospatial data integration in particular, different data sets may be described with different types and degrees of inaccuracy and imprecision, which can seriously compromise the integration of information.

Geospatial Data Mining
Many spatiotemporal data sets contain complex data that exhibit very high dimensionality and spatial autocorrelation. Applying traditional data mining techniques to geospatial data can lead to patterns that are biased or that do not fit the data well. A key challenge in improving the accessibility and usability of geospatial information is to develop a software system that could assist the human expert in the data mining process by locating relevant spatiotemporal data sets, process models, and data mining algorithms; identifying appropriate fits; performing conversions when necessary; applying the models and algorithms; and reporting the resulting patterns (e.g., correlations, regularities, outliers).
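The spatial autocorrelation that sets geospatial data apart from transactional data can be quantified; one standard statistic is Moran's I, sketched below for a toy example. The chain of four sites and their values are invented purely to show positive (clustered) versus negative (dispersed) autocorrelation:

```python
def morans_i(values, weights):
    """Moran's I for values observed at n sites.

    weights[i][j] is the spatial weight between sites i and j
    (here 1 for neighbours, 0 otherwise). Roughly: values near +1
    indicate clustering, near 0 randomness, negative values dispersion.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Four sites along a line; each site neighbours the next.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
print(round(morans_i([1, 1, 5, 5], chain), 3))  # like values adjacent: positive
print(round(morans_i([1, 5, 1, 5], chain), 3))  # alternating values: negative
```

Classical data mining methods that assume independent samples are biased exactly when statistics like this one are far from zero, which motivates the spatially aware techniques called for below.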
Research investments will be required to develop dimensionality reduction methods that are scalable, robust, and nonlinear. Moreover, few current data mining algorithms can handle temporal dimensions, and even fewer can accommodate spatial objects other than points. Research must be directed at new techniques that will be capable of finding patterns in complex geospatial objects that move, change shape, evolve, and appear/disappear over time. To be widely accessible and useful, the results must be reported in a language that requires only minimal statistical and information technology expertise.
Geospatial Interaction Technologies
Increases in data resolution, volume, and complexity can overwhelm human capacities to process information using traditional visual display and interface devices. Recent advances in display and interaction technologies are encouraging, but the resolutions of desktop and mobile systems are still far below what is needed for the kinds of scenarios described in this report. Inexpensive, large-screen, high-resolution display devices are needed, at prices affordable by classrooms, science laboratories, regional planning offices, and so forth. Mobile display technology also must be advanced significantly.
A key barrier to progress has been the absence of a comprehensive framework for understanding human interaction with geospatial information that cuts across technological and disciplinary boundaries. The development of such a framework constitutes a significant challenge. What makes such a framework difficult but essential is that geospatial information comprises such a wide range of phenomena and their characteristics. This range includes continuous fields that are visible (terrain) and invisible (temperature), objects that are constructed (buildings) and natural (lakes), abstract features that are precise (political boundaries) and imprecise (forests), as well as ill-defined concepts such as drought, poverty, disease clusters, or climate anomalies.
Basic research also is needed to address the larger issue of information perceptualization—that is, how to represent extremely complex information using surface texture and sound as well as visual attributes. Methods and algorithms are needed that support more natural and direct manipulation of high-resolution displays of very large data sets and of complex process models in real time. Another challenge is to devise richer representation of the uncertainty in geospatial data sets that incorporates spatial autocorrelation. Even the most basic concepts, such as what the appropriate balance might be between realism and abstraction, have not been established. Yet clear guidelines are needed if we are to be successful in depicting highly complex, multivariate, multiscale, time-varying geospatial information in ways that facilitate human understanding.
Geospatial for Everyone, Everywhere
As geodata become widely available, the new technologies must be adapted to the needs of ordinary citizens. Providing more people with access to the vast geospatial resources being assembled by government and private organizations could mean a much better informed citizenry, with attendant benefits for public policy. Research is needed to determine what kinds of metadata will be most useful for general access, how to generate them in a comprehensive way, and how to present them most effectively. Investments also are needed to develop intelligent interfaces that can accommodate the requirements of particular sets of users. Such interfaces would adapt themselves to user needs, remember how to find information when it is needed again, and become smarter over time at anticipating user needs and requests.
New technologies and methods will have to be devised to accommodate user mobility if people are to obtain information on demand wherever they happen to be. This requires not just flexible and cost-effective mobile devices, but also context-sensitive representation of geoinformation that is subject to continual updating from multiple sources. Additional research investments will be needed to exploit the potential of mobile augmented reality, which uses information about the user’s immediate environment to enhance what the user is physically capable of seeing or hearing.
Collaborative Interactions with Geoinformation
Most decision making is the product of collaborative teams. The core challenge in geospatial collaboration is to support such work by means of technologies such as group-enabled geographic information systems, team-based decision support systems, and collaborative geovisualization. Research building on generic collaboration efforts is needed to understand the basis for collaborative interactions with geoinformation—particularly when access rights and expertise vary widely among team members—and the design principles for making such activities most productive. One problem is that collaborations often take place over large distances. Teleimmersion and other virtual environment technologies must be explored to determine how human-scale “spaces” can be used to deal with geographic-scale problems. It also will be necessary to develop geocollaboration systems that permit participation from field sites, where bandwidth, power, and display capabilities will be highly constrained. Systems to support group decision making will need to simulate the outcomes of alternatives. For this they will require the ability to incorporate knowledge distillation and computational models. In emergency-response situations, these capabilities must be available in real time.
The convergence of advances in location-aware computing, databases and knowledge discovery, and human interaction technologies, combined with the increasing quality and quantity of geospatial information, can transform our world. Diverse technological advances will be needed to attain that goal, and we must marshal the talent and resources needed to achieve those advances. Only by maintaining the long-term view that geospatial information should be made accessible to everyone, everywhere, in appropriate and useful ways, can we exploit the full potential of geospatial information for enriching science and safeguarding society. Computer science has a key role in realizing that vision.