BOX 1.2 What Are Geospatial Data?

Executive Order 12906 (which called for the development of a National Spatial Data Infrastructure) defines geospatial data as “information that identifies the geographic location and characteristics of natural or constructed features and boundaries on the earth.”1 Examples of geospatial data include a forest, a wildfire, a satellite image, addresses of homes and businesses, and the GPS coordinates of a car. Although time is considered a dimension of geospatial data (or “geodata”), the term “spatiotemporal data” is often used to emphasize data that vary over time or have a time-critical attribute. The extent of a wildfire as it burns is an example of spatiotemporal data.

Geodata differ from traditional types of data in key ways. Among the most important differences is that geodata are high dimensional (highly multivariate) and autocorrelated (i.e., nearby places tend to be similar). Autocorrelation is a feature to be exploited (e.g., it allows predictions to be made about places for which there are no data), but it also violates the assumption of independent observations on which many standard statistical methods rely, which prevents their direct application.2 Some geospatial data contain distance and topological information associated with Euclidean space, whereas others represent non-Euclidean properties, such as travel times along particular routes or the spread of epidemics.
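To make the idea of autocorrelation concrete, the following minimal sketch computes Moran's I, one widely used summary statistic of spatial autocorrelation, for values on a small regular grid. It is an illustration only and is not drawn from the source: the function name `morans_i`, the choice of rook (edge-sharing) neighbors with binary weights, and the two sample grids are all assumptions made for the example.

```python
def morans_i(grid):
    """Moran's I for a 2D grid of numbers.

    Assumption for this sketch: two cells are neighbors only if they
    share an edge (rook adjacency), and every neighbor pair has weight 1.
    """
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(v for row in grid for v in row) / n
    # Deviation of each cell from the overall mean.
    dev = [[v - mean for v in row] for row in grid]

    cross_products = 0.0  # sum of w_ij * dev_i * dev_j over neighbor pairs
    weight_sum = 0        # sum of all weights w_ij
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross_products += dev[r][c] * dev[rr][cc]
                    weight_sum += 1

    variance_sum = sum(d * d for row in dev for d in row)
    return (n / weight_sum) * (cross_products / variance_sum)


# Hypothetical data: similar values are clustered together.
clustered = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical data: every cell differs from all of its neighbors.
checkerboard = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

print(morans_i(clustered))     # positive: neighbors tend to be alike
print(morans_i(checkerboard))  # negative: neighbors tend to differ
```

A value well above zero indicates that nearby cells resemble one another, which is the property that makes spatial prediction possible and that also violates the independence assumption of many standard methods.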

Digital representations of geospatial data are moving beyond the traditional, well-structured vector (geometric shapes that describe cartographic objects) and raster (digitized photographs and scanned images) data formats. A more common conceptualization of geographic reality is based on the field and object representation models. The field model views geospatial data as a set of distributions over space (such as vegetation cover), whereas the object model represents the earth as a surface of discrete, identifiable entities (e.g., roads and buildings; Fonseca, Egenhofer, and Agouris, 2002). Some geospatial entities are discrete objects, whereas many others are continuous, irregularly shaped, and inexact (or fuzzy). For example, a storm is a continuous four-dimensional (4D) phenomenon but must be represented in digital form as a series of approximate discrete objects (e.g., extent, wind velocity, direction), which introduces uncertainty and error and reduces accuracy. An integrated conceptualization combining the field and object perspectives is increasingly important; it is needed, for example, to represent a storm as an object at one scale and to model its structure as a field at a different scale.
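The relationship between the two models can be sketched in a few lines of code. In the hedged example below, a storm is held first as a field (a grid of wind speeds) and then summarized as a discrete object (an extent plus a peak value). The names `StormObject` and `field_to_object`, the wind-speed threshold, and the sample grid are hypothetical and chosen only for illustration; operational systems use far richer representations.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StormObject:
    """Object view: a discrete entity with an extent and an attribute."""
    min_row: int
    min_col: int
    max_row: int
    max_col: int
    peak_wind: float


def field_to_object(wind_field: List[List[float]],
                    threshold: float) -> Optional[StormObject]:
    """Summarize a wind-speed field as one bounding-box object.

    Cells at or above `threshold` are treated as part of the storm.
    Returns None if no cell qualifies.
    """
    cells = [
        (r, c)
        for r, row in enumerate(wind_field)
        for c, value in enumerate(row)
        if value >= threshold
    ]
    if not cells:
        return None
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    peak = max(wind_field[r][c] for r, c in cells)
    return StormObject(min(rows), min(cols), max(rows), max(cols), peak)


# Field view: hypothetical wind speeds sampled on a 3 x 3 grid.
wind_field = [
    [5.0, 10.0, 5.0],
    [10.0, 40.0, 35.0],
    [5.0, 30.0, 10.0],
]

storm = field_to_object(wind_field, threshold=25.0)
print(storm)
```

The conversion discards detail: the bounding box includes a cell below the threshold and the internal wind structure is reduced to a single peak value, a small instance of the uncertainty and reduced accuracy described above.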

The characteristics of geospatial data pose unique challenges to geospatial applications. The requirements of a geospatial data set—such as the coordinate system, precision, and accuracy—are often specific to


