4
Services and Functions

Library Services

Digital library developments are redefining the nature of the library, its services, and its limitations. The traditional library focuses on making it easy for the user to identify, find, browse, and retrieve the contents of a book or journal, but its responsibilities end when the item is in the user's hands. Although the contents of books and journals are essentially immutable, in a digital library the information provided is digital and readily manipulated. Many libraries today have holdings of geoinformation, or provide the means to obtain such data from other sites. Some provide geographic information systems and other tools for users who wish to manipulate or analyze data. Users who access data remotely over the Internet now often have a choice between downloading the data to be analyzed by their own software or sending queries and instructions for execution directly on the data's host. When applied to geospatial data, this remote processing is termed the GIServices model to distinguish it from the more traditional local processing of the GISystems model. For example, sites such as MapQuest (www.mapquest.com) use the GIServices model in providing driving instructions based on geospatial data because the analysis is performed by the host and no data are transmitted to the user. On the other hand, sites such as Microsoft's www.terraserver.com and various U.S. Geological Survey sites aim to provide data for local processing, following the GISystems model.
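The contrast between the two models can be sketched in a few lines of Python. This is purely illustrative: the "host" is simulated as an in-memory table, and the road segments and query name are invented for the sketch.

```python
# Illustrative contrast between the GISystems and GIServices models.
# The "host" is simulated as an in-memory table; the road segments
# and the query name are invented for this sketch.

HOST_DATA = {
    "roads": [
        {"id": "A", "km": 12},
        {"id": "B", "km": 8},
        {"id": "C", "km": 20},
    ]
}

def gisystems_model():
    """GISystems model: download the data, then analyze locally."""
    local_copy = list(HOST_DATA["roads"])        # whole data set crosses the network
    return sum(seg["km"] for seg in local_copy)  # analysis runs on the user's machine

def giservices_model(query):
    """GIServices model: send the query, receive only the answer."""
    if query == "total_km":                      # analysis runs on the host
        return sum(seg["km"] for seg in HOST_DATA["roads"])
    raise ValueError("unsupported query")

# The two models give the same answer; they differ only in
# what is transmitted to the user.
assert gisystems_model() == giservices_model("total_km") == 40
```

The design trade-off is the one described above: the GISystems model moves the data to the computation, while the GIServices model moves the computation to the data.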



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.





The WWW has made everyone a potential publisher and distributor of information, blurring old distinctions between authors, publishers, distributors, and librarians. The important library function of collection building, which involves the library staff in making careful decisions about what should or should not appear in the library, has no equivalent on the WWW, where there are no gatekeepers or custodians of quality.

If library information assets can be accessed from anywhere, how will each library determine what to collect or acquire, if anything? In a digital world, and barring direct control and restriction on access, a library will be able to leave more general resources to others and to emphasize those information assets that it alone is best qualified to provide. There would be little value, for example, in serving recent issues of a journal if the journal's publisher and other libraries already provide the needed service at no charge. Unique assets might include the products of the parent institution's own research and scholarship, unique information resources donated to the library by bequests, or information on the library's own local region. In short, the library of the future will be able to make a clear distinction between the services it provides in helping its users find, access, and use information and the information assets that it collects, builds, and maintains itself.

Metadata, or data about data, are likely to become much more important as libraries seek to refine the services they provide by including more and more tools designed to assist in search, evaluation, and use. Just as today's library needs a catalog that tells users where to look in its stacks for given information resources, so tomorrow's digital library will need the tools (cataloging, indexing, abstracting) that help users navigate the vast communications networks and distributed information resources of the future.
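A metadata record and the search tools built on it can be illustrated with a toy catalog. The field names below are loosely Dublin Core-like and are invented for illustration; the actual FGDC content standard, discussed later in this chapter, runs to several hundred fields.

```python
# Toy metadata catalog and search. Field names are illustrative only
# (loosely Dublin Core-like); real standards define far richer records.

catalog = [
    {"title": "Soils of Dane County", "subject": "soils",
     "publisher": "County GIS office", "date": "1997"},
    {"title": "Hydrography of the Upper Mississippi", "subject": "hydrology",
     "publisher": "USGS", "date": "1995"},
]

def search(records, **criteria):
    """Return records whose metadata match every given field exactly."""
    return [r for r in records
            if all(r.get(field) == value for field, value in criteria.items())]

hits = search(catalog, subject="hydrology")
assert len(hits) == 1 and hits[0]["publisher"] == "USGS"
```

The point of the sketch is the one made above: search, evaluation, and use all operate on the metadata, not on the (possibly very large) data sets themselves.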
This chapter addresses the services and functions of distributed geolibraries against this background of traditional and novel library services. As noted in Chapter 2, the functions and services of a library are often less obvious than, and confused with, its physical structure. Some, like information abstraction and collection building, are less obvious than others, like the physical stacks or circulation desk. Some of the services discussed here have long historical antecedents, while others are entirely novel.

Distributed Geolibrary Services

A service can be defined generally as a provision of whatever is necessary for the installation and maintenance of a machine, organization, or operation. Services for a machine such as a car include those found at a gas station or a mechanic's shop. A small consulting organization might provide sales services to its clients, payroll and training services for its employees, and marketing or research services to maintain steady growth.

The services of a distributed geolibrary fall into several categories, including services for search and retrieval of items of particular interest, item description and display services, data-processing services, and services for collection maintenance and growth. These classes of service relate to the four types of activity that go on in any library: (1) searching for specific books or other reference information by author, title, subject, or identifying code; (2) creating the library catalog; (3) using various library tools to manipulate or interpret information; and (4) maintaining or improving the library collection.

The nature of these services differs dramatically in a distributed geolibrary, however. The ability to manipulate data, and to integrate data from a number of sources, is greatly enhanced because all data are in digital form. While location was handled as one of a number of possible forms of subject in the traditional library, it is the primary basis of search in a distributed geolibrary. The distributed nature of the geolibrary also makes collection building far more challenging because there are no gatekeepers and no one is in charge of the entire collection.

Moreover, a distributed geolibrary would offer something that is not possible in the traditional library, with its traditional form of catalog: the ability to search based on geographic location.
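In its simplest form, search by geographic location reduces to intersecting the user's footprint with the footprint of each item in the collection. The sketch below assumes rectangular footprints stored as (west, south, east, north) tuples; the holdings and coordinates are invented, and real footprints may be irregular or fuzzy.

```python
# Minimal sketch of search by geographic location: a user-specified
# rectangular footprint is intersected with each item's footprint.
# Footprints are (west, south, east, north) in degrees; the holdings
# and coordinates are invented for illustration.

def intersects(a, b):
    """True if two (west, south, east, north) rectangles overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

holdings = {
    "DEM tile, Santa Barbara": (-120.5, 34.0, -119.5, 35.0),
    "Landsat scene, Chicago":  (-88.5, 41.5, -87.0, 42.5),
}

def search_by_location(footprint):
    return [name for name, fp in holdings.items() if intersects(footprint, fp)]

# A rubber-rectangle selection over coastal California:
assert search_by_location((-121.0, 33.5, -119.0, 34.5)) == ["DEM tile, Santa Barbara"]
```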

The power of this concept has already mobilized many individuals, groups, and agencies. For example, the Open GIS Consortium (www.opengis.org) issued a Request for Proposals in March 1998 on the subject of catalogs for geospatial data, anticipating that by doing so it would help move the community toward the development of interoperable catalog specifications. The consortium includes roughly 150 vendors, integrators, educators, and users, from both public and private sectors.

The Need for Distributed Geolibrary Services

There are three reasons for developing distributed geolibrary services. The first is economic. Traditionally, geospatial data have been distributed in the form of paper maps, disks, and tapes, which are costly to produce, slow and cumbersome to distribute, and difficult to update. To meet the national mandate to make data collected at public expense available to the public, federal agencies are looking for new ways to disseminate data more widely and effectively, primarily via the Internet (Jones, 1997). By using the Internet and network communications, a distributed geolibrary could deliver online information services quickly and economically. Agencies and companies can also sell data and recover income more effectively using the Internet's growing and increasingly reliable tools for electronic commerce. Finally, encryption technologies could provide assurance against unauthorized use and distribution.

The workshop was not an appropriate forum for the development of a comprehensive economic model of geoinformation dissemination or for detailed analysis of the costs and benefits of implementation. These are important issues that could be the focus of a useful and productive research effort. A good starting point would be a recent study by the National Academy of Public Administration (1998), which includes a comprehensive summary of what is known about the economics of geospatial data production and dissemination.

The second reason involves the decentralization of geoinformation management. In a distributed geolibrary there is no need for data to be collected in one place; instead, data can be held by a custodian until needed. Because the Internet provides universal access, a single custodian serving a given data set is sufficient, and with a single server there are no problems of maintaining consistency across copies when data must be updated. Ideally, the custodian would also be the person or agency responsible for updating the data and for assuring their accuracy. In practice, however, some mirroring of data may be needed to overcome the effects of network delays and server downtime (Worboys, 1995).

A third reason for a distributed framework for geolibrary services is the demand for access. Public access to geoinformation, particularly by students, can support improvements in national levels of geographic literacy by making it possible for classes to obtain information quickly and easily about any part of the Earth's surface. Ready access to geoinformation about local areas (neighborhood, city, county, region) can help to develop a more informed citizenry and improve opportunities for participation in the democratic process (Adler, 1995; Craig, 1995).

Services as Collections of Functions

Services have been described above in broad categories of response to demands. Functions are the actual commands or activities that implement services; a given function may contribute to more than one service, and a function can deliver all or part of a service. Functions that make up car services at the gas station include changing fluids, changing filters, and inspecting brakes or tires. At the mechanic's shop, the service known as a tune-up would consist of functions such as changing spark plugs and adjusting engine timing or belt alignment.

Various efforts over the past few years have implemented limited functions of a distributed geolibrary. They include two of the projects of the National Science Foundation-National Aeronautics and Space Administration (NASA)-Defense Advanced Research Projects Agency Digital Library Initiative (at the University of California's Berkeley and Santa Barbara campuses), efforts of the Federal Geographic Data Committee (FGDC) under the rubric of the NSDI, various state and local government projects, dissemination mechanisms developed by suppliers of Earth imagery, and numerous efforts in other countries. Selected examples of these prototypes are described in Appendix D. Although there are sharp differences in approach and scope, there is now a degree of consensus on the functions that can best deliver the services of a distributed geolibrary.

Necessary Distributed Geolibrary Functions

Necessary functions for search and retrieval include searches by geographical location, searches by geographical place name, and searches by secondary requirements such as subject theme or time. Retrieval functions require a workspace to hold the items, criteria for sorting and ranking items according to their assessed relevance to the user's needs, a tagging mechanism to select and retrieve specific items, and links to other functions for display and description. The following sections describe these in more detail.

Search by Geographical Location

The basemap provides the image of the Earth on which a user can specify areas of interest. Its level of geographic detail defines the most localized spatial search that is possible. It should include all of the features likely to be relevant to a user wanting to find and define a search area, including major topographic features and place names. The importance of such features will vary between users, as will levels of detail, so it will be necessary to establish protocols that allow use of specialized basemaps for particular purposes. For example, a hydrologist might want the basemap to emphasize hydrological features such as rivers and watersheds,

whereas a climatologist might want to see weather stations and topography. This function would first display a basemap, allowing users to point at a place to target either a specific point or a footprint. Users would be allowed to zoom to greater detail and to pan across the Earth's surface. Widgets such as the "rubber rectangle" would allow users to specify footprints in a number of ways. There should also be support for "fuzzy" footprints that are not precisely or crisply defined, allowing users to define approximate areas of search.

There are many current examples of sites that support search by geographic location based on standard WWW browser software (e.g., Microsoft's Internet Explorer or Netscape's Navigator). Many present the user with a map divided into tiles; by pointing to a tile the user accesses data for that tile's geographic area (see, for example, the archive of digital orthophoto quadrangles at the Massachusetts Institute of Technology, ortho.mit.edu; other examples are listed in Appendix D). The Alexandria Digital Library project's current prototype (alexandria.ucsb.edu) uses a Java application, including rubber rectangles and other tools. These prototypes use projected basemaps and do not yet implement a sense of interacting with the curved surface of the Earth, as suggested by the vision of distributed geolibraries; that would require three-dimensional visualization technologies such as VRML (Virtual Reality Modeling Language). The current Alexandria browser includes the ability to "paint" data onto the basemap; in Vice President Gore's vision of Digital Earth the user is able to "fly" through a full three-dimensional rendering of the Earth's physical environment.

Several suitable sources of data exist for basemaps:

- Digital topographic data, available for the entire land area of the planet at 1:1,000,000 in the Digital Chart of the World, and for smaller areas at larger scales. For the continental United States the USGS provides digital topographic data at 1:100,000 and, for limited areas, at 1:24,000.(1)

- Imagery from space, available from the Landsat satellite at 30-m resolution, from the French SPOT satellite at 10-m resolution, from Russian satellites at 2-m resolution, and anticipated in 1999 from commercial satellites at 1-m resolution for selected areas.

- Digital elevation data, available for parts of the United States at 30-m resolution and for the entire planet at 5-km resolution. Global coverage at 30-m resolution is planned.

The costs of these data vary enormously; those from federal sources are available at the cost of reproduction, but other sources operate on a commercial basis.

(1) This ratio, or representative fraction, compares the distance between two points on a paper map with the distance between the same pair of points on the surface of the Earth. Digital data created from paper maps by digitizing and scanning are also characterized by this ratio, which also defines the set of features shown on the map and the degree of geometric generalization of those features. In rough terms, a database created from a map with a given representative fraction depicts features larger than 0.5 mm across on the map and achieves a similar positional accuracy.

Search by Place Name

Gazetteer is a technical term for an index that links place names to locations. As often found associated with published atlases and city maps, gazetteers provide links to map sheets and locations within map sheets. In the context of distributed geolibraries, a gazetteer connects place names to geographic coordinates. This connection allows the user of the distributed geolibrary to define a search area using a place name instead of by finding the area on a basemap, which may be difficult for many users. The gazetteer may include place names that are not well defined. For use in a geolibrary a gazetteer must include extents, or digital representations of each place name's physical boundary. Links between place names allow searches to be expanded or narrowed: they can be vertical, identifying places that include or are included by other places, and also horizontal, identifying neighboring places.

Because a gazetteer is an essential building block of the distributed geolibrary and something that can be shared among large numbers of users, its availability is a critical factor in progress toward the vision of distributed geolibraries. At this time no single agency is responsible for the production and maintenance of a common national or global gazetteer. Existing gazetteers, such as the USGS Geographic Names Information System (GNIS) or equivalent commercial products, in most cases provide only a central point for each feature, and their coverage of the world's place names is uneven. Progress would be aided by identification of the gazetteer as a fundamental component of the NSDI framework, and by the development of a standard gazetteer protocol to ensure that users or groups of users who create their own specialized gazetteers could use them to access distributed geolibraries in place of general-purpose gazetteers. Additionally, there are significant problems to be overcome in dealing with varied alphabets, diacritical marks, ambiguities of spelling, place names with indeterminate boundaries, and so forth.

Search by Subject Theme or Time Period

In a physical library the card catalog indexes library holdings by subject domain. An electronic catalog may include a thesaurus, which matches synonyms of search topics, providing associations in a search query, for example, among "slough," "swamp," and "wetland." For cataloging functions to work, items must be stored in a standard format, following an agreed protocol. Likewise, users must specify searches in an agreed protocol; this is often accomplished by a query dialogue function, which converts a form-based user search request into whatever protocol is required.
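The gazetteer and thesaurus functions described above can be illustrated with a toy example. All names, extents, and synonym sets below are invented: the gazetteer maps each place name to a footprint and a vertical link to the containing place, and the thesaurus expands a subject term to its synonyms.

```python
# Toy gazetteer and thesaurus (all entries invented for illustration).
# The gazetteer links each place name to an extent (west, south, east,
# north) and to the place that contains it; the thesaurus groups
# synonymous subject terms.

gazetteer = {
    "Dane County": {"extent": (-89.8, 42.8, -89.0, 43.3), "within": "Wisconsin"},
    "Wisconsin":   {"extent": (-92.9, 42.5, -86.8, 47.1), "within": "United States"},
}

thesaurus = [{"slough", "swamp", "wetland"}]

def extent_of(place):
    """Resolve a place name to a searchable footprint."""
    return gazetteer[place]["extent"]

def broaden(place):
    """Vertical link: return the place that contains this one."""
    return gazetteer[place].get("within")

def expand(term):
    """Return the term plus all of its synonyms."""
    for group in thesaurus:
        if term in group:
            return group
    return {term}

assert broaden("Dane County") == "Wisconsin"
assert expand("swamp") == {"slough", "swamp", "wetland"}
```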
The basis of such protocols already exists in standards, e.g., MARC (MAchine Readable Cataloging, see lcweb.loc.gov/marc/) and the FGDC's Content Standard for Digital Geospatial Metadata (www.fgdc.gov), and in projects such as the Alexandria Digital Library. The FGDC has also made progress in standardizing conventions for naming geographic features, and similar progress has been made in other countries.

Distributed geolibraries should allow their users to narrow specifications of need by including subjects, dates, other identifying characteristics, and the needed level of geographic detail, imposing these on the search in addition to geographic location. Although location is the primary key in searching a distributed geolibrary, these other aspects allow the user to limit the number of items of geoinformation identified in a search to reasonable levels. Distributed geolibraries should also be capable of ranking items identified in a search by their suitability to the user's needs, informing the user of the number of hits, and summarizing results in readily understood ways.

Item Display and Description

These functions include visualization tools and metadata browsing tools. Visualization tools are useful for displaying items retrieved from the archive. Geoinformation data sets are often massive, creating problems for users who may need to browse through many data sets to find one that is suitable for use, given the limited bandwidth of many Internet connections. In such cases it is clearly impossible to examine the full contents of each data set, and some system must be devised to allow users to examine a summary or generalized sketch of the contents that can be retrieved quickly. Display functions also make it possible to create a visual index (the basemap and map browser described above) for patrons searching the library for information about a particular place.

In general terms, metadata describe the content, quality, condition, and other characteristics of data.
The major uses of metadata include (1) managing and maintaining an organization's investment in data, (2) providing information to data catalogs and clearinghouses, (3) providing information to aid data transfer and use, and (4) providing information on the data's history or lineage.

Although the second use is essentially the function performed by the traditional library catalog, the functions of metadata in the distributed geolibrary clearly extend well beyond this (FGDC, see www.fgdc.gov). Under (3), metadata provide the essential information necessary to allow a data set from some distant archive to be recognized and opened at the user's site. In general, geoinformation data sets are not interoperable in this way, especially if the archive and the user have adopted different geographic information systems (GIS). Problems of interoperation between GIS are addressed by Goodchild et al. (1998), and much recent work by the GIS industry, through the efforts of the Open GIS Consortium (www.opengis.org), has gone into improving interoperability. Note, however, that the problems of distributed geolibraries in this area go well beyond those of GIS interoperability, because distributed geolibraries are not limited to geospatial data.

In the context of distributed geographic information services, metadata include information that supports the exchange of processing operations between client and server (Open GIS Consortium, see www.opengis.org). To date, little research has reported on formalization of such metadata to describe distributed geographic information services, though Tsou and Buttenfield (1998) showed that they should include two major parts: system metadata and data-operation requirements. The system metadata describe methods and behaviors for system controls and program specifications, whereas data-operation requirements specify the requirements for data input to, and output from, specified operations.

Collection Creation and Maintenance

A range of tools is needed to support the creation and publication of geoinformation. Most new geospatial data are either published in digital form or go through a digital stage during production.
But the predigital legacy of geospatial data is largely in the form of paper maps and photographic images, which must be laboriously digitized or scanned to be suitable for distributed geolibraries. Although massive investments have been made in recent years by such organizations as the Library of Congress, which has made much of its historical map collection available over the WWW, it is doubtful that the vast majority of the larger legacy residing in scattered collections and archives will ever be digitized, because the anticipated levels of use of most individual items cannot justify the cost.

The nation currently possesses vast stores of data about, or associated with, geographic locations but for which no locational footprint is readily available. These stores include large archives of information on health, the economy, social conditions, and demographics, broken down in some cases to very fine levels of geographic detail. Such data could be incorporated into distributed geolibraries, and place could provide a very effective search mechanism, particularly when such data need to be integrated with other geoinformation. A coordinated plan is needed to link as much of this information as possible to geographic location. For example, use of census data could be considerably enhanced if the names and extents of its reporting zones (census tracts, counties, metropolitan areas) could be organized in gazetteer form for use in distributed geolibraries.

Effective description of geoinformation can be difficult, and the FGDC's Content Standard for Digital Geospatial Metadata extends to several hundred fields. While federal agencies are mandated to create such metadata and have access to extensive resources, there is often little incentive for a local agency to create metadata for its own holdings. Many agencies have suggested simplifications of the FGDC standard; the Alexandria Digital Library (alexandria.ucsb.edu), for example, uses a subset of 35 fields to describe its holdings. Dublin Core is another effort to simplify the description of information using standard fields (purl.org/dc).

The WWW makes it possible for virtually anyone to contribute information by creating and maintaining a WWW site.
Distributed geolibraries could take great advantage of this potential by making it possible for users to double as providers of information, especially information that is the result of abstraction, manipulation, interpretation, or synthesis of other information. For example, papers written based on distributed library resources could be contributed back to distributed geolibraries. The distinction being made here between raw data and derived knowledge is discussed at greater length in Chapter 3.

Searching over Distributed Assets

In a traditional library the catalog provides an index to the library's contents. In a distributed geolibrary the contents and the users are distributed, and five options can be identified for the catalog:

1. A unified catalog exists in one place and can be searched by users. In this option each custodian of data submits metadata describing each available data set to a central site, where the records are assembled into a searchable database. Each record directs users to the appropriate location of the data set. For geoinformation, which tends to use specialized formats, this option requires the strongest central control and the highest level of cooperation from participating custodians.

2. Each custodian of data assembles metadata describing each data set according to a standard, forming a distributed catalog. Users submit requests to a central site, and these are then automatically executed by search agents that examine each custodian's metadata. Performance of this solution degrades as the number of custodians increases.

3. A collection-level catalog exists that identifies the general characteristics of each custodian's holdings and uses them to direct searches. For example, searches for data on some part of New York State might be directed to a custodian in Albany known to have a large collection of that state's data. The efficiency of this option depends on how precisely custodians' holdings can be differentiated. In effect it implements the kinds of expert knowledge that allow users to find data on the WWW in the absence of effective cataloging.

4. A catalog is built by a search service. Search services such as AltaVista and Yahoo build catalogs automatically by using intelligent agents or web crawlers, but they do so strictly on the basis of words found in text and so are not effective ways of building a catalog for a distributed geolibrary. Nonetheless, it may be possible to build a new generation of specialized agents capable of recognizing geoinformation and extracting its important metadata descriptors. Such agents have been built on a prototype basis in the case of imagery; they successfully recognize the formats of imagery, open the files, and compute such indices as shape, texture, and color for use in catalogs.

5. No catalog exists. This reflects the situation on the WWW before WWW search services became available (and even today substantial parts of the WWW's resources remain unindexed by search services). Search for geoinformation without a catalog relies on the user's personal knowledge of the WWW's resources. Whereas a user of a research library can assume with some confidence that any research library will contain a copy of a major monograph or a popular journal, the principle of the WWW is almost exactly the opposite: a given item of information is most likely available at only one site. Search under these circumstances can be like looking for the proverbial needle in a haystack, with on the order of 10 million sites to search.

In the case of geoinformation, the likelihood that a given item will be on a server increases with proximity to the item's footprint, for several reasons: interest in the item is likely to be higher near or within the footprint; custodians in proximity to the footprint are more likely to have responsibility for the item; and sponsorship of the data's collection and acquisition is more likely to come from near the footprint. But the effect is likely to be weak and as such provides an unreliable strategy for search.
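Option 3, the collection-level catalog, can be illustrated with a small sketch. The custodians and their coverage extents below are invented: the catalog records only the rough extent of each custodian's holdings and routes a query to those custodians whose extents overlap the search footprint.

```python
# Sketch of a collection-level catalog (option 3); custodians invented.
# Each custodian is described only by the rough extent of its holdings,
# as (west, south, east, north); a query is routed to every custodian
# whose extent overlaps the search footprint.

custodians = {
    "Albany server":     (-79.8, 40.5, -71.8, 45.0),    # New York State
    "Sacramento server": (-124.4, 32.5, -114.1, 42.0),  # California
}

def overlaps(a, b):
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def route(footprint):
    """Direct a search only to custodians likely to hold relevant data."""
    return [name for name, extent in custodians.items()
            if overlaps(footprint, extent)]

# A query near Albany is sent to one custodian, not broadcast to all:
assert route((-74.5, 42.5, -73.5, 43.0)) == ["Albany server"]
```

The efficiency gain over option 2 is exactly the point made above: queries are not broadcast to every custodian, so performance does not degrade as quickly as the number of custodians grows.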
Integration, Analysis, and Manipulation

Unlike books, which exist largely to be read, much geoinformation is raw in nature and is obtained for purposes that include detailed interpretation, analysis, and manipulation. A user requesting a remotely sensed image, for example, might submit it

to extensive operations that include correction for various known distortions, classification, and integration with other data obtained through similar processes. The end result of obtaining a Landsat image from an Internet site might be a statistical assessment of the amount of change that has occurred in an area over the past 10 years, following several months of detailed analysis and manipulation.

Digital libraries differ from their traditional predecessors in the potential to support extensive manipulation of information once it has been retrieved. This manipulation might include:

- statistical correction for known distortions;
- tabulation to obtain statistical summaries;
- rubber sheeting to register geospatial data sets to known locations or to each other;
- format conversions, projection changes, and datum changes;
- use as input to complex environmental models for purposes of calibration or prediction;
- use in complex decision-making processes involving many stakeholders; or
- generalization, classification, interpretation, and other forms of information abstraction.

Finding 8

A distributed geolibrary would allow users to specify a requirement, search across the resources of the Internet for suitable geoinformation, assess the fitness of that information for use, retrieve and integrate it with other information, and perform various forms of manipulation and analysis. A distributed geolibrary would thus integrate the functions of browsing the WWW with those of GIS and related technologies.

Over the past three decades there has been enormous progress in the development and adoption of technologies for manipulating geoinformation, including GIS and image-processing systems. Today, most users of such systems rely heavily on the

OCR for page 53
ability to obtain input data from Internet resources, despite the lack of effective tools such as those envisioned for distributed geolibraries. Five steps characterize this gathering process: 1.   Specification of requirements, including coverage area, date, theme, level of detail, and other important characteristics. 2.   Search over known or likely sources, using a combination of personal knowledge and the limited capabilities of Internet search services. 3.   Assessment of the fitness for use of possible data sets, by comparing their documented characteristics with the specified requirements. 4.   Retrieval of suitable data sets. 5.   Opening of retrieved data sets on the user's system, including necessary changes of format and other steps needed to integrate data effectively. Many uses of geoinformation involve group activity—multidisciplinary research projects involving several investigators, planning projects involving several stakeholders and decision makers, group classroom projects involving several students. Distributed geolibraries should provide services to support such collaborative work (see Finding 7, Chapter 3). Many of the activities that could benefit from distributed geolibraries are best carried out away from the office desktop in the field. Emergency relief operations call for decisions that are best made in the presence of the emergency, where the emergency and its context can be observed directly. Access to distributed geolibraries could usefully augment the power of other field-based technologies, including the Global Positioning System, and mobile computing. Wireless connections could be used to search for and download information from distant servers and to upload new information gathered in the field.
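The five-step gathering process can be sketched in code. The sketch below is a minimal illustration under stated assumptions: the `Requirement` fields, the in-memory `CATALOG`, and the stubbed retrieval step are all invented for this example and do not correspond to any real geolibrary protocol.

```python
from dataclasses import dataclass

@dataclass
class Requirement:          # Step 1: specification of requirements
    area: str
    theme: str
    max_year: int

# Step 2: a stand-in for searching over known or likely sources.
CATALOG = [
    {"area": "Atlanta", "theme": "land cover", "year": 1989, "url": "a"},
    {"area": "Atlanta", "theme": "land cover", "year": 1997, "url": "b"},
    {"area": "Boston",  "theme": "soils",      "year": 1995, "url": "c"},
]

def fit_for_use(entry: dict, req: Requirement) -> bool:
    # Step 3: compare documented characteristics with the requirement.
    return (entry["area"] == req.area
            and entry["theme"] == req.theme
            and entry["year"] <= req.max_year)

def gather(req: Requirement) -> list:
    # Steps 4-5 (retrieval and opening) are stubbed as simply
    # returning the matching catalog entries.
    return [e for e in CATALOG if fit_for_use(e, req)]

hits = gather(Requirement(area="Atlanta", theme="land cover", max_year=1997))
```

In a real distributed geolibrary, step 2 would fan out across Internet servers and step 3 would compare the requirement against standardized metadata rather than hand-built dictionary keys.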

Finding 9

Many important applications of distributed geolibraries are best located in the field, using portable systems and wireless communications. Delivery of services to the field is important in emergency management, agriculture, natural resource management, and many other applications.

Assisting Users

Although the demand can never be fully satisfied, libraries provide large amounts of assistance to their users, funded through library budgets. The Internet provides far less: users of the WWW are very much on their own, forced to rely on online help, manuals, and other limited devices. If distributed geolibraries are to function as a more powerful evolution of the library model, effective ways must be found to help users navigate through their complexities and ambiguities. The problem is, if anything, more severe for geoinformation, which has always required a disproportionately high level of human assistance and user expertise. We have little experience with the problems that are likely to occur when inexperienced users begin to make widespread use of geoinformation.

Problems posed by level of geographic detail, an important metadata variable, are discussed in this context by Goodchild and Proctor (1997), who conclude that new metaphors are needed to make it possible for general users to conceptualize their needs. For example, the metaphor of height of viewpoint above the surface of the Earth (move higher for less detail, descend for more detail) can be readily understood and used even by children.

Assessment and Feedback

Libraries also employ staff who listen to their users, another function that is difficult to replicate in the impersonal digital environment of the Internet. On the other hand, many new and effective mechanisms for eliciting feedback have been developed on the WWW, and distributed geolibraries would do well to exploit these. For example, each custodian site might invite comments on its geoinformation from users and make these remarks available to others.

Extensive assessment of user-interface designs will be needed to evaluate whether they achieve the objectives of distributed geolibraries before those designs are widely released and adopted. Such designs should evolve through procedures familiar in the field of human-computer interaction, including evaluation studies and iterative refinement.

Options for the Delivery of Distributed Geolibrary Services

Ideally, a distributed geolibrary would function as a single homogeneous entity capable of responding to a single query from a user, just as AltaVista is capable of responding to a query about some combination of key words. In practice, however, a number of configurations are possible, combining aspects of the following extremes:

1.   One-stop shopping. One server provides a one-stop shopping service, perhaps to a limited user base via an intranet or to a universal base via the Internet. Either the entire catalog is mounted on the server or a query to the server results in transparent access to a distributed catalog. Similarly, geoinformation resources are served either directly or transparently, through automated access to distributed resources. The agency operating the central server also maintains it, develops and enforces standards and protocols, and directs future development. Several servers currently approximate this mode of operation over substantial thematic and geographic domains, including the USGS's EROS Data Center and NASA's EOSDIS. This option works well in areas where the resources to create geoinformation come from a single source that can also fund dissemination. Problems arise when jurisdictions or thematic areas overlap significantly. For example, are data about the city of Atlanta more likely to be found on a server operated by the city, county, state, or federal government, or by the United Nations? Are data about soils most likely to be found on a server operated by the U.S. Department of Agriculture or the USGS?

2.   Distributed responsibility. This option follows the example of the WWW, for which policies are established by volunteer grassroots organizations that recognize needs, devise solutions, and make them freely available to the user community. Protocols and standards allow any individual or group to participate in distributed geolibraries, subject to very loosely defined constraints. Whereas this model approximates the WWW, it differs sharply from the mode of operation of the traditional library, with its substantial resources, gatekeepers, and quality control. The function of cataloging on the WWW, for example, which is approximated by the search services, exists because certain companies saw business opportunities in providing a service that was compatible with WWW standards and met an obvious need. Similarly, quality control in distributed geolibraries might be achieved not by a central gatekeeper authority but by independent groups, analogous to the Good Housekeeping Institute, that assess and certify geoinformation on a for-profit or nonprofit basis.

Finding 10

There are several alternative architectures for distributed geolibraries, including a single enterprise sponsored by a well-resourced agency, analogous to a national library; a network of enterprises with their own sponsors, analogous to a network or federation of libraries; and a loose network held together by shared protocols, analogous to the WWW.

Geolibrary services can be freely combined and used according to application needs. For geolibraries to operate in a distributed (client-server) computing environment, services and functions must operate across a network of servers and clients.
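The federated architecture of Finding 10 can be sketched as a single client query fanned out to several catalog servers, with the merged results making the federation look like one geolibrary. The server names, their contents, and the string-matching query interface below are all invented for illustration; a real federation would use shared metadata protocols rather than dictionary lookups.

```python
# Three hypothetical catalog servers with overlapping jurisdictions,
# modeled as simple title -> URL mappings.
CITY_SERVER  = {"atlanta parcels": "city.example/parcels"}
STATE_SERVER = {"georgia soils": "state.example/soils"}
FED_SERVER   = {"atlanta parcels": "usgs.example/parcels",
                "georgia soils": "usda.example/soils"}

FEDERATION = [CITY_SERVER, STATE_SERVER, FED_SERVER]

def federated_search(term: str) -> list:
    """Query every server in the federation and merge the hits."""
    hits = []
    for server in FEDERATION:
        for title, url in server.items():
            if term in title and url not in hits:
                hits.append(url)
    return hits

# Overlapping jurisdictions simply yield multiple candidate sources,
# which the user (or the client software) must then assess.
sources = federated_search("soils")
```

Note that the jurisdictional-overlap problem raised above does not disappear in this model; it surfaces as multiple hits for the same query, pushing the fitness-for-use decision back to the user.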
The availability of services must take into account server characteristics, such as file sharing and application serving, and whether the client is "thin" or "thick." In networking terminology, a thick client is one on which operations and calculations are executed locally, consistent in this context with the GISystems model. A thin client may instead require that selected functions run on the server, consistent with the GIServices model. Whether the client should be thick or thin will depend on the task and its associated performance requirements. For example, it may be appropriate to use thick clients for map display services, allowing the patron to take over the many intuitive decisions of graphic design and layout, as well as to accommodate whatever output devices are available. Still other functions may deliver services best by avoiding transmission of large amounts of repetitive data across a network.

For map browsing and place-name searches, a geolibrary might use a "hybrid" approach, storing the basemap and gazetteer on the client but leaving the catalog functions on the server. Basemap information is voluminous and not likely to change frequently, so rather than transmit it repeatedly from a server it may be more efficient to store it locally in a specialized hybrid browser. Suitable basemaps include digital topographic maps as well as images of the Earth's surface. Additional detail can be provided by digital elevation data, so that the basemap closely resembles the actual surface of the Earth.

"The role of client and server components should be dynamic and changeable. The balance of functionality between client services and server components will be a critical issue for the success of . . . distributed systems" (Tsou and Buttenfield, 1998).
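The hybrid approach described above can be sketched as a client that caches the voluminous, slowly changing basemap locally (the thick-client side) while always sending catalog queries to the server (the thin-client side). The classes, the fake server, and its contents are illustrative assumptions, not an API of any real geolibrary.

```python
class FakeServer:
    """A stand-in geolibrary server holding a basemap and a catalog."""
    def __init__(self):
        self.basemap = b"<large basemap tiles>"
        self.catalog = {"landsat-1997": "server.example/landsat-1997"}
        self.basemap_requests = 0   # counts network fetches

    def fetch_basemap(self) -> bytes:
        self.basemap_requests += 1
        return self.basemap

    def query_catalog(self, term: str) -> list:
        return [url for key, url in self.catalog.items() if term in key]

class HybridClient:
    """Basemap cached locally; catalog functions left on the server."""
    def __init__(self, server: FakeServer):
        self.server = server
        self._basemap_cache = None

    def basemap(self) -> bytes:
        # Voluminous, rarely changing data is fetched once, then reused.
        if self._basemap_cache is None:
            self._basemap_cache = self.server.fetch_basemap()
        return self._basemap_cache

    def search(self, term: str) -> list:
        # Catalog queries always go across the network.
        return self.server.query_catalog(term)

server = FakeServer()
client = HybridClient(server)
client.basemap()
client.basemap()                 # second call is served from the cache
hits = client.search("landsat")
```

Shifting the `basemap` method onto the server, or the `search` method onto the client, would move this design toward the pure thin- or thick-client extremes, which is exactly the dynamic balance Tsou and Buttenfield describe.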