As We May Work: An Approach Toward Collaboration on the
An approach is introduced for information sharing and retrieval on the national information infrastructure (NII) that focuses on the communications paths between collaborative organizations in a telecommunications network rather than upon the content of the information that is stored in distributed databases. Direct connections between domains of information are classified in the form of trails through the network where they can be retrieved and shared in natural language. An application of the approach in shared health care systems is discussed.
The ability to connect to global networks and
communicate with millions of people has made every user a publisher
… but just as important, it has made every user an editor,
deciding what's important and real. In this medium, you get the
filter after the broadcast.
Paul Saffo, Institute for the Future
Our technologies [will] become more a medium for
designing how we should work together, rather than merely designing
how we share our information.
Statement of Problem
Contemporary organization theories have suggested that interorganizational networks developed to coordinate activities across organizational boundaries will become the "new institution" of the future. Not only will these networks be important mechanisms for providing superior economic performance and quality but they will also survive, largely because of their "adaptive efficiencies"; that is, because they have the ability to adjust more rapidly to changing technology and market conditions, to produce more creative solutions, and to develop new products and services in a shorter period of time (Alter and Hage, 1993).
Public institutions such as schools, hospitals, libraries, social service organizations and state and local governments are also beginning to work together to provide their own services more efficiently, and they view the NII as an important tool for enhancing their efforts. These institutions will serve as catalysts for further developing the NII and will ultimately create the demands for the private sector's vision of an information superhighway that offers more practical services that address a growing and demographically shifting population.
There are, however, formidable barriers to the deployment of the technologies used in collaborative networks. The recent White House Mini-Conference on Aging highlighted a major problem when it pointed out the need for "a standard protocol linking national, local, 'on-line,' off-line, public, non-profit and private databases" in delivering services to the elderly. "Differing classification schemes, confusing terminology, and lack of 'info-glut' screening mechanisms" are limiting access to information and preventing the effective delivery of integrated care ("Accessing Eldercare via the Information Highway," 1995).
The vision of linkages between users of patient information within communities in which each health care facility and practitioner would connect to a network through an information system is greatly hindered by the inability to create, store, retrieve, transmit, and manipulate patients' health data in ways that best support decision making about their care. This is the problem that is addressed in this white paper. It is hoped that the approach presented here for information classification and retrieval through the NII will lead to further investigation of its potential.
Several efforts are already under way to promote the widespread use of advanced telecommunications and information technologies in the public and nonprofit sectors, especially at the community level. (See, for example, the U.S. Department of Commerce National Telecommunications and Information Administration/TIIAP initiatives.) The private sector is also beginning to explore the use of information technology in community networks, including those designed to support and enhance collaboration among health and human services providers (Greene, 1995). Eventually, a system of "global, shared care" is expected to evolve in which the coordinated activities of different people from different institutions will apply different methods in different time frames, all in a combined effort to aid patients medically, psychologically, and socially in the most beneficial ways. Because the ability to move data is considered fundamental to the process of integrated care, attempts have been made to find cost-effective ways to share data among the participants. However, this approach has been fraught with difficulties that are largely unrelated to the ability of the technology to provide solutions. Questions of ownership, confidentiality, responsibility for health outcomes, and semantics are paramount, and clinicians are themselves calling for new solutions that do not require "knowledge" to be formalized, structured, and put into coding schemes (Malmberg, 1993).
The European Approach
Many Europeans have also recognized that one of the major problems in designing the shared care system is management of the communications process among the different institutions and health care professionals. They are taking a different approach and conducting field studies to evaluate the feasibility of using patient-owned, complete medical record cards, which patients would carry with them and present to the institution carrying out the treatment. Although they recognize the importance of natural language processing and the potential of optical-storage technology to reduce costs, they conclude that the technology will only be available within the respective information systems that contain medical records and that new solutions such as the chip card or the hybrid card must be found in order to extend communication to all health care providers (Ellsasser et al., 1995).
The Digital Library
Information sources accessed through the NII also represent components of emerging universally accessible, digital libraries. The National Science Foundation, in a joint initiative with the Advanced Research Projects Agency and the National Aeronautics and Space Administration, is supporting research and development designed to explore the full benefits of these libraries, focusing on achieving an economically feasible capability to both digitize existing and new information from heterogeneous and distributed sources of information and to find ways to store, search, process, and retrieve this information in a user-friendly way (National Science Foundation, 1994). It has been suggested, however, that "for digital libraries to succeed, we must abandon the traditional notion of 'library' altogether.… The digital library will be a collection of information services; producers of material will make it available, and consumers will find and use it" (Wilensky, 1995). New research is needed that is fundamental to the development of advanced software and algorithms for searching, filtering, and summarizing large volumes of data, imagery, and all kinds of information in an environment in which users will be linked through interconnected communications networks without the benefit of preestablished criteria for arranging content.
Vannevar Bush and the New Technologies
The concept of a dynamic, user-oriented information system was introduced as early as 1945, when Vannevar Bush suggested that an individual's personal information storage and selection system could be based on direct connections between documents instead of the usual connections between index terms and documents. These direct connections were to be stored in the form of trails through the literature. Then at any future time the individual or a friend could retrieve this trail from document to document without the necessity of describing each document with a set of descriptors or tracing it down through a classification scheme (Bush, 1945).
In 1956, R.M. Fano suggested that a similar approach might prove useful to a general library and proposed that documents be grouped on the basis of use rather than content (Fano, 1956). This suggestion was followed 10 years later by a pioneering contribution of M.M. Kessler at the MIT Technical Information Project, who developed a criterion for such grouping of technical and scientific papers through "bibliographic coupling," in which two scientific papers cite one or more of the same papers (Kessler, 1965). This concept of bibliographic coupling has been extended to other types of coupling and refined to the present day, largely through computer-based techniques that identify sets of highly interrelated documents through "co-citation clustering" (Garfield, 1983).
Although it was recognized that the model of "trails of documents" as suggested by Dr. Bush 50 years ago had useful features that the subsequent partitioning models did not offer, research has not been conducted on its potential for classification and retrieval in modern communications networks. Perhaps this would be a good time to revisit the concept, especially as traditional computer-based systems are merged with communications systems in a network of networks such as the NII. And because citation characteristics are an indication of how scientific doctrine is "built," we might want to combine the idea of trails of documents (represented as "communications paths") with sets of documents (represented as "domains of information") into a more general model that can be used for both classification and retrieval of information. Such a model has been developed for military "command and control" and is presented here for further consideration by the NII community.
Message traffic among higher-echelon commands during the early part of a crisis situation is extremely difficult to classify. This is because such communications do not generally fall into categories that deal with specific predetermined military tasks, but instead are much less precisely defined, less routine, and consist primarily of the exchanges of information along with recommendations, advice, and other messages that are necessary before any tactical systems can be put into effect. By the same token, these communications are difficult to retrieve in any formatted sense because the unexpected, evolving, and interdependent nature of the information places an even greater emphasis upon natural language communication.
In an attempt to avoid the inadequacies inherent in any classification system while at the same time recognizing that as the amount of available information grew there was a parallel need for a more precise way to retrieve specific data, a technique was developed for associating messages with each other that required no interpretation of the subject content of the messages (Greene, 1967). This technique is based upon the thesis that if a message referenced a previous message, the previous message must have influenced that message in some way. For example, a message might say, "This is in answer to your question in reference A." Often a message referenced a previous message that referenced a yet earlier message. Still other connections of messages through their references are possible.
In Figure 1, if each number represents a message and if an arrow from 2 to 1 means 2 referenced 1, then we can interpret Figure 1 as follows: message 2 references message 1, message 4 references message 2 but also references message 3, and message 5 is another message that references message 3. Thus, we can speak of a "reference-connected" set of messages S = (1, 2, 3, 4, 5); that is, a set of messages that are connected in any way through their references. (This concept is analogous to the one of "joining" in directed graph theory.)
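The grouping just described can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the original study: it treats each reference as an undirected link and collects connected messages with a union-find structure. The function name `reference_connected_sets` and the dictionary layout are assumptions introduced here; the sample data is Figure 1 as described in the text.

```python
from collections import defaultdict


def reference_connected_sets(references):
    """Group messages into reference-connected sets.

    `references` maps each message id to the ids it references
    (a hypothetical input layout). Direction is ignored: two messages
    share a set if any chain of references links them.
    """
    parent = {}

    def find(message):
        # Return the representative of the set containing `message`.
        parent.setdefault(message, message)
        while parent[message] != message:
            parent[message] = parent[parent[message]]  # path halving
            message = parent[message]
        return message

    def union(a, b):
        # Merge the sets that contain a and b.
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for message, cited in references.items():
        find(message)  # register messages that reference nothing
        for earlier in cited:
            union(message, earlier)

    groups = defaultdict(set)
    for message in list(parent):
        groups[find(message)].add(message)
    return sorted((sorted(group) for group in groups.values()),
                  key=lambda group: group[0])


# Figure 1 as described in the text: 2 -> 1, 4 -> 2, 4 -> 3, 5 -> 3
figure_1 = {1: [], 2: [1], 3: [], 4: [2, 3], 5: [3]}
sets_found = reference_connected_sets(figure_1)
print(sets_found)  # [[1, 2, 3, 4, 5]]
```

A message that neither references nor is referenced by any other message would come back as a set of its own.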
It is noted that in Figure 1, messages 4 and 5 are "bibliographically coupled." Another type of coupling occurs if two papers are cited by one or more of the same papers (e.g., 2 and 3). And finally, there is the simple citation relationship between 1 and 2, 2 and 4, 3 and 4, and 3 and 5. These three basic types of reference connectivity have been used as separate partitioning criteria for retrieval systems in the past. However, they have not been combined into a single dynamic system for both classification and retrieval, nor have they been used to link databases for interorganizational collaboration, as this white paper suggests.
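As an illustrative sketch, the three kinds of reference connectivity named above can be computed from the same reference lists. The function name `coupling_relations` and the input layout are assumptions made for this example; the data again follows the description of Figure 1.

```python
from itertools import combinations


def coupling_relations(references):
    """Return (direct citations, bibliographically coupled pairs, co-cited pairs).

    `references` maps each message id to the ids it references
    (a hypothetical input layout).
    """
    cites = {message: set(refs) for message, refs in references.items()}

    # Simple citation: (citing message, cited message).
    direct = sorted((m, r) for m, refs in cites.items() for r in refs)

    # Bibliographic coupling: two messages reference at least one common message.
    coupled = sorted((a, b) for a, b in combinations(sorted(cites), 2)
                     if cites[a] & cites[b])

    # Invert the mapping to see who references each message.
    cited_by = {}
    for m, refs in cites.items():
        for r in refs:
            cited_by.setdefault(r, set()).add(m)

    # Co-citation: two messages are referenced by at least one common message.
    co_cited = sorted((a, b) for a, b in combinations(sorted(cited_by), 2)
                      if cited_by[a] & cited_by[b])
    return direct, coupled, co_cited


# Figure 1 as described in the text: 2 -> 1, 4 -> 2, 4 -> 3, 5 -> 3
figure_1 = {1: [], 2: [1], 3: [], 4: [2, 3], 5: [3]}
direct, coupled, co_cited = coupling_relations(figure_1)
print(direct)    # [(2, 1), (4, 2), (4, 3), (5, 3)]
print(coupled)   # [(4, 5)]
print(co_cited)  # [(2, 3)]
```

The output agrees with the relationships read off Figure 1: messages 4 and 5 are coupled through message 3, and messages 2 and 3 are both referenced by message 4.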
It was found that during the early part of a crisis situation when messages throughout the command structure and in different locations were put into reference-connected sets, these sets in most cases uniquely identified particular events during the crisis. For example, one set that was constructed from crisis-related message traffic found in files at three command headquarters contained 105 distinct messages that dealt with the preparations for landing airborne troops. Other sets of messages represented the communications related to other events such as the provision of medical supplies, the preparation of evacuation lists, and sending surgical teams. All of these events were represented by unique message sets in the investigated files of crisis-related traffic.
Reference-connected sets proved to be valuable tools in analyses of command information flow as well as of the operations they describe. Deficiencies in flows and use of information were much more easily identified when focus was placed upon a specific event represented by communications throughout an entire command structure. The natural application of these sets to information retrieval was also noted because it was possible to file messages automatically into appropriate message sets by noting only the references that were given. These sets then represented events during a crisis and were available for answering queries regarding their status. Predetermined subject categories were not required, nor were any restrictions placed upon the format of the messages. The method simply provided a way of quickly locating a message that had the information (as it was expressed in natural language) that was necessary to make a decision.
A simple filing method was used in the analysis for automatically classifying messages into reference-connected sets. If a message referenced a previous message, it was put into the file of the previous message. So, for example, in Figure 2, message 2 would be filed with message 1 because it referenced message 1. Message 3 does not reference a previous message and would thus begin a new file 2. However, message 4 referenced messages in both files and therefore connected the two. Two subsets were identified in this way. One subset (assigned the number 1) contained messages 1, 2, and 4. The other subset (assigned the number 2) contained messages 3, 4, and 5. Message 4 is the link between them and, in the language of directed graph theory, may be considered to be a linking point between two maximal paths in the semipath from message 1 to message 5.
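The filing rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the original implementation: messages are processed in order of arrival, a message whose references match no filed message opens a new file, and a message is copied into every file that holds one of its references. The function name `file_messages` is hypothetical, and because Figure 2 is not reproduced here, the reference from message 5 to message 3 is assumed.

```python
def file_messages(messages):
    """File messages by their references alone.

    `messages` is a list of (message_id, [referenced ids]) in order of
    arrival (a hypothetical input layout). Returns the files and the
    messages that link more than one file.
    """
    files = {}     # file number -> message ids in filing order
    filed_in = {}  # message id -> set of file numbers holding it

    for message_id, refs in messages:
        # Collect every file that already holds a referenced message.
        targets = set()
        for ref in refs:
            targets |= filed_in.get(ref, set())

        # No known reference: the message begins a new file.
        if not targets:
            targets = {len(files) + 1}

        for number in targets:
            files.setdefault(number, []).append(message_id)
        filed_in[message_id] = targets

    links = sorted(m for m, numbers in filed_in.items() if len(numbers) > 1)
    return files, links


# Assumed content of Figure 2: 2 -> 1, 4 -> 2 and 3, 5 -> 3
figure_2 = [(1, []), (2, [1]), (3, []), (4, [2, 3]), (5, [3])]
files, links = file_messages(figure_2)
print(files)  # {1: [1, 2, 4], 2: [3, 4, 5]}
print(links)  # [4]
```

One design choice in this sketch is that a later message referencing a linking message such as 4 would be filed in both files; other policies are possible.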
The structure of a reference-connected set identifies subsets (as in the preceding section) that can be interpreted in a number of ways. For example, it is noted that a subset will occur only if there is a message within the set (such as 3 in Figure 2) that does not reference a previous message but that is eventually linked to the set. Such a message may begin a "new" event that eventually becomes related in some way to the earlier event initiated by message 1. However, the structure of a reference-connected message set is also a function of another important factor: the organizational chain-of-command and the distribution of information throughout this chain. For a message cannot reference a previous message unless its author is cognizant of the previous message. Consequently, the paths in a reference-connected set (and thus the corresponding subsets) will often reflect the information flow between specific commands although the event is essentially the same.
It is easily seen that this model can be extended and adapted to other interorganizational networks in which information is exchanged to meet a common goal, such as provision of health care. The application of the model also becomes more complex as additional nodes are included and multiple addressees are allowed. Nevertheless, two important characteristics should be noted that illustrate this model's potential in supporting the collaborative process:
Conclusion and Recommendations
In a medium in which "the filter comes after the broadcast" and in which users everywhere have direct access to the full contents of all available material, finding information will be a key problem. How can a classification system be developed for a communications-based system in which the unexpected, evolving, and interdependent nature of the information places even greater emphasis on natural language? New approaches will have to be found that avoid both the problem of describing the content of information and the problem of integrating new information into a predetermined classification code. The collaborative networks of the future will focus on information flows. They will lead to dynamic user-oriented information retrieval systems that are based on communications paths and direct connections between distributed information sources rather than upon technologies that mechanically or electronically select information from a store. New paradigms of interaction appropriate for multimedia distributed systems will be the focus of new technologies, and automated, intelligent search agents will be found that help consumers as well as providers to find and use what is important and real.
New technologies, combined with the concept of reference-connected sets, may offer another potential solution to the management of the communications process among different institutions in collaborative networks. Future research on community networks should be focused on the operational level rather than the administrative level by linking users of information from the "bottom up" and by searching through communications paths rather than through the content of the information that is stored in distributed databases. This would give communities an opportunity to assess the role of the NII without large investments in technology and would allow participating organizations to gain the economic benefits of the network only in so far as there is a need to collaborate.
An approach is presented here that does not attempt to guide users through the vast domains of information that will be available through the NII. Instead, it helps them to find quickly the other users within their community of interest who may have the information they are seeking. This approach could provide the protocol needed to link national, local, "on-line," off-line, public, nonprofit, and private databases for increased access to collaborative networks. It could also enable providers of health and human services to work together to aid patients medically, psychologically, and socially in the most beneficial ways. It is a tempting approach.
References
"Accessing Eldercare via the Information Highway: Possibilities and Pitfalls," a 1995 White House Mini-Conference on Aging, March.
Alter, C., and J. Hage. 1993. Organizations Working Together. Sage.
Bush, Vannevar. 1945. "As We May Think," Atlantic Monthly 176(1):101–108.
Ellsasser, K-H., Nkobi, J., and Kohler, C.O. 1995. "Distributing Databases: A Model for Global, Shared Care," Healthcare Informatics, January.
Fano, R.M. 1956. Documentation in Action, Chapter XIV-e, pp. 238–244, Reinhold Publishing Corporation, New York.
Garfield, E. 1983. Citation Indexing: Its Theory and Application in Science, Technology, and Humanities. ISI Press, Philadelphia.
Greene, M.J. 1967. "A Reference-Connecting Technique for Automatic Information Classification and Retrieval," Research Contribution No. 77, Operations Evaluation Group, Center for Naval Analyses, The Franklin Institute, March.
Greene, M.J. 1995. "Assessing the Effectiveness of Community Services Networks in the Delivery of Health and Human Services: An Economic Analysis Model," research conducted under HRSA Contract No. 94-544 (P), March.
Kessler, M.M. 1965. "Bibliographic Coupling Between Scientific Papers," American Documentation 14(1):10–25.
Malmberg, Carl. 1993. "The Role of Telematics in Improving the Links Between Primary Health Care Providers," Annual Symposium on Computer Applications in Medical Care.
National Science Foundation, Digital Library Initiative, FY 1994.
U.S. Department of Health and Human Services. 1993. Toward a National Health Information Infrastructure, report of the Work Group on Computerization of Patient Records, April.
Wilensky, R. 1995. "UC Berkeley's Digital Library Project," Communications of the ACM.