gency rooms and veterinary offices or purchases of nonprescription drugs in grocery stores, and integrating it with background information about the affected patient’s residence and job address. Prototype systems are already under development, including one that monitors real-time admissions to 17 emergency departments near Pittsburgh, to generate profiles of ER visits, and discern patterns of activity. If anomalous patterns emerge that may signify an outbreak of some new pathogen, system administrators can quickly alert health officials.

Many other opportunities exist for such computer-aided “evidence-based decision making.” For example, the monitoring of activity on computer networks might flag potential attempts to break through a firewall; or sensor networks attached to public buildings might flag patterns of activity within the building that suggest suspicious behavior. In these kinds of cases, because the data is voluminous and derives from a variety of sources, an unaided decision maker might have difficulty detecting subtle patterns.

As a general proposition, the development of tools that provide human analysts with assistance in doing their jobs has a higher payoff (at least in the short to medium term) than tools that perform most or all of the analyst’s job. This places a greater emphasis on approaches that use technology to quickly sift large volumes of data to flag potentially interesting data items for human attention (as opposed to approaches that rely on computers to make high-level inferences themselves in the absence of human involvement and judgment).

A final dimension of information fusion is nontechnical. That is, disparate institutional missions may well dictate against a sharing of information at all. Underlying successful information fusion efforts is a desire to share information—and it is impossible to fuse information belonging to two agencies if those two agencies do not communicate with each other. Establishing the desire to communicate among all levels at which relevant information could be shared may have a larger impact than the fusion that might occur due to advances in technology.

Data Mining

“Data mining” is the automatic machine-learning of general patterns from a large volume of specific cases. For example, given a set of known fraudulent and nonfraudulent credit-card transactions, the computer system may learn general patterns that can be used to flag future cases of possible fraud. Data mining has grown quickly in importance in the commercial world over the past decade, as a result of the increasing volume of machine-readable data, advances in statistical machine-learning algorithms for automatically analyzing these data, and improved networking that makes it feasible to integrate data from disparate sources. Decision-tree learning, neural-network learning, Bayesian-network learning, and logistic-regression-and-support vector machines are among the most widely used



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
Terms of Use and Privacy Statement