Building such a system is complicated by many real-world factors that the current state of the art cannot yet handle. For example, integration often must be done quickly between systems that were not designed to interoperate, such as those of partners in an ad hoc coalition. Efficient techniques for evaluating complex subscriptions may enable warfighters to identify targets more quickly, while those targets are still vulnerable to attack. Sensors may be moving and unreliable, so integration needs to be more dynamic and flexible. As sensor power sources and networking technology improve and the cost and size of sensors decrease, we can expect both the number of sensors and the data rate per sensor to increase. Together, these effects will greatly increase the overall data rate from sensors, seriously taxing the scalability of current stream-processing algorithms. Finally, inaccuracies in human sources of information must be factored in.

The Air Force also needs advanced data integration technology in support of nonkinetic operations—that is, those that do not rely on damage to physical objects. For example, public relations challenges might require quickly searching and correlating information across many data sources, only some of which may be owned by the Air Force and many of which have not been integrated at the time the search request is issued. Influence operations might rely on very fast unearthing of information about particular individuals (e.g., a leader of a threatening group) or groups (e.g., a tribe, clan, or religious community). Other nonkinetic operations might depend on quick searches of blueprints, building permits, inventories, cybercrime logs, and so on.

Commercial database and distributed systems technology can be expected to meet many Air Force needs. In many technical areas, however, the Air Force places a higher value than the commercial sector does on obtaining highly innovative solutions, or on obtaining them well before they become widely available commercially. Because the Air Force uses many custom-developed application systems, off-the-shelf solutions are often insufficient, particularly for the aspects of databases and distributed systems that enable interfacing with other systems: schema reconciliation, metadata, and so on. What is needed instead are fundamental models of information integration that can be tailored to unique Air Force requirements.

What is the current state of the art? Robust database system products exist to support the storage and management of structured information, and recent research is leading to versions of these products that can manage XML documents. Extract, transform, and load (ETL) tools are available to help clean and integrate data—for example, to store it in a data warehouse. Enterprise information integration (EII) tools are becoming available that enable querying heterogeneous databases, typically rela-


