ter computing services that go beyond simply accessing the Web on more devices. Such devices could act as sensors and provide a rich set of data about their environment that, once aggregated, could be useful for real-time disaster response, traffic-congestion relief, and as-yet-unimagined applications. An early example of such a system is a recent experiment conducted by the University of California, Berkeley, and Nokia in which cell phones equipped with GPS units provided data for a highway-conditions service.31

More generally, the unabated growth in digital data, although still a challenge to manage and sift, has in many cases reached a volume large enough to have radical computing implications.32 Such huge amounts of data will be especially useful for a class of problems that have so far defied analytic formulation and instead depend on a statistical, data-driven approach. In the past, because the available datasets were too small, work on those problems had to rely on various, sometimes questionable heuristics. Now, the digital-data volume for many of the problems has reached a level sufficient to return to statistical approaches. Applying statistical approaches to this class of problems presents an unprecedented opportunity in the history of computing: the intersection of massive data with massive computational capability.

In addition to the possibility of solving problems that have heretofore been intractable, the massive amounts of data that are increasingly available for analysis by small and large businesses offer the opportunity to develop new products and services based on that analysis. Services can be envisioned that automate the analysis itself so that the businesses do not have to climb the learning curve themselves. The machine-learning community has many ideas for quasi-intelligent automated agents that can roam the Web and assemble a much more thorough picture of the status of any topic, at a much deeper level, than a human has the time or patience to acquire. Automated inferences can be drawn that show connections that have heretofore been unearthed only by very talented and experienced humans.

On top of the massive amounts of data being created daily and all that portends for computational needs, the combination of three elements has the potential to deliver a massive increase in real-time computational resources targeted toward end-user devices constrained by cost and power:


31 See the University of California, Berkeley, press release about this experiment (Sarah Yang, 2008, Joint Nokia research project captures traffic data using GPS-enabled cell phones, Press Release, UC Berkeley News, February 8, 2008, available online at http://berkeley.edu/news/media/releases/2008/02/08_gps.shtml).

32 Wired.com ran a piece in 2008 declaring “the end of science”: “The Petabyte Age: Because more isn’t just more—more is different,” Wired.com, June 23, 2008, available online at http://www.wired.com/wired/issue/16-07.

Copyright © National Academy of Sciences. All rights reserved.