growth in computing performance and meet society’s expectations for IT. (See Box 5.1 for additional context regarding other aspects of computing research that should not be neglected while the push for parallelism in software takes place.)
The findings and results described in this report represent a serious set of challenges not only for the computing industry but also for the many sectors of society that depend on advances in IT and computation. The findings also pose challenges to U.S. competitiveness: a slowdown in the growth of computing performance will have global economic and political repercussions. The committee has developed a set of recommended actions aimed at addressing the challenges, but the fundamental power and energy constraints mean that even our best efforts may not offer a complete solution. This chapter presents the committee’s recommendations in two categories: research—the best science and engineering minds must be brought to bear; and practice—how we go about developing computer hardware and software today will form a foundation for future performance gains. Changes in education are also needed; the emerging generation of technical experts will need to understand quite different (and in some cases not yet developed) models of thinking about IT, computation, and software.
In light of the inevitable trend toward parallel architectures and emerging applications, one must ask whether existing applications are algorithmically amenable to decomposition on any parallel architecture. Algorithms based on context-dependent state machines do not decompose easily into parallel work. Applications built on such algorithms have long existed and are likely to grow in importance as security needs grow. Even so, those applications contain a large amount of throughput parallelism, in that a data center typically must process many such independent tasks simultaneously.
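The distinction can be illustrated with a minimal sketch. The toy task below (a parser tracking nesting depth) is hypothetical, chosen only because a state machine must scan its input sequentially; yet a server handling many independent inputs can still exploit throughput parallelism by running one task per worker:

```python
# A minimal sketch of throughput parallelism: each task is a
# context-dependent state machine that must run sequentially over its
# own input, but independent tasks can be processed concurrently.
from concurrent.futures import ThreadPoolExecutor

def run_state_machine(stream: str) -> int:
    """Sequential scan: the state depends on all prior symbols."""
    depth = max_depth = 0
    for ch in stream:
        if ch == '(':
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ')':
            depth -= 1
    return max_depth

# A data center sees many such independent requests at once; a worker
# pool runs each inherently sequential task in parallel with the others.
requests = ["(())", "((()))", "()", "(()())"]
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_state_machine, requests))
print(results)  # -> [2, 3, 1, 2]
```

No single task here speeds up: parallelism improves aggregate throughput, not per-task latency, which is precisely the form of parallelism available to these workloads.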
At the other extreme are applications with obvious parallelism to exploit. In the vast majority of those applications, the abundant parallelism in the underlying algorithms is data-level parallelism. One simple example of data-level parallelism in mass-market applications is two-dimensional (2D) and three-dimensional (3D) media processing (image, signal, graphics, and so on), which has an abundance of primitives (such as blocks, triangles, and grids) that need to be processed simultaneously. Continuous growth in the size of input datasets (from the text-heavy Internet of the past to today's 2D-media-rich Internet applications to emerging 3D