software execute faster on new hardware, without change, but also new applications exploited advances in graphics and rendering, digital signal processing and audio, networking and communications, cryptography and security—all made possible by hardware advances. Unfortunately, single-processor performance is now increasing at much lower rates—a situation that is not expected to change in the foreseeable future.
The causes for the declining rates of chip hardware performance improvements begin with the limit on chip power consumption, which is proportional to the product of the chip clock frequency and the square of the chip operating voltage. As chip clock frequencies rose from megahertz to gigahertz, chip vendors improved fabrication processes and reduced chip operating voltages and, thus, power consumption.
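The proportionality described above can be made concrete with a small calculation. The sketch below assumes the common first-order model in which dynamic power scales as frequency times voltage squared, ignoring leakage and other effects. The function name and the sample ratios are hypothetical, chosen only for illustration; they are not figures from this report.

```python
def relative_dynamic_power(freq_ratio: float, voltage_ratio: float) -> float:
    """Return (new power / old power), assuming power is proportional to f * V**2.

    freq_ratio    -- new clock frequency divided by old clock frequency
    voltage_ratio -- new operating voltage divided by old operating voltage
    Both inputs are illustrative assumptions, not measured data.
    """
    return freq_ratio * voltage_ratio ** 2


# Hypothetical case 1: double the clock rate at an unchanged voltage.
power_same_voltage = relative_dynamic_power(2.0, 1.0)

# Hypothetical case 2: double the clock rate while cutting voltage by 30 percent.
power_lower_voltage = relative_dynamic_power(2.0, 0.7)

print(f"Same voltage:  {power_same_voltage:.2f}x power")
print(f"Lower voltage: {power_lower_voltage:.2f}x power")
```

Under these assumed ratios, doubling the frequency alone doubles power, whereas pairing it with a 30 percent voltage reduction leaves power roughly unchanged. This is the sense in which voltage reductions historically offset rising clock frequencies.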
However, it is no longer practical to increase performance via higher clock rates, due to power and heat dissipation constraints. These constraints are themselves manifestations of more fundamental challenges in materials science and semiconductor physics at increasingly small feature sizes. While the market for the highest performance server processor chips continues to grow, the market demand for phones, tablets, and netbooks has also increased emphasis on low-power, energy-efficient processors that maximize battery lifetime.
Finally, the use of additional transistors to preserve the sequential instruction execution model while accelerating instruction execution has reached the point of diminishing returns. Indeed, most of the architectural ideas that were once found only in exotic supercomputers (e.g., deep pipelines, multiple instruction issue, out-of-order instruction logic, branch prediction, data and instruction prefetching) are now commonplace within microprocessors.
The combination of these challenges—power limitations, diminishing architecture returns, and semiconductor physics challenges—drove a shift to multicore processors (i.e., placing multiple processors, sometimes of differing power or performance and function, on a single chip). By making parallelism visible to the software, this technological shift disrupted the cycle of sequential performance improvements and software evolution atop a standard hardware base.
Beginning with homogeneous multicore chips (i.e., multiple copies of the same processor core), design alternatives are evolving rapidly, driven by the twin constraints of energy efficiency and high performance. In addition, system-on-a-chip designs are combining heterogeneous hardware functions used in smartphones, tablets, and other devices. The result is a dizzying variety of parallel functionality on each chip. It is likely that even more heterogeneity will arise from expanded use of accelerators and reconfigurable logic for increased performance while simultaneously meeting power constraints.
Whether homogeneous or heterogeneous, these chips are dependent on parallel software for operation, for there is no known alternative to parallel programming for sustaining growth in computing performance. However, unlike in the sequential case, there is no universally accepted, compelling programming paradigm for parallel computing. Absent such programming models and tools, creating increasingly sophisticated applications that fully and effectively exploit parallel chips is difficult at best. Thus, there exists a great opportunity and need for renewed research on parallel algorithms and programming methodologies, recognizing that this is a challenging and long-studied problem. However, because multicore chips are dependent on parallel programming, it is prudent to continue such explorations.
Although further research in parallel programming models and tools may ameliorate this problem (e.g., via domain-specific languages, high-level libraries, and toolkits), 40 years of research in parallel computing suggests this outcome is by no means certain. When combined with the need for increasingly rapid development cycles to respond to changing demands and the rising importance of software security and resilience in an Internet-connected world, the programming challenges are daunting. In combination, the continued slowing of processor performance and the uncertainty of a parallel software future pose potential short- and long-term risks for U.S. national security and the U.S. economy. This report focuses on the competitive position of the U.S. semiconductor and software industries and their impact on U.S. national security in the new norm of parallel computing.
Global Competition and the Research Landscape
Because of this disruption to the computing ecosystem [2], major innovations in semiconductor processes, computer architecture, and parallel programming tools and techniques are all needed if we are to continue to deliver ever-greater application performance.
[2] The advanced computing ecosystem refers not only to the benefits from and interdependencies between breakthroughs in academic and industry science and engineering research and commercialization success by national, multi-national and global companies, but also to the underlying infrastructure (which includes components such as workforce; innovation models, e.g., centralist versus entrepreneurial; global knowledge networks; government leadership and investment; the interconnectedness of economies; and global markets) that underpins technological success.