2
Enabling Technologies

To understand the forces shaping networked systems of embedded computers, it is useful to look at some of their underlying technologies—the devices used to compute, communicate, measure, and manipulate the physical world. The trends in these devices are what make EmNets such a compelling and interesting research question at this time. Current components already make large EmNets feasible, and as these components continue to evolve, EmNets will soon become essential, even dominant, parts of both the national and global infrastructure.

Through the economics of silicon scaling, computation and communication are becoming inexpensive enough that if there is any value to be derived from including them in a product, that inclusion will probably happen. Unfortunately, while these “standard” components will enable and drive EmNets into the market, without careful research the characteristics that emerge from these collections of components may not always be desirable. EmNets present many new issues at both the component and system level that do not need to be (and have not been) addressed in other contexts.

This chapter provides a brief overview of the core technologies that EmNets use, the trends that are driving these technologies, and the new research areas that would greatly accelerate the creation of EmNet-tailored components. Because the scaling of silicon technology is a major driver of computing and communication, this chapter starts by reviewing silicon scaling and then looks at how computing and communication devices take advantage of scaled technologies.



In communications technology, attention is focused on wireless communications technology, since this will be an essential part of many EmNets, and on wireless geolocation technology, since geographic location is a factor in many EmNets. The remaining sections review other components critical to EmNets, namely, the software systems that make EmNets work and MEMS, the new way to build low-cost sensors and actuators. Scattered throughout the chapter are boxes that provide more details on many of the technologies discussed. Readers who are already well versed in these subject areas or who are more interested in understanding the systems-level issues that arise in EmNets should move on to Chapter 3.

SILICON SCALING

Much of the driving force for the technological changes seen in recent years comes from the invention of integrated circuit technology. Using this technology, electronic components are “printed” on a piece of silicon, and over the years this process has been improved so that the printed components have become smaller and smaller. The ability to “scale” the technology simultaneously improves the performance of the components and decreases their cost, both at an exponential rate. This scaling has been taking place for over 40 years, giving rise to eight orders of magnitude change in the size and cost of a simple logic element, from chips with two transistors in the 1960s to chips with 100 million transistors in 2001. Scaling not only decreases the cost of the devices, it also improves the performance of each device, with respect to both delay and the energy needed to switch the device. During this same 40 years, gates1 have become 1000 times faster, and the power required per gate has dropped more than 10,000-fold. This scaling is predicted to continue for at least another 10 to 20 years before it eventually reaches some fundamental technical and economic limit (Borkar, 1999). Silicon scaling continues to reduce the size, cost, and power and to improve the performance of electronic components.

1  A logic gate (“gate”) is the elementary building block of a digital circuit.
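The eight-orders-of-magnitude figure follows directly from the transistor counts above, treating chip area and cost as roughly constant over the period (a simplification, since both have in fact grown somewhat):

    \[
    \frac{10^{8}\ \text{transistors per chip (2001)}}{2\ \text{transistors per chip (1960s)}} = 5 \times 10^{7},
    \]

that is, the silicon area and cost of a simple logic element have fallen by nearly eight orders of magnitude.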

Reliability of the basic electronics has also improved significantly. Vacuum-tube electronics were limited by the poor reliability of the tubes themselves—filaments burned out regularly and interconnections were generally made by hand-soldering wires to sockets. Transistors were much more reliable due to cooler operating temperatures and the absence of filaments, but there were still huge numbers of soldered interconnects. As integrated circuits have subsumed more and more functionality, they have also subsumed huge amounts of interconnections that are generally much more reliable than soldered pins on a printed circuit board.

Coupling this manufacturing process to the notion of a computer has driven a huge industry. For example, mainframe computers that occupied rooms in the 1980s now can fit on a single chip and can operate faster and at much lower power than the older systems. The scaling of technology has not only enabled the building of smaller, faster computers, it has made computing so cheap that it is economical to embed computing inside devices that are not thought of as computers to increase their functionality. It is this rapidly decreasing cost curve that created and continues to expand a huge market for embedded computing, and as this same technology makes communication cheaper, it will allow the embedded computers to talk with each other and the outside world, driving the creation of EmNets. Just as electronic locks seem natural now (and soon it will be hard to imagine a world without them), it will soon seem natural for embedded systems inside devices that are not typically thought of as computers to communicate with each other.

COMPUTING

The ability to manufacture chips of increasing complexity creates a problem of its own: design cost. While design tools continue to improve, both the number of engineers needed to design a state-of-the-art chip and the cost of said chip continue to grow, although more slowly than chip complexity. These costs add to the growing expense of the initial tooling to produce a chip, mainly the cost of the masks (“negatives”) for the circuits to be printed—such masks now cost several hundred thousand dollars. Thus, chips are inexpensive only if they are produced in volumes large enough to amortize such large design costs.

The need for large volumes poses an interesting dilemma for chip designers, since generally as a device becomes more complex, it also becomes more specialized. The most successful chips are those that, while complex, can still serve a large market. This conflict is not a new one and was of great concern at the dawn of the large-scale integration (LSI) era in the 1970s. The solution then was to create a very small computer, or microprocessor, and use it with memory to handle many tasks in software that had previously required custom integrated circuits. This approach really created embedded computing, since it provided the needed components for these systems. Over the years the microprocessor was an essential abstraction for the integrated circuit industry, allowing it to build increasingly complex components (processors and memory) that could be used for a wide variety of tasks.
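To put the amortization point above in concrete terms (illustrative numbers: the text gives only “several hundred thousand dollars” for a mask set, so take $300,000 as a round figure), tooling cost per chip falls in direct proportion to volume:

    \[
    \frac{\$300{,}000}{1{,}000{,}000\ \text{chips}} = \$0.30\ \text{per chip}, \qquad
    \frac{\$300{,}000}{10{,}000\ \text{chips}} = \$30\ \text{per chip},
    \]

and the much larger design-labor costs amortize in exactly the same way, which is why a complex chip is economical only when it can serve a very large market.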

Over time, these processors have become faster, and they are now the key component in all computers, from Internet-enabled cell phones to mainframe servers. The evolution of microprocessors over the past three decades has been unprecedented in the history of technology. While maintaining roughly the same user model of executing a sequential stream of instructions, these machines have absorbed virtually all of the extra complexity that process scaling provided them and converted it to increased performance. The first microprocessor was the Intel 4004, developed in 1971; it had 2300 transistors and ran at 200 kHz. A mere 30 years later, the Pentium 4 processor has almost 42 million transistors and runs at 1.7 GHz. Computer architects have leveraged the increased number of transistors into increased performance, increasing processor performance by over four orders of magnitude (see Box 2.1).

Growing Complexity

Increasing processor performance has come at a cost, in terms of both the design complexity of the machines and the power required by the current designs (on the order of 10 to 100 W). The growing complexity is troubling. When does the accumulating logical complexity being placed on modern integrated circuits cause enough errors in design to begin to drive overall system reliability back down? This is not a trivial concern in an era where volumes may be in the tens or hundreds of millions and failures may be life threatening. Another problem with the growing complexity is the growing cost to design these machines. New microarchitectures such as that for Intel’s Pentium 4 processor require a design team of several hundred people for several years, an up-front investment of hundreds of millions of dollars.

Also of growing concern is the fact that continuing to scale processor performance has become increasingly difficult with time. It seems unlikely that it will be possible to continue to extract substantially more parallelism at the instruction level: The easy-to-reach parallelism has now been exploited (evidence of this can be seen in Figure 2.1), and the costs in hardware resources and implementation complexity are growing out of all proportion to additional performance gains. This means that the improvement in instructions per clock cycle will slow. Adding to that concern, it also seems unlikely that clock frequency will continue to scale at the current rate. Unless a breakthrough occurs in circuit design, it will become very difficult to decrease clock cycle times beyond basic gate speed improvements. Overall microprocessor performance will continue to grow, but the rate of improvement will decrease significantly in the near future.
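A rough check of the four-orders-of-magnitude claim, using the 4004 and Pentium 4 figures above: the raw clock rate and transistor budget grew by

    \[
    \frac{1.7\ \text{GHz}}{200\ \text{kHz}} = 8{,}500 \qquad \text{and} \qquad
    \frac{42 \times 10^{6}\ \text{transistors}}{2{,}300\ \text{transistors}} \approx 18{,}000 ,
    \]

and because architects also raised the number of instructions completed per clock cycle (see Box 2.2), the cumulative performance gain comfortably exceeds a factor of 10,000.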

BOX 2.1  Communication Is Costly in Complex Designs

The dominant technology used to build integrated circuits is complementary metal-oxide semiconductor (CMOS) technology. As the integrated circuit shrinks in size, the characteristics of the basic transistors improve—they speed up. Historically the speed of a basic CMOS gate has been roughly proportional to its size. This performance increase will continue, although various problems might slow the rate of improvement in the future (SIA, 1999).

In addition to gates, the other key component on an integrated circuit is the wire that connects the gates. The scaling of wires is more complex than that of the gates and has led to some confusion about how the performance of circuits will scale in the future. As technology scales, the delay of a wire (the length of time it takes for a signal to propagate across the wire) of constant length will almost certainly increase. At first glance this seems like a huge problem, since gate delays and wire delays are moving in opposite directions. This divergence has led a number of people to speak of wire-limited performance. The key point is that, as technology scales, a wire of a given length spans a larger number of gates than the wire in an older technology, since all the gates are smaller. A circuit that was simply scaled to the new technology would also shrink in length, since everything has shrunk in size. The amount of delay attributable to this scaled wire is actually less than that of the original wire, so wire delay decreases just as gate delay does. While the wire delay does not scale down quite as fast as the gate, the difference is modest and should not be a large problem for designers.

One way of viewing the wire delay is to realize that in any given technology the delay of a wire that spans more gates is larger than the delay of a wire that spans fewer gates. Communicating across larger designs (that is, designs with more gates per unit area) is more expensive than communicating across smaller designs. Technology scaling enables larger designs to be built but does not remove the communication cost for these complex designs. So, scaling does not make wire performance proportionally worse per se; rather it enables a designer to build a more complex system on a chip. The large communication delays associated with systems are starting to appear on chips. These growing communication costs of today’s large complex chips are causing people to think about smaller, more partitioned designs, and they are one driver of simpler embedded computing systems.
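One first-order way to quantify the cost of long wires (a textbook approximation, not specific to any particular process): an unrepeated on-chip wire behaves like a distributed RC line whose resistance and capacitance each grow linearly with its length ℓ, so its delay grows roughly as the square of that length,

    \[
    t_{\text{wire}} \;\propto\; R_{\text{wire}} \, C_{\text{wire}} \;\propto\; \ell \cdot \ell \;=\; \ell^{2} .
    \]

A signal that must cross a design containing many more gates therefore pays a disproportionate penalty when measured in gate delays, which is why large single-chip designs face growing communication costs even though short, locally scaled wires keep pace with the gates.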

FIGURE 2.1  Instructions executed per cycle.

Simpler Processors

Up to this point the focus has been on the highest performance processors, but technology scaling has also enabled much simpler processors to have more than sufficient performance.2 Rather than adding complexity in order to wrest better performance from the chip, it is possible to use the added transistors for other functions, or not use them at all, making the chip smaller and cheaper and, as will be seen in the next section, less power consuming. It is these “simpler” processors that are used in most embedded systems, since they often do not need the highest performance. For many applications, the extra complexity can be and is used to interface to the outside world and to reduce the amount of off-chip memory that is needed, which lowers the system cost. As technology scales, these simpler processors have gotten faster, even if the design does not use more transistors, simply because the gates have become faster. Often a slightly more complex architecture is used, since it is now cheap enough.

2  The words “simple” and “complex” are not used here as a shorthand reference to the Reduced Instruction Set Computing versus Complex Instruction Set Computing (RISC vs. CISC) debates of the 1980s. They refer to the complexity of a computer’s microarchitecture and implementation, not its instruction set.

This scaling trend in the embedded processor space has dramatically increased the performance of the processors being deployed and will continue to do so (see Box 2.2). The fastest embedded processors have a processing power that is within a factor of four of today’s desktop processors (e.g., an 800-MHz StrongArm processor compared with a 1.5-GHz Pentium 4), but most embedded processors have performance that is an order of magnitude worse. With increased processing power comes the ability to build more sophisticated software systems with enough cycles to support various communication protocols. The existence of very cheap cycles that can support richer environments is another factor pushing EmNets into existence.

Power Dissipation

Power dissipation in general-purpose central processing units (CPUs) is a first-order constraint, requiring more expensive power supplies and more expensive cooling systems, making CPU packages more expensive; it may even affect the final form factor of the computer system.3 Power has always been constrained in embedded systems, because such systems typically cannot afford any of the remedies mentioned above. For example, the controller in a VCR cannot require a large power supply, cannot have a fan for cooling, and cannot make the VCR taller than such products would otherwise be.

3  For example, microprocessors that dissipate too much heat may require very large fans or heat sinks for cooling. If that physical package is too large, it may be impossible to realize a server in a one-unit-high form factor, drastically reducing the modularity and scalability of the design.

There are two major strategies for taking advantage of the benefits of new processor technology: maximize performance or minimize power. For each new technology, the power needed to supply the same computation rate drops by a factor of three (see Box 2.3). The reason that general-purpose microprocessor power increases with each new generation is that performance is currently valued more than cost or power savings, so increased performance is preferred in the design process over decreased power requirements.
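Where the factor of three comes from can be seen with a rough constant-field scaling argument (an idealization; real processes deviate from it): if feature sizes and supply voltage both shrink by a factor s ≈ 0.7 per generation, the capacitance C and voltage V of each gate scale with s, so the energy of each switching event scales as

    \[
    E_{\text{switch}} \;\propto\; C V^{2} \;\Rightarrow\; \frac{E'}{E} = s \cdot s^{2} = s^{3} \approx 0.34 \qquad (s \approx 0.7),
    \]

and running the same logic at the same rate therefore needs roughly one-third the power. A design team can bank that saving as lower power or spend it (and more) on higher clock rates and larger, more complex cores, which is the choice general-purpose microprocessors have made.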

As power has become more important in complementary metal-oxide semiconductor (CMOS) designs, designers have developed a number of techniques and tools to help them reduce the power required. Since in CMOS much of the power is used to transition the value on a wire, many of the techniques try hard to ensure a signal is not changed unless it really should be and to prevent other ways of wasting power. The power saving ranges from simply turning off the processor or system when the machine is inactive, a technique that is used in almost all portable systems, to careful power control of individual components on the chip. In addition, power is very strongly related to the performance of the circuit. A circuit can almost always be designed to require less energy to complete a task if given more time to complete it. This has recently led to a set of techniques that dynamically reduce performance to no more than the task requires in order to minimize the power used.4 Two recent examples of this are the Transmeta Crusoe processor (Geppert and Perry, 2000) and the Intel Xscale processor (Clark et al., 2001).

4  See DARPA’s Power Aware Computing/Communication Program for more information on work related to this problem. Available at <http://www.darpa.mil/ito/research/pacc/>.

BOX 2.2  Microprocessor Program Performance

While scaling technology allows the building of faster gates, it primarily allows the construction of designs that contain many more gates than in previous iterations. Processor designers have been able to convert larger transistor budgets into increased program performance. Early processors had so few transistors that function units were reused for many parts of the instruction execution.1 As a result it took multiple cycles for each instruction execution. As more transistors became available, it became possible to duplicate some key functional units, so each unit could be used for only one stage in the instruction execution. This allowed pipelining the machine and starting the next instruction execution before the previous one was finished. Even though each instruction took a number of cycles to complete execution, a new instruction could be started every cycle. (This sort of pipelining is analogous to a car wash. It is not necessary to wait until the car ahead exits the car wash before introducing a new car; it is only necessary to wait until it has cleared the initial rinse stage.)

As scaling provided more transistors, even more functional units were added so machines could start executing two instructions in parallel. These machines were called superscalar to indicate that their microarchitectures were organized as multiple concurrent scalar pipelines. The problem with a superscalar machine is that it runs fast as long as the memory system can provide the data needed in a timely fashion and there are enough independent instructions to execute. In many programs neither of these requirements holds. To build a fast memory system, computer designers use caches2 to decrease the time to access frequently used data. While caches work well, some data will not be in the cache, and when that happens the machine must stall, waiting for the data to be accessed. A so-called out-of-order machine reduces this delay by tracking the actual data-flow dependency between instructions and allowing the instructions to execute out of program order.

In other words, the machine finds other work to do while waiting for slow memory elements. While much more complex than a simple superscalar machine, out-of-order processing does expose more parallelism and improves the performance of the processor. Each architectural step—pipelining, superscalar execution, out-of-order execution—improves the machine performance roughly 1.4-fold, part of the overall threefold performance improvement.

Figure 2.1 plots a number proportional to the number of instructions executed each cycle for six generations of Intel processors. The data clearly show that increasing processor complexity has improved performance. Figure 2.2 gives the clock rate of these same processors; it shows a roughly twofold increase in frequency for each generation. Since a scaled technology comes out roughly every 3 years, a factor of 1.4 of the overall performance increase comes from this improvement in speed. The remaining factor of 1.4, which comes from improvements in the circuit design and microarchitecture of the machine, is illustrated in Figure 2.3. This shows how many gates one can fit in each cycle and how this number has been falling exponentially, from over 100 in the early 1980s to around 16 in the year 2000. The decrease has been driven by using more transistors to build faster function units and by building more deeply pipelined machines. Multiplying these three factors of 1.4 together yields the threefold processor performance improvement observed. It should be noted that recent designs, such as the Pentium III and Pentium 4 chips, have not been able to achieve the increases in parallelism (instructions per cycle) that contributed to the threefold increase. This provides some concrete evidence that uniprocessor performance scaling is starting to slow down.

1  An adder, for example, might have been used to generate the instruction address and then reused to do the operation or generate the data address.
2  In this instance, a cache is a temporary storage place for data on the chip that allows much faster retrieval than accessing the data in memory.

The drive for low power causes a dilemma. (See Box 2.4 for a discussion of micropower sources for small devices.) While processor-based solutions provide the greatest flexibility for application development, custom hardware is generally much more power efficient. Early work in low-power design by Brodersen et al. (1992) and others showed that for many applications, custom solutions could be orders of magnitude lower in power requirements than a general-purpose processor. This is unfortunate, since the economics of chip production, as described earlier, make it unlikely that most applications could afford to design custom chips unless the design process becomes much cheaper. There are a couple of clear reasons why custom chips need less power. Their main advantage is that they are able to exploit the parallelism in the application. While exploiting parallelism is usually considered a way to increase performance, performance and power are related: one can take a higher-performance system and make it lower power.
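A stylized example of the parallelism argument (illustrative numbers, and assuming the supply voltage can in fact be lowered once the clock is relaxed): dynamic CMOS power goes roughly as P ∝ CV²f. Replacing one functional unit running at frequency f with two units each running at f/2 keeps throughput constant while doubling the switched capacitance; if the slower units can also run at about 0.7 of the original supply voltage, then

    \[
    \frac{P_{\text{parallel}}}{P_{\text{serial}}} \;\approx\; \frac{(2C)\,(0.7V)^{2}\,(f/2)}{C\,V^{2}\,f} \;\approx\; 0.5 ,
    \]

roughly halving power at the same throughput. Custom hardware can apply this kind of restructuring throughout a design, which is one source of the order-of-magnitude advantages reported by Brodersen et al. (1992).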

FIGURE 2.2  Clock rate of various processors.
FIGURE 2.3  Gates per cycle.

In addition to parallelism, custom solutions have lower overheads in executing each function they perform. Since the function is often hard wired, there is no need to spend energy to specify the function. This is in contrast to a processor that spends a large amount of its power figuring out what function to perform—that is, determining what instructions to fetch and fetching them (see Gonzalez and Horowitz, 1996). As mentioned earlier, the downside of these custom solutions is their complexity and the cost of providing a new solution for each application. This conflict between good power-efficiency and flexibility leads to a number of interesting research questions about how to build the more general, power-efficient hardware that will be needed for EmNets. Some researchers are trying to generalize a custom approach,5 while others are trying to make a general-purpose parallel solution more power efficient.6 The best way to approach this problem is still an open question.

5  See, for example, the work being done at the Berkeley Wireless Research Center, available at <http://bwrc.eecs.berkeley.edu/> or at the company Tensilica, <http://tensilica.com/>.
6  See, for example, the work being done at the Stanford Smart Memories Project, available at <http://www-vlsi.stanford.edu/smart_memories/> or at the company ARC, <http://www.arccores.com/>.

COMMUNICATION

As discussed earlier, it is very clear that silicon scaling has made computation very cheap. These changes in technology have also driven the cost of communication down for both wireline and wireless systems. The continued scaling of CMOS technology enables cheap signal processing and low-cost radio frequency circuits. This has been evident in the past several years with the rapid expansion of wireless networking technology, first into the workplace and now into the home (e.g., wireless Ethernet and Apple Airport), which permits laptops and tablets to have a locally mobile high-speed network connection. As the technology improves, more sophisticated coding and detection algorithms can be used, which either decrease the power or increase the bandwidth of the communication. Soon it will be possible to place a low-cost wireless transceiver on every system built, a development that would seem to make it inevitable that these embedded systems will be networked. One constraint is that while bandwidth is increasing and cost is decreasing, the power demands are not becoming significantly lower. Communication

[…]

and remains a challenge, as new functionality must be added without adversely affecting response. A real-time operating system must enable applications to respond to stimuli in a deterministic amount of time, known as the latency. The actual amount of time is dependent on the application, but the determinism requirement is nonnegotiable. All design decisions in the operating system must therefore optimize system latency. This stands in contrast to most desktop and server operating systems, which are optimized for throughput and for protection of multiple processes, with latency far less important. Critical design decisions as basic as system data structures (queues, tables, etc.), memory protection and paging models, and calling semantics are driven by these very different optimization requirements, making it difficult or impossible to “add” real time to an operating system that was not designed from the beginning with that as a core requirement.

Like any modern operating system, most real-time embedded operating systems are multitasking. Unlike most desktop and server operating systems, however, embedded operating systems are split between those systems in which there are multiple processes, each residing in its own memory, and those in which all tasks live in the same memory map, with or without protection from one another. Furthermore, new systems are beginning to appear based on entirely different memory protection models, such as protection domains. Some of the issues that arise in embedded systems with respect to memory management, tasks, and scheduling are described in Box 2.14.

MICROELECTROMECHANICAL SYSTEMS

Microelectromechanical systems, or MEMS, had their start in a famous talk by the physicist Richard Feynman entitled “There’s Plenty of Room at the Bottom” (Feynman, 1960; Trimmer, 1997). Feynman pointed out that tremendous improvements in speed and energy requirements, as well as in device quality and reliability, could be had if computing devices could be constructed at the atomic level. MEMS represent the first steps toward that vision, using the best implementation technology currently available: the same silicon fabrication that is used for integrated circuits. MEMS devices generally attempt to use mechanical properties of the device, in conjunction with electronic sensing, processing, and control, to achieve real-world physical sensing and actuation. The accelerometers in modern cars with airbags are MEMS devices; they use tiny cantilever beams as the inertial elements and embody the extreme reliability required of such an application.

Other MEMS devices take advantage of the wave nature of light, incorporating regular patterns of very fine comb structures, arranged to refract light in useful ways under mechanical control. A Texas Instruments MEMS device is the heart of a projector in which each pixel is the light bounced off one of millions of tiny mirrors, hinged such that the amounts of red, green, and blue light can be independently controlled. Microfluidics is an emerging MEMS application in which the fluid capillaries and valves are all directly implemented on a silicon chip and controlled via onboard electronics. Still other MEMS devices implement a membrane with a tunneling current sensor for extremely precise measurements of pressure. The combination of MEMS sensing plus the computation horsepower of embedded processors opens the way to large networks of distributed sensing plus local processing, with communication back to central synthesis engines for decision making. However, there are challenges to be overcome before MEMS can realize this promise.

BOX 2.11  Upgradability

Traditionally, most embedded devices, once deployed, have rarely been upgraded, and then only very proactively and carefully, for instance by physically replacing read-only memory (ROM). In a world of networked embedded systems, and with rewritable, nonvolatile storage widely available, field upgrades will be more frequent and often far more invisible to end users of the systems.1 This will occur because EmNets may be in service for many years, and the environment to which they are connected and the functionality requirements for the device may change considerably over that time. In some cases, such upgrades are driven by a knowledgeable user, who purchases a new component of functionality and installs it, a nearly automatic procedure. In other cases, updates or upgrades may be invisible to the end user, such as when protocols or device addresses change. Devices like home gateways, automobiles, and appliances may be upgraded online without the consumer ever knowing about it and in ways well beyond the consumer’s understanding, raising the issue of usability and transparency to the user.

Transparent software upgrade of deployed EmNets, while probably necessary and inevitable, presents a number of difficulties. The very fact that the upgrades are transparent to the end user raises troubling questions of who has control of the EmNet (the user or the upgrader?) and creates potential security and safety issues if such an upgrade is erroneous or malicious. What if the software is controllable or upgradable by parties that are not to be trusted? Further difficulty is caused by the heterogeneity of many EmNets. Many individual nodes may need to be upgraded, but those nodes may be based on different hardware and/or different operating systems. Deploying an upgrade that will work reliably across all these nodes and EmNets is a challenge closely related to the code mobility issues discussed in Chapter 3.

Finally, there may be simultaneity requirements—that is, all nodes in an EmNet, which may be widely dispersed geographically, may need to be upgraded at the same time. This requirement may need to be addressed by multistage commits, similar to those used in transaction processing.

Online update is largely an application issue rather than an operating system issue. However, most system designers will expect the operating system to make the task easier and to handle some difficult problems like upgrade policy, verification, and security. Furthermore, in some cases the operating system itself may need to be field upgraded, a process that almost certainly requires operating system cooperation and that extends beyond the device being updated. A server infrastructure is required to set policies, supply the correct information to the correct devices, manage security of the information, and verify correctness. This infrastructure is likely to be supplied by a few providers, akin to Internet Service Providers (ISPs) or Application Service Providers (ASPs), rather than to be created anew for each individual deployed product. As of 2001, there is no consensus on how online field upgrade will work for the billions of networked embedded systems components that will be deployed, nor is there any significant move toward applicable standards. Field upgrade is likely to become an important focus of research and development work over the next several years as numerous systems are deployed that challenge the ability of simple solutions to scale up to adequate numbers and reliability.

1  The problem of field upgradability of EmNet elements is similar to the problem encountered in downloading software for software-defined radios, which is being studied by a number of companies and the SDR (Software Defined Radio) Forum, a de facto standards organization.

One challenge is in the nature of real-world sensing itself: It is an intrinsically messy business. A MEMS device that is attempting to detect certain gases in the atmosphere, for instance, will be exposed to many other gases and potential contaminants, perhaps over very long periods of time and with no maintenance. Such devices will have to be designed to be self-monitoring and, if possible, self-cleaning if they are to be used in very large numbers by nonexperts. The aspects of silicon technology that yield the best electronics are not generally those that yield the best MEMS devices. As has been discussed, smaller is better for electronics. Below a certain size, however, MEMS devices will not work well: A cantilever beam used for sensing acceleration is not necessarily improved by making it smaller. Yet to meet the low cost needed for large numbers of sensing/computing/reporting devices, the MEMS aspects and electronics will have to be fabricated onto the same silicon.

Much work remains to find useful MEMS sensors that can be economically realized on the same silicon as the electronics needed for control and communication.

BOX 2.12  High Availability and Fault Tolerance

Many EmNets must work continuously, regardless of hardware faults (within defined limits) or ongoing hardware and software maintenance, such as hardware or software component replacement. Reliability in an unreliable and changeable environment is usually referred to as high availability and fault tolerance (HA/FT). HA/FT may require specialized hardware, such as redundant processors or storage. The operating system plays a key role in HA/FT, including fault detection, recovery, and management; checkpoint and fail-over mechanisms; and hot-swap capability for both hardware and software. Applications also need to be designed with HA/FT in mind. A layer between the application and the operating system that checks the health of the system and diagnoses what is wrong can be used to control the interaction between the two.

HA/FT systems have not been widely used; instead, they tend to have niches in which they are needed, such as banking, electric power, and aircraft. Those who need them, often communications equipment manufacturers, have built them in a proprietary fashion, generally for a specific product. The first portable, commercial embedded HA/FT operating systems, as well as reusable components for fault management and recovery, are just starting to become available,1 but they have not yet been widely deployed in a general-purpose context. EmNets will very likely be used in a variety of contexts, and transferring HA/FT capabilities to EmNets is a challenge the community must meet.

1  As examples, see Wind River’s VxWorks AE at <http://www.windriver.com/products/html/vxworksae.html>, Enea’s OSE Systems at <http://www.enea.com/>, and LynuxWorks at <http://www.lynuxworks.com/>.

SUMMARY

This chapter has provided a brief overview of the core technologies that EmNets will use, the trends that are driving these technologies, and the research areas that will accelerate the widespread implementation of EmNets. It has argued that silicon scaling, advances in computing hardware, software, and wireless communications, and new connections to the physical world such as geolocation and MEMS will be the technological building blocks of this new class of large-scale system.

Large systems will comprise thousands or even millions of sensing, computing, and actuating nodes. The basic trends are clear: These large, inexpensive, highly capable systems are becoming feasible because of the cumulative effects of silicon scaling—as ever-smaller silicon feature sizes become commercially available, more and more transistors can be applied to a task ever more cheaply, thus bringing increasingly capable applications within economic range.

BOX 2.13  Ability to Work with New Hardware

Software needs hardware, and the nature of hardware is changing. For decades, the relationship between hardware and software has been well defined. Computer architectures, whether microprocessor or mainframe, have changed slowly, on a time scale of many years. Software has resided in random access memory (RAM) or read-only memory (ROM) and has been executed on an arithmetic logic unit (ALU) on the processor in the computer. New developments in the hardware world will challenge some of the assumptions about this relationship.

Multicore processors—multiple concurrent processing elements on a single chip—are becoming economical and common. They often include a single control processor and several simpler microengines specifically designed for a task such as networking or signal processing. Thus, a microprocessor is no longer a single computer but is becoming a heterogeneous multiprocessing system. Configurable processors, created with tools from companies such as ARC and Tensilica, make it very easy for a user to craft a custom microprocessor for a specific application. These tools can create real performance advantages for some applications. Programmable logic chips are growing larger, with millions of gates becoming available; they are also available in combination chips, which include a standard CPU core and a significant number of programmable gates. These make it possible to create multiple, concurrent processing elements and reconfigure continuously to optimize processing tasks.

All of these advances hold great promise for performance, cost, and power efficiency, but all create real challenges for software. Applications and operating systems must be able to perform well in reconfigurable, multiprocessing environments. New frameworks will be required to make efficient use of reconfigurable processing elements. Interestingly, all of these advances put compilers and programming languages back in the forefront of software development.1

1  For examples of this kind of work, see the Oxygen Project at MIT, <http://oxygen.lcs.mit.edu/>, and the Ptolemy Project at Berkeley, <http://ptolemy.eecs.berkeley.edu/>.

BOX 2.14  Operating Systems and EmNets

A multiprocess system uses virtual memory to create separate memory spaces in which processes may reside, protected from each other. A multitasking operating system usually implies that all tasks live in the same memory map, which comes with its own host of security implications. Since many embedded systems have no virtual memory map capability, these simpler systems are prevalent for many applications. A multitask system can also run much faster, since the operating system does not need to switch memory maps; this comes at the cost of less protection between running tasks, however. Those memory-map switches can make determinacy difficult, since all planning must take place around worst-case scenarios entailing significant swapping of page tables.

A further concern is preemption. Preemption occurs when the system stops one task and starts another. The operating system must perform some housekeeping, including saving the preempted task’s state, restoring the new task’s state, and so on. The time it takes to move from one task to another is called the preemptive latency and is a critical real-time performance metric. Not all embedded operating systems are preemptive. Some are run-to-completion, which means that a task is never stopped by the operating system. This requires the tasks to cooperate, for instance by reaching a known stopping point and then determining whether other tasks need to run. Run-to-completion operating systems are very small, simple, and efficient, but because most of the scheduling and synchronization burden is pushed to the individual tasks, they are only applicable to very simple uses (a minimal sketch of this model appears after Table 2.1).

Almost all embedded operating systems assign each task a priority, signifying its importance. In a preemptive system, the highest priority task that is ready is always running. These priorities may change for a number of reasons over time, either because a task changed a priority explicitly or because the operating system changes it implicitly in certain circumstances. The algorithms by which the operating system may change task priorities are critical to real-time performance, but they are beyond the scope of this study. Preemptive real-time embedded operating systems vary significantly in performance according to the various decisions made—both overt (multitask vs. multiprocess, number of priorities, and so on) and covert (structure of the internal task queue, efficiency of the operating system’s code). Unfortunately, there are no standard benchmarks by which these systems are measured.

Even commonly used metrics, such as preemptive latency, interrupt latency, or time to set a semaphore, can be very different because there is no universal agreement on precisely what those terms mean. When the application is added to the system, the resulting behavior is very complex and can be difficult to characterize. It may be very difficult to understand how settable parameters, such as task priority, are affecting system behavior. There are a number of methodologies, however, that can help with these problems.

Other considerations beyond real-time execution and memory management emerge in EmNets. Numerous efforts address the real-time executive aspects, but current real-time operating systems do not meet the needs of EmNets. Many such systems have followed the performance growth of the wallet-size device. Traditional real-time embedded operating systems include VxWorks, WinCE, PalmOS, and many others. Table 2.1, taken from Hill et al. (2000), shows the characteristics for a handful of these systems. Many are based on microkernels that allow for capabilities to be added or removed based on system needs. They provide an execution environment that is similar to that of traditional desktop systems. They allow system programmers to reuse existing code and multiprogramming techniques. Some provide memory protection, as discussed above, given the appropriate hardware support. This becomes increasingly important as the size of the embedded applications grows. These systems are a popular choice for PDAs, cell phones, and television set-top boxes. However, they do not yet meet the requirements of EmNets; they are more suited to the world of embedded PCs, requiring a significant number of cycles for context switching and having a memory footprint on the order of hundreds of kilobytes.1

There is also a collection of smaller real-time systems, including CREEM, pOSEK, and Ariel, which are minimal operating systems designed for deeply embedded systems, such as motor controllers or microwave ovens. While providing support for preemptive tasks, they have severely constrained execution and storage models. pOSEK, for example, provides a task-based execution model that is statically configured to meet the requirements of a specific application. However, these systems tend to be control-centric—controlling access to hardware resources—as opposed to data-flow-centric. Berkeley’s TinyOS2 is focused on satisfying the needs of EmNets. Additional research and experimentation are needed to develop operating systems that fit the unique constraints of EmNets.

1  Unfortunately, while there is a large amount of information on code size of embedded operating systems, very few hard performance numbers have been published.
2  For more information, see <http://tinyos.millennium.berkeley.edu/>.

TABLE 2.1  Characteristics of Some Real-time Embedded Operating Systems

  Name            Preemption   Protection   ROM Size    Configurability   Targets*
  pOSEK           Tasks        No           2K          Static            Microcontroller
  pSOSystem       POSIX        Optional                 Dynamic           PII → ARM Thumb
  VxWorks         POSIX        Yes          ~286K       Dynamic           Pentium → Strong ARM
  QNX Neutrino    POSIX        Yes          >100K       Dynamic           Pentium II → NEC chips
  QNX Real-time   POSIX        Yes          100K        Dynamic           Pentium II → 386s
  OS-9            Process      Yes                      Dynamic           Pentium → SH4
  Chorus OS       POSIX        Optional     10K         Dynamic           Pentium → Strong ARM
  Ariel           Tasks        No           19K         Static            SH2, ARM Thumb
  CREEM           Data flow    No           560 bytes   Static            ATMEL 8051

  *The arrows in this column are used to indicate the range of capabilities of the targets.

There are also some countervailing trends, in the form of constraints: Communication is costly, both on-chip and between chips; there are problems looming in the areas of power dissipation, battery life, and design complexity; and many of the areas known to be problematic for today’s systems are likely to be substantially more problematic with EmNets. Networking solutions that work well enough for today’s systems are based on many assumptions that are inappropriate for EmNets. For instance, the potentially huge number of nodes, the ad hoc system extensions expected, the extended longevity, and the heavy reliance on wireless communications between nodes will collectively invalidate some basic assumptions built into today’s network solutions. Increased needs for system dependability will accompany the use of EmNets for real-time monitoring and actuating, but existing software creation and verification techniques will not easily or automatically apply. Other EmNet requirements, such as the need for software upgradability and fault tolerance, will also require great improvements in the state of the art.

Other technological enablers for EmNets will be MEMS and better power sources. MEMS devices show great promise for real-world sensing (temperature, pressure, chemicals, acoustical levels, light and radiation, etc.). They also may become important for real-world actuation. EmNet nodes will be heterogeneous. Some will be as powerful as any server and will have more than sufficient power. But system nodes that are deployed into the real world will necessarily rely on very careful energy management for their power. Advances in power management will provide part of the solution; advances in the energy sources themselves will provide the other part. Improved batteries, better recharging techniques, fuel cells, microcombustion engines, and energy scavenging may all be important avenues.

Predicting the future of a field moving as rapidly as information technology is a very risky proposition. But within that field, certain trends are unmistakable: basic silicon scaling and the economics surrounding the semiconductor/microprocessor industry, power sources, and software. Some of these trends will seem almost inevitable, given the past 20 years of progress; others will require new work if they are not to impede the overall progress of this emerging technology.

REFERENCES

Borkar, S. 1999. “Design challenges of technology scaling.” IEEE Micro 19(4):23-29.
Brodersen, R.W., A.P. Chandrakasan, and S. Cheng. 1992. “Low-power CMOS digital design.” IEEE Journal of Solid-State Circuits 27(4):473-484.
Chew, W.C. 1990. Waves and Fields in Inhomogeneous Media. New York, N.Y.: Van Nostrand Reinhold.
Clark, L., et al. 2001. “A scalable performance 32b microprocessor.” IEEE International Solid-State Circuits Conference Digest of Technical Papers, February.

Computer Science and Technology Board (CSTB). 1997. The Evolution of Untethered Communication. Washington, D.C.: National Academy Press.
Feynman, Richard P. 1960. “There’s plenty of room at the bottom: An invitation to enter a new field of physics.” Engineering and Science. California Institute of Technology: American Physical Society, February.
Geppert, L., and T.S. Perry. 2000. “Transmeta’s magic show.” IEEE Spectrum 37(5).
Gonzalez, R., and M. Horowitz. 1996. “Energy dissipation in general purpose microprocessors.” IEEE Journal of Solid-State Circuits (September):1277-1284.
Hill, J., et al. 2000. “System architecture directions for networked sensors.” Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, Mass., November 12-15.
Lee, Edward A. 2000. “What’s ahead for embedded software?” IEEE Computer (September):18-26.
McCrady, D.D., L. Doyle, H. Forstrom, T. Dempsey, and M. Martorana. 2000. “Mobile ranging using low-accuracy clocks.” IEEE Transactions on MTT 48(6).
Parsons, David. 1992. The Mobile Radio Propagation Channel. New York: John Wiley & Sons.
Rappaport, T.S. 1996. Wireless Communications: Principles and Practice. Englewood Cliffs, N.J.: Prentice Hall.
Semiconductor Industry Association (SIA). 1999. Semi-Annual Report. San Jose, Calif.: SIA.
Sohrabi, K., J. Gao, V. Ailawadhi, and G. Pottie. 1999a. “Self-organizing sensor network.” Proceedings of the 37th Allerton Conference on Communications, Control, and Computing, Monticello, Ill., September.
Sohrabi, K., B. Manriquez, and G. Pottie. 1999b. “Near-ground wideband channel measurements.” Proceedings of the 49th Vehicular Technology Conference. New York: IEEE, pp. 571-574.
Sommerfeld, A. 1949. Partial Differential Equations in Physics. New York: Academic Press.
Trimmer, William. 1997. Micromechanics and Mems. New York: IEEE Press.
Van Trees, H. 1968. Detection, Estimation and Modulation Theory. New York: John Wiley & Sons.
Wait, J.R. 1998. “The ancient and modern history of EM ground-wave propagation.” IEEE Antennas and Propagation Magazine 40(5):7-24.
