Chapter 2

MICROELECTRONIC SYSTEM TRENDS AND PACKAGING NEEDS

Electronic systems needed in the next few years will require unprecedented packaging technology. In this chapter, the demands placed on the packages by anticipated chip technologies are discussed. The approach is to list requirements that, if met, will ensure that the inherent performance capabilities of the chips can be achieved and will not be degraded by the package.

Some of these requirements deal with the interfacing of individual chips, whereas others deal with the interconnecting of groups of chips. In today's technology, these two functions are most often fulfilled by first-level packages, such as dual in-line packages, and by second-level packaging, such as printed-circuit boards, respectively.

Packaging requirements for the mid-1990s are, in particular, of interest here. It is important to avoid any implicit assumption that the packages appropriate at that time can be categorized in the same way they are now. Indeed, there is already strong evidence that combining traditional packaging levels can lead to improved performance. For example, the IBM "thermal conduction" module eliminates one level of packaging by combining two levels into a single structure. In this chapter, those requirements that deal with interfacing individual chips are covered, as are those that deal with interconnecting several chips; the terms first-level and second-level packaging are not used.

For interfacing to a single chip, the following requirements are important:

· die attachment
· chip pinout
· pinout configuration
· heat removal
· signal rise time
· power lead inductance
· power supply current
· interline coupling
· protection from the environment
For connecting two or more chips, the following additional requirements are important:

· wiring configuration
· propagation delay
· signal rise time

Each of these requirements must be satisfied at acceptable cost and reliability, without adversely affecting the other requirements. It is assumed in this chapter that the signal-interconnection techniques do not employ multiplexing or optics, so there is exactly one electrical connection per signal (plus, perhaps, multiple ground and power pins).

It is possible to estimate many of these requirements rather accurately by extrapolating past characteristics of chips and systems, using scaling theory as a guide. The various types of chip scaling and the related theories are discussed in the next section. Scaling theory alone is not sufficient because different chips will be built using different architectural styles. An empirical relationship between pinout and circuit complexity, known as Rent's rule, can be used to characterize architectural style with sufficient accuracy for this discussion. Rent's rule and related items are also discussed in a later section.

Three principal system types have been identified, each of which appears to make different demands on packaging: low-end digital, high-end digital, and high-speed. Low-end digital systems typically use silicon MOS circuits packaged separately, with printed wiring boards (PWB) for chip interconnection. High-end digital systems typically use silicon bipolar technology, often with packages in modules that carry many chips. The advent of bipolar complementary MOS (BiCMOS) technology will blur the distinction between the two digital system types in the future, but for the purposes of this study it is assumed that BiCMOS will have interconnection requirements similar to CMOS and power requirements intermediate between CMOS and bipolar. High-speed circuits typically use gallium arsenide (GaAs) chips.
The assumptions about the chip technologies are presented in later sections in this chapter. The requirements listed above are qualified, where possible, for each of the system types.

SCALING THEORY

Scaling theory is important in understanding the driving forces that affect the trends of integrated circuit chips. The semiconductor industry learned, more than 20 years ago, that shrinking the photolithographic dimensions on the wafer and increasing the chip and wafer size increased the productivity of the semiconductor plant. The benefit to the user was lower cost per circuit, more functions per chip, and higher performance. The end result has been a quadrupling of the level of integration every 3 years. This section draws on scaling theory to permit a projection of the performance
trends of bipolar and MOS integrated circuit chips and of generic module and board configurations. The intent is to understand the evolutionary trends and try to determine the material properties that may limit or even dead-end those trends.

Lithography

Lithography is of fundamental importance in semiconductor fabrication, and, therefore, a look at lithography is needed to get a feel for the future direction of the semiconductor industry. A key parameter is the minimum feature size that a given lithographic technology is capable of patterning on a chip at a given point in time. Minimum feature size has a first-order bearing on both the circuit performance on a chip and the circuit density. To predict future packaging needs in the mid-1990s, it is important to have some feeling for what minimum feature size can be patterned in production in that time frame.

Bakoglu (1986) points out that between 1959 and 1983 the minimum feature size shrank at an average rate of 11 percent per year. Assuming that in 1988 the minimum feature size being patterned in production is about 1.0 µm, then by mid-1995, or seven years later, the minimum feature size will be about 0.5 µm. This assumes that the feature size will shrink about 10 percent per year. This is optimistic, since this dimension is near what is generally agreed to be the limit for optical lithography. It is an accepted fact that the rate of progress decreases as the limit of the technology is approached.

Optical lithography has been used since integrated circuits were first invented. The minimum feature size that optical lithography is capable of producing is limited by the wavelength of light used, and it therefore has a very fundamental limitation. Electron beam lithography can provide very small feature sizes with the use of proper photoresist material.
However, it is highly doubtful that electron beam systems will ever be used in a production environment because of their slow throughput caused by their limited bandwidth. It is projected (Kern et al., 1988) that optical lithography sources will be available for resolutions down to about 0.35 µm and will be extensively used into the 1990s. The one technology capable of providing a minimum feature size smaller than 0.35 µm in volume production is x-ray lithography. Whether an x-ray lithographic production system can be developed and installed by 1995 is dependent on many factors. Two major nontechnical factors are the need and the return on investment. Memory chips provide both a need, because of increased density requirements, and a better return on investment than logic chips, because of higher volume per part number, fewer part numbers, and increased yield. The increase in yield is a result of the fact that very small dirt particles are transparent to x-rays. If the very difficult technical problems associated with introducing x-ray lithography into a manufacturing operation can be overcome, then it is quite probable that it will be first used to fabricate memory chips, for these reasons.
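The feature-size extrapolation used in this section is a simple compound shrink, and it can be checked with a few lines of Python. This is only an illustrative sketch: the function name `projected_feature_size` is hypothetical, and the inputs (1.0 µm in 1988, a shrink of about 10 percent per year, seven years) are the assumptions stated in the text.

```python
def projected_feature_size(start_um, annual_shrink, years):
    """Feature size after compounding a fixed fractional shrink each year."""
    return start_um * (1.0 - annual_shrink) ** years


# Assumptions from the text: 1.0 um in 1988, about 10 percent shrink per year,
# projected seven years ahead to mid-1995.
size_1995 = projected_feature_size(1.0, 0.10, 7)
print(f"Projected minimum feature size: {size_1995:.2f} um")

# The historical rate of 11 percent per year gives a slightly smaller value.
size_hist = projected_feature_size(1.0, 0.11, 7)
print(f"At 11 percent per year: {size_hist:.2f} um")
```

Under the 10 percent assumption the result is just under 0.5 µm, which is the rounded figure used in the chapter.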
In view of the foregoing discussion, it will be assumed that during the mid-1990s the minimum feature size practiced in production volumes will be about 0.5 µm.

Scaling of MOSFETs

As photolithographic techniques have improved, it has been possible to reduce the minimum feature sizes on chips. However, power supply voltages tend to remain standardized for economic reasons. As a result, it is not too meaningful to perform a scaling analysis while holding the electric field constant in the device being scaled. Baccarani and coworkers (1984) and Dennard (1986) have developed the general scaling relationship shown in Table 2-1. In this analysis, $\alpha$ is the factor by which the dimensions are reduced and $\varepsilon/\alpha$ is the factor by which the applied voltage and threshold voltage are multiplied. The depletion regions are scaled down, along with the other dimensions, by multiplying the doping concentration within the scaled device by the factor $\varepsilon\alpha$.

Table 2-1 Generalized Scaling Relationships

Physical Parameters          Scaling Factors
Linear dimensions            $1/\alpha$
Electric field intensity     $\varepsilon$
Voltage (potential)          $\varepsilon/\alpha$
Impurity concentration       $\varepsilon\alpha$
Wiring current density       $\varepsilon^2\alpha$
Gate delay                   $1/(\varepsilon\alpha)$
Power/gate                   $\varepsilon^3/\alpha^2$

Source: Based on Baccarani et al., 1984; Dennard, 1986

When the power supply voltage is held fixed, then $\alpha = k\varepsilon$, where $k$ is a proportionality constant, and the gate delay scaling factor becomes $k/\alpha^2$. There are limits to how far generalized scaling can be extended, since, as $\varepsilon$ increases, gate-insulator failure increases and hot-carrier mechanisms produce long-term degradation. In addition, the current density in devices and metallization increases, which can lead to electromigration-type failures. Another problem that is aggravated as device dimensions shrink is the effect of alpha particles.
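The generalized scaling factors can be tabulated for any choice of the dimensional factor and the field factor. The following Python sketch is illustrative only: the function and key names are hypothetical, and it includes just the factors the text states or implies (voltage, doping, gate delay, and power per gate), written in terms of a dimensional factor `alpha` and a field factor `epsilon`.

```python
def generalized_scaling(alpha, epsilon):
    """Multiplicative factors applied to each parameter of the scaled device.

    alpha   -- factor by which linear dimensions are reduced (> 1 shrinks)
    epsilon -- factor by which the electric field is multiplied
    """
    return {
        "linear_dimensions": 1.0 / alpha,
        "electric_field": epsilon,
        "voltage": epsilon / alpha,
        "impurity_concentration": epsilon * alpha,
        "gate_delay": 1.0 / (epsilon * alpha),
        "power_per_gate": epsilon ** 3 / alpha ** 2,
    }


# Constant-field scaling (epsilon = 1) with dimensions halved:
# voltage and delay halve, and power per gate drops by a factor of 4.
constant_field = generalized_scaling(alpha=2.0, epsilon=1.0)

# Fixed supply voltage with k = 1 (so alpha = epsilon): delay improves as
# 1/alpha**2, but the field and the power per gate both rise.
fixed_voltage = generalized_scaling(alpha=2.0, epsilon=2.0)

for name, factors in (("constant field", constant_field),
                      ("fixed voltage", fixed_voltage)):
    print(name, factors)
```

The fixed-voltage case shows why the text warns of limits: the field doubles and the power per gate increases even though the gate is faster.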
Therefore, within limits, as the device dimensions on a chip shrink, the delay per gate decreases, as does the power per gate. Devices have been made and tested (Dennard, 1986) that confirm the scaling analysis. The experimental results show that a loaded NMOS NOR circuit constructed with a 1.0 micrometer channel length and again with a 0.5 micrometer channel length exhibited gate delays of approximately 1.0 and 0.5 nsec, respectively. The power per circuit decreased by about a factor of 4, to about 50 µW dissipation. In this experiment, the power supply voltage was scaled by about a factor of 2, from 2.5 to 1.2 volts, so as to keep the electric field in the device constant. These scaled NMOS NOR circuits were patterned with a vector-scan electron beam exposure system having a capability of producing 0.5 µm features with a standard deviation of ±0.05 µm in both feature size and level-to-level overlay. Circuits with channel lengths as short as 70 nm have been fabricated with five levels of electron-beam lithography overlaid, with an accuracy of better than 30 nm (Kern et al., 1988; Sai-Halasz et al., 1987). In a ring oscillator, these silicon field effect transistors (FETs) have a delay per stage of 13 psec.

In addition to the technical problems associated with shrinking the dimensions of NMOS devices, this technology has a serious competitor in the form of CMOS. It is highly probable that CMOS will be the dominant technology by the mid-1990s, with bipolars and GaAs relegated to specific applications where their unique properties are superior to those of CMOS.

Scaling of CMOS

The CMOS technology has a very important and positive characteristic because its circuits dissipate power essentially only during switching. When these circuits are in a quiescent state, they consume very little power. This is an important feature, since power that is not consumed does not need to be removed.
In addition, the copper required in the conductors supplying power to the back planes and modules is greatly reduced. The designer of a CMOS system must not, however, expect to operate very densely packaged chips at very high clock rates without due concern for heat dissipation. The nature of a CMOS gate is that its power dissipation increases in direct proportion to its clock rate, all other terms held constant. Noise generated by these gates increases at high clock rates to such an extent that they may not be sufficiently noise-tolerant in digital systems beyond clock rates of 75 to 100 MHz. It is doubtful that high-clock-rate problems can be corrected by either scaling or by operation at liquid nitrogen temperatures. (Appendix C gives projections on operating and structural parameters.)

CMOS had to overcome two major handicaps before it became a very acceptable technology: latch-up and a complex manufacturing process. Latch-up, caused by the current gain of parasitic lateral transistors, produces a high current path between the power supply and ground lines, a feature that destroyed early CMOS chips. This is no longer a problem. The CMOS manufacturing process approaches the complexity of the bipolar process. By slightly increasing the complexity of the CMOS process, bipolar transistors
can be fabricated on the same chip as CMOS devices. Bipolar transistors have a much higher transconductance, or di/dv (the change in output current per change in input voltage), and as a result take up less area on a chip for a given drive capability.

Baccarani and associates (1984) applied general scaling theory to a 0.25 µm NMOS FET and calculated that a device with a fan-out of 3 would have a gate delay of about 200 psec, with a power dissipation of 50 µW at a power supply voltage of 1.0 volt. In relating their results to CMOS, they state, "Due to the lower hole mobility, and to the larger sheet resistance of p+ shallow junctions, however, quantitatively different results are obtained in this case, leading to somewhat modified design tradeoffs." They are saying that the design of the n-channel FET in the CMOS circuit must be optimized differently than the p-channel FET. The lower hole mobility will adversely affect performance of the CMOS circuit.

Boudon and associates (1988) describe a 20K two-way NAND equivalent CMOS gate array prototype with 0.5 µm channel length FETs. The 7.5- x 7.5-mm chip is designed for high performance, with 245 psec gate delay with a fan-out of 3. A 32-bit reduced-instruction-set computer (RISC) processor, with a 16- x 16-bit multiplier implemented on the chip, has been measured at 17 nsec cycle time with a 3.4-volt power supply. This experimental result is 1.6 times faster than the same implementation with a CMOS 0.9 µm gate.

Cong and associates (1988) describe a low-power CMOS dual-modulus (divide by 128/129) prescaler integrated circuit. They point out that the prescaler has been traditionally implemented in GaAs or bipolar technologies. The best prescaler, fabricated with 17.5 nm gate oxide, functions at 2.06 GHz with 25 mW power consumption. The channel length is 0.5 µm, and the circuit operates on a supply voltage of 3.5 volts with a ring oscillator (unloaded) delay of 110 psec.
When CMOS devices are designed for low-temperature operation, at liquid nitrogen temperatures of 77 K, the circuit speed is enhanced by a factor of two. The reasons include decreased leakage, increased carrier mobility, sharper subthreshold turn-off transition, lower interconnect line resistance, and improved reliability (Sai-Halasz et al., 1987). In addition, latch-up effects are greatly reduced at low temperatures because of lower bipolar gains. As device dimensions decrease, the benefits of operating at 77 K become more attractive.

Scaling of Bipolars

The bipolar transistor has had a long history of development. During the period when it was produced as a discrete device, two techniques were invented to improve its performance by keeping it out of saturation. They were the Schottky clamp circuit, invented in 1953, and the emitter-coupled logic (ECL) circuit family, invented in 1956. Since the invention of the integrated circuit, many improvements have been made to the bipolar device structure. In parallel with these improvements have been improvements in photolithography that have reduced the size of the device, with an attendant increase in performance.
A few of the important structural improvements made to bipolar planar devices are self-aligned base contact, deep-trench isolation, and a polysilicon emitter contact. Both the self-aligned structure and the trench isolation greatly reduce the device area and the associated parasitic capacitance, and hence significantly reduce the power-delay product and increase the density of bipolar circuits (Ning and Tang, 1986). Experimental evidence has overwhelmingly shown that polysilicon emitter contacts make it possible to vertically scale bipolar transistors and improve circuit performance without unacceptable degradation in current gain.

Ning and Tang (1986) state, "The trend in bipolar device technology is then to develop the version or versions of self-aligned structure, deep-trench isolation, and polysilicon emitter contact that are manufacturable and applicable to both high-speed as well as high-density applications....The central idea is to reduce the horizontal and vertical dimensions in a coordinated manner so that all the key delay components are reduced approximately proportional in scaling." The scaling rules for ECL circuits are shown in Table 2-2. The projected delay as a function of the switch current of the scaled circuit is shown in Figure 2-1. Reduction in gate delay can be expected as chip power is increased in projected future circuits.

Table 2-2 ECL Scaling Rules

Parameter                           Rule*
Base width, $W_b$                   $a^{0.8}$
Base doping level, $N_b$            $W_b^{-2}$
Collector current density, $J_c$    $a^{-2}$
Collector doping level, $N_c$       $J_c$
Circuit delay                       $a$

*$a$ = minimum feature dimension and emitter-stripe width

Source: After Ning and Tang, 1986

It can be seen from Figure 2-1 that the maximum benefit in performance, from scaling, is obtained when the current, and hence the power, is held fixed. Naturally, it is possible to reduce the current as the emitter width is reduced and accept a smaller improvement in performance.
This approach has generally been resisted by the ECL enthusiasts, since they constantly strive for improved performance. As a result, ECL-based systems consume quite a bit
of power (2 to 5 mW per circuit), which must be supplied and removed. A further complication is that the power supply voltage is in the range of 2 volts, which means that a system with a power requirement of 5000 W will require a current of 2500 A. This magnitude of current requires a copper conductor of very large cross section between the power supply and the circuit modules or board.

GALLIUM ARSENIDE TECHNOLOGY

An additional category of digital components, which has emerged from the laboratory and entered general use during the 1980s, is that of the extremely high-speed devices. Such devices, because of the lower level of integration at this time, typically exhibit a smaller number of signal pins than, for example, CMOS chips, but in a few years, they can be expected to have the same pinout needs as today's slower-speed silicon counterparts. Logic gate delays in these ultrafast chips of as little as 10 to 100 psec make it possible to design signal processors that are already achieving clock rates as great as 2 GHz. Furthermore, gallium arsenide (GaAs) digital integrated circuits have been demonstrated in the laboratory that perform useful functions at even higher clock rates, of up to 25 GHz. Present GaAs chips of 1000 gate complexity, capable of 6 GHz, are available in experimental quantities (in 1989); 10-GHz digital chips will be available by 1990 to 1991, and 20- to 25-GHz chips are expected to be readily available by 1995.

[Figure 2-1: plot of gate delay versus switch current (mA), with a second axis giving gates/chip at P = 2 watts and curves for emitter stripe widths from 2.5 µm to 0.25 µm.]

FIGURE 2-1 Gate delay for a 2 watt chip as a function of switch current. The dimensions refer to emitter stripe width. (After Ning and Tang, 1986)
Very-high-frequency digital integrated circuits are employed in a wide range of equipment, including supercomputers, telecommunications transmission equipment, communications satellites, radar, video image processing, military electronic countermeasures, image processing, and air traffic control displays. Silicon ECL components have been used traditionally, and GaAs components are now being introduced. Clock rates have been increasing steadily and are now at about 0.5 GHz. The fast digital portions of these circuits may be small enough to fit onto a multichip module, which will be the heart of the system.

Very-high-speed electronics require attention to electromagnetic issues that are often unfamiliar to digital systems designers. Fast rise-time devices radiate electrical energy in that portion of the electromagnetic spectrum traditionally reserved for analog microwave communication channels and radar systems. Bandwidths must be preserved as the signals propagate through the packaging and interconnect structures, if the robustness and noise immunity of the processors are to be maintained. Despite these problems for interconnect designers, these enabling technologies will be pursued vigorously during the next decade, because the higher system clock rates can lead to signal processing rates that are one or two orders of magnitude greater than those currently available.

RENT'S RULE

About 1960, Edward Rent, working at IBM, observed a relationship between the complexity of a logic circuit (expressed, for example, by the number of gates in it) and the number of signal wires (pins) connecting to it. [Rent himself never published an account of the rule that bears his name, but two early references describe the relationship (Logue, 1966; Landman and Russo, 1971).] In its simplest form, it is

$N_p = K_p N_g^{\beta}$

where $N_p$ is the number of pins, $N_g$ is the number of gates, and $K_p$ and $\beta$ are constants.
The relationship has been applied to a variety of systems, including digital computer systems, integrated circuits, random logic, and even animal eyes and brains. Rent's rule is used here to predict the pinout of future integrated circuits and the interchip wiring complexity of highly parallel computer architectures of the future. Two empirical constants, $K_p$ and $\beta$, appear in Rent's rule. Of these, $\beta$ is the more critical. In a two-dimensional world, such as inside an integrated circuit or on a printed-circuit board, the rule is qualitatively different, depending on whether $\beta$ is above or below 0.5. If $\beta$ is greater than 0.5, then, as more and more complexity is added to a circuit, the circuit becomes harder and harder to wire. To appreciate this, consider a chip on which the perimeter is used for bonding pads, and suppose that all the space on the (one-dimensional) perimeter is used for these pads. If the size of the circuit quadruples (e.g., by making each dimension of the chip twice as
large), the required number of pins more than doubles, yet the perimeter only doubles, and as a result not all the required pins will fit in the available space. A similar argument applies to wiring within the chip if the subcircuits on the chip themselves obey Rent's rule with the same exponent. Values of $\beta$ below 0.5 do not pose such difficulties. Incidentally, in a three-dimensional setting, the critical exponent is 2/3 rather than 1/2, because this is the exponent governing the ratio of surface to volume.

Recent examinations of Rent's rule (Bakoglu, 1986; Ferry, 1985) have focused on the critical nature of the exponent, and it has been observed that different styles of system architecture or different types of systems seem to be characterized by different exponents. The values reported by Bakoglu (1986) are 0.63 for chip- and module-level design of high-speed computers (in agreement with Rent's original value), 0.5 for gate arrays, 0.45 for microprocessors, 0.25 for board- and system-level computers, and 0.12 for memories. The value reported by Ferry (1985) is 0.21 for a mix of logic, microprocessors, and memory. Bakoglu's value for microprocessors appears to be heavily biased by a single early example and two RISC chips; without them, the value is less than 0.2.

Rent's rule is empirical, and empirical observations invite fundamental explanations. It may be that an exponent of 2/3 can be explained by the surface-to-volume ratio of a design produced by evolution that is truly three-dimensional, such as animal brains. It is also obvious that memories should have a low exponent, since address coding permits the number of address pins to be a logarithmic function of the size of the memory. For the other types of systems, however, fundamental explanations seem less satisfactory. However, one fundamental distinction does seem appropriate.
Ferry (1985) attributes to McGroddy and Solomon (1982) the distinction between highly partitioned and functionally partitioned circuits. The former are defined as those for which chip or module boundaries do not tend to coincide with system or subsystem boundaries. Gate arrays, random TTL logic, and indeed designs where many components are required for a system, are like this. Functionally partitioned circuits, on the other hand, are defined as those in which the chip or module boundaries do coincide with system or subsystem boundaries. Microprocessors are like this. The definition of what is a subsystem is a human one, based on the partitioning of a total system for easier human understanding. Human understanding is more likely to occur when the subsystems do not have complex interactions but instead interface with minimal information interchange.

The following values of the constants appear to characterize different types of chips and systems:

· Memory chips, $K_p \approx 6$, $\beta \approx 0.12$
· Functionally partitioned chips, $K_p \approx 10$, $\beta \approx 0.2$
· Modules and boards, $K_p \approx 82$, $\beta \approx 0.25$
· Highly partitioned chips, $K_p \approx 2$, $\beta \approx 0.5$
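Rent's rule and the perimeter argument can be explored numerically. The sketch below assumes the rule takes the usual power-law form, pins = K_p × gates^β, and uses the approximate constants listed above; the function names, dictionary keys, and the sample gate counts are hypothetical and chosen only for illustration.

```python
# Approximate (K_p, beta) pairs for the system classes listed in the text.
RENT_CONSTANTS = {
    "memory": (6.0, 0.12),
    "functionally_partitioned": (10.0, 0.2),
    "modules_and_boards": (82.0, 0.25),
    "highly_partitioned": (2.0, 0.5),
}


def rent_pins(gates, k_p, beta):
    """Estimated signal pin count from Rent's rule: K_p * gates**beta."""
    return k_p * gates ** beta


def pin_growth(size_factor, beta):
    """Factor by which the pin demand grows when the gate count grows."""
    return size_factor ** beta


# Pin estimates for an illustrative 10,000-gate design in each class.
for name, (k_p, beta) in RENT_CONSTANTS.items():
    print(f"{name}: about {rent_pins(10_000, k_p, beta):.0f} pins")

# Perimeter argument: quadrupling the gate count doubles the chip perimeter.
# Pin demand outgrows the perimeter only when beta exceeds 0.5.
for beta in (0.45, 0.5, 0.63):
    print(f"beta = {beta}: pin demand grows by {pin_growth(4, beta):.2f}x, "
          f"perimeter by 2.00x")
```

With an exponent of 0.5 the pin demand exactly tracks the perimeter, which is why that value separates the two regimes in a two-dimensional layout.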
These values are generally consistent with the data presented by Ferry (1985), although they differ from the numbers given by Bakoglu (1986). It is likely that the data for gate arrays (highly partitioned) are based on the fact that, in most present packaging schemes, signal pins are located on the periphery of chips. As a result, a natural evolution from one gate array to the next, keeping design style and design tools similar, will necessarily scale the pinout as the square root of the number of gates. Thus, the pinout of highly partitioned chips may in fact be limited more by the interconnect technology available than by the inherent needs of the logic, and therefore, in designing packages for the future, perhaps higher exponents might be appropriate.

Other types of chips can be categorized according to whether they are highly partitioned or functionally partitioned. For example, systolic arrays and some signal-processing chips may be functionally partitioned, whereas the "glue logic" that seems to surround microprocessors in many systems is probably highly partitioned.

CHIP TECHNOLOGIES

The committee's assumptions about chip technology in use in systems in the mid-1990s are summarized in Table 2-3. These data were supplied by Donald R. Franck of the Empire Planning Group (personal communication to the committee, October 1988), except as follows: The linewidth estimates are justified earlier, and the inter-latch delays are an assumed logic depth (20 for MOS, 15 for bipolar, and 10 for GaAs) times the gate delay, plus an estimate of on-chip wiring delay. This estimate is the "Elmore time constant" of an aluminum wire 4 mm long, of the linewidth cited and 0.3 µm thick, over and under 0.5 µm thick oxide insulators; for GaAs, silicon nitride insulator above and insulating GaAs below (Elmore, 1947; Rubenstein et al., 1983). This wire has a resistance of 700 ohms and a capacitance of 0.4 pF, for an "Elmore time" of 140 psec (1 pF and 350 psec for GaAs).
Even the relatively long length of 4 mm assumed here will require restraint on the part of circuit designers, since chip sizes are expected to be up to 3 cm on a side in 1994, and thus circuit designers will have to use careful placement of combinational logic blocks and perhaps buffers for long signal paths. The clock frequency calculated assumes that during half a clock cycle a signal must settle and the settling time should be at least 1.5 times the inter-latch delay. In other words, the clock period is three times the inter-latch delay. The power supply current is the power divided by the assumed supply voltage, and the power per gate is calculated from the chip power and gate count.
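The derived timing figures can be reproduced from the stated assumptions. The following Python sketch is illustrative: the function names and the `TECHNOLOGIES` dictionary are hypothetical, the Elmore time of a uniformly distributed RC wire is taken as RC/2 (which matches the 140 and 350 psec quoted above), and the gate delays of 200, 40, and 50 psec are the Table 2-3 values.

```python
def elmore_time_ps(resistance_ohm, capacitance_pf):
    """Elmore time of a uniformly distributed RC wire, R*C/2 (ohm * pF = psec)."""
    return 0.5 * resistance_ohm * capacitance_pf


def inter_latch_delay_ps(logic_depth, gate_delay_ps, wire_delay_ps):
    """Assumed logic depth times the gate delay, plus on-chip wiring delay."""
    return logic_depth * gate_delay_ps + wire_delay_ps


def clock_frequency_mhz(inter_latch_ps):
    """Clock frequency when the clock period is three inter-latch delays."""
    period_ps = 3.0 * inter_latch_ps
    return 1.0e6 / period_ps  # a period of 1e6 psec corresponds to 1 MHz


# Assumed inputs: logic depth and wire R, C from the text; gate delays from
# Table 2-3. The same 700-ohm wire resistance is assumed for GaAs.
TECHNOLOGIES = {
    "CMOS": {"depth": 20, "gate_ps": 200.0, "r_ohm": 700.0, "c_pf": 0.4},
    "Bipolar": {"depth": 15, "gate_ps": 40.0, "r_ohm": 700.0, "c_pf": 0.4},
    "GaAs": {"depth": 10, "gate_ps": 50.0, "r_ohm": 700.0, "c_pf": 1.0},
}

for name, t in TECHNOLOGIES.items():
    wire_ps = elmore_time_ps(t["r_ohm"], t["c_pf"])
    delay_ps = inter_latch_delay_ps(t["depth"], t["gate_ps"], wire_ps)
    print(f"{name}: wire {wire_ps:.0f} psec, "
          f"inter-latch {delay_ps / 1000.0:.2f} nsec, "
          f"clock about {clock_frequency_mhz(delay_ps):.0f} MHz by the 3x rule")

# Supply current and power per gate for the CMOS case (20 W, 3 V, 400,000 gates).
print(f"CMOS supply current: {20.0 / 3.0:.1f} A")
print(f"CMOS power per gate: {20.0 / 400_000 * 1e6:.0f} uW")
```

For the CMOS and bipolar cases the three-times rule gives clock rates near 80 and 450 MHz.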
Table 2-3 Mid-1990s Integrated Circuit Chip Technologies

Chip Interface                CMOS              Bipolar           GaAs
Linewidth (µm)                0.5               0.5               0.5
Gate count                    400,000           20,000            100,000
Power and signal pinout       600               600               300
Pinout configuration          Two dimensions    Two dimensions    One dimension
Device gate delay (psec)      200               40                50
Inter-latch delay (nsec)      4.1               0.74              0.85
Clock frequency (MHz)         80                450               250
Power supply voltage (V)      3                 1.3               2
Power (W)                     20                40                20
Current (A)                   6.7               33                10
Power per gate (µW)           50                2000              200

Source: Based on data from D. R. Franck (personal communication to the committee, October 1988) and some prepared by the committee.

The "pinout configuration" entry in Table 2-3 requires an explanation. Consider the problem of providing pins for integrated circuits, which are planar (two-dimensional). The "boundary" of a two-dimensional region is one-dimensional, in this case the perimeter of the chip. Modest pinout (say up to 300) can be satisfied by one row (or two) of pads at the perimeter of the chip, and, in fact, most chips fabricated today use perimeter bonding pads. However, without revolutionary reductions in pad size, the larger pinout that will be needed in 1994 cannot be satisfied with perimeter pads, so the two-dimensional chip area must be used. Indeed, this technology is in some use even today. Thus, the demand for more pinout must be satisfied by "escaping" to a higher dimension.

If the full performance of the chips, as summarized in Table 2-3, is to be realized in a system, the packaging requirements listed in Table 2-4 are necessary. Any deviations from meeting these specifications will force compromises on chip and system designers and will, therefore, mean that system performance is limited more by the packaging than by the chips.
Chip Interface Packaging

For interfaces to the chip not to inhibit the chip performance described earlier, the following packaging requirements apply (see Table 2-4). The package pinout must, of course, equal the chip pinout. The package pinout configuration cannot be accommodated using the perimeter of the package for the same reasons that this will not be possible for chips. The chip power cited is from D. R. Franck (personal communication to the committee, October 1988). The chip drivers and receivers, together with the package signal lead inductance, must be capable of responding in the inter-latch delay cited above, that is, in about a third of the clock cycle. The power (and ground) lead inductance is calculated by requiring that the L di/dt voltage dropped across the pins not exceed 0.1 times the supply voltage, when as much as 50 percent of the current for MOS, 5 percent of the current for bipolar, or 10 percent of the current for GaAs is switched in a time equal to the signal rise time. Clearly, this requirement cannot be met unless multiple pins are used for both ground and supply voltage. This requirement can, however, be relaxed if a multiphase clock or on-chip voltage regulation is used.

Table 2-4 Mid-1990s Chip Interface Technology

Chip Interface                  Low-End Digital   High-End Digital  High-Speed
Chip pinout                     600               600               300
Package pinout configuration    Two dimensions    Two dimensions    One dimension
Heat removal per chip (W)       20                40                20
Signal rise time (nsec)         4.1               0.74              0.85
Power lead inductance (nH)      0.4               0.07              0.17
DC power supply current (A)     6.7               33                10
Environmental protection        Essential         Essential         Essential

Chip Interconnection Packaging

If the signal propagation delay from chip to chip and the signal rise time for interchip communication at least match the inter-latch delay for the chips, then signals can be transmitted from one chip to another during a single clock cycle, and the packaging will not substantially degrade system performance.
The requirements are given in Table 2-5.
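The power lead inductance row of Table 2-4 can be retraced from the L di/dt criterion stated in the chip interface discussion. The sketch below is a minimal reading of that criterion, using the supply voltages, currents, switched fractions, and rise times quoted in the text and tables. The function names and the 5 nH per-pin figure in the usage note are hypothetical and are not taken from the report.

```python
import math

# Power-lead inductance budget from the L di/dt criterion in the text:
# the drop across the power leads must not exceed 0.1 * V_supply when a
# fraction f of the DC supply current switches in one signal rise time.


def max_lead_inductance_nh(supply_v, current_a, switched_fraction,
                           rise_time_ns, allowed_drop_fraction=0.1):
    """Largest total power-lead inductance (nH) that meets the criterion.

    L * (f * I / t_r) <= allowed_drop_fraction * V
    Volts * nanoseconds / amperes yields nanohenries directly.
    """
    delta_i = switched_fraction * current_a
    return allowed_drop_fraction * supply_v * rise_time_ns / delta_i


def pins_needed(total_limit_nh, per_pin_nh):
    """Parallel pins needed to reach the limit, ASSUMING identical pins
    with no mutual coupling (an optimistic simplification)."""
    return math.ceil(per_pin_nh / total_limit_nh)


CASES = {
    # name:     (V, I in A, switched fraction, rise time in ns)
    "CMOS":    (3.0, 6.7, 0.50, 4.1),
    "Bipolar": (1.3, 33.0, 0.05, 0.74),
    "GaAs":    (2.0, 10.0, 0.10, 0.85),
}

if __name__ == "__main__":
    for name, args in CASES.items():
        print(f"{name:8s} L_max = {max_lead_inductance_nh(*args):.2f} nH")
```

The computed limits come out close to the tabulated 0.4, 0.07, and 0.17 nH. As a usage example, calling `pins_needed(max_lead_inductance_nh(3.0, 6.7, 0.50, 4.1), 5.0)` with an assumed 5 nH per pin illustrates why the text concludes that multiple supply and ground pins are unavoidable.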
Table 2-5 Mid-1990s Chip Interconnection Technology

Chip Interconnection        Low-End Digital   High-End Digital   High-Speed
Wiring configuration        Two dimensions    Three dimensions   Two dimensions
Propagation delay (nsec)    4.1               0.74               0.85
Signal rise time (nsec)     4.1               0.74               0.85

The key requirement for interchip packaging is the ability to have a large number of interconnect wires between and among chips. This is necessary whenever an overall system is too complex to be put on a single chip. Generally, total systems have a relatively small number of signal pins, because systems with complex interfaces are difficult to understand and systems are, after all, defined by humans who must understand their input and output behavior. The need for complex interconnections arises when the limitations of chip technology force a system to be implemented on more than one chip. Systems are conceived in all sizes, and, therefore, it is difficult to be quantitative in the general sense about the interconnect needs. For this reason, no estimates are given regarding pinout of modules that perform chip interconnection. Rent's rule for highly partitioned chips and modules is probably valid for systems, both high-end and low-end, that are sufficiently complex so that many chips are necessary.

The entry "wiring configuration" in Table 2-5 requires further discussion. Today, the most common interchip wiring is done on printed wiring boards (PWBs) in which a very small number of two-dimensional routing surfaces are used. This works well only for limited chip pinout and limited board pinout. It works best for chips with perimeter bonding, or whose first-level packaging provides perimeter connections, because of the difficulty of using essentially a two-dimensional scheme to connect to a two-dimensional pinout array, given the normal wire size, the spacing needed to reduce crosstalk and adjacent-conductor shorts, and the pad or connector size.
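Rent's rule, invoked above, can be turned into a small bookkeeping sketch for groups of identical chips. It assumes the common power-law form, in which a group of n chips with P pins each needs about P * n^p external connections for a Rent exponent p, and it assumes every internal connection is a point-to-point path using one pin on each of two chips. Both the function names and the point-to-point simplification are illustrative assumptions.

```python
# Rent's-rule bookkeeping for a group of identical chips.
# ASSUMPTIONS: external pins scale as P * n**p, and each chip-to-chip
# path consumes exactly one pin at each end (no multi-drop nets).


def external_pins(pins_per_chip, n_chips, rent_exponent):
    """Connections the group as a whole needs to the rest of the system."""
    return pins_per_chip * n_chips ** rent_exponent


def interchip_paths(pins_per_chip, n_chips, rent_exponent):
    """Point-to-point signal paths that stay inside the group."""
    total_pins = pins_per_chip * n_chips
    internal_pins = total_pins - external_pins(pins_per_chip, n_chips,
                                               rent_exponent)
    return internal_pins / 2


if __name__ == "__main__":
    pins, chips, exponent = 600, 2, 0.5
    print(f"external wires : {external_pins(pins, chips, exponent):.0f}")
    print(f"interchip paths: {interchip_paths(pins, chips, exponent):.0f}")
```

For two 600-pin chips and an exponent of 0.5 this gives roughly 850 external wires and about 175 paths between the chips. Nets with more than two connections would raise the path count somewhat.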
Chip pinout for 1994 will require, for systems with several chips, a correspondingly high number of connections between chips. For example, consider two of the 1994 chips in such a system, each with a pinout of 600. The partition of functionality between the two chips might be "highly partitioned" in the sense used earlier, with a Rent's rule exponent of 0.5. In that case, the two chips taken as a unit would, between them, need about 850 wires (600 times the square root of 2) to connect with the rest of the system. The remaining 350 of the 1200 pins on the two chips might go between these chips, implying the need for 175 signal paths between these adjacent chips. (This number might be increased slightly because some electrical nodes have multiple connections.) The required interconnect density, although difficult to quantify in general because of the varied size of systems and the degree of partitioning
necessary, is clearly beyond the capabilities of today's PWB technology and also will be difficult to satisfy with advanced multichip modules. It is believed that, for high-end systems with many chips, the wiring congestion can be overcome only by using a three-dimensional interconnect structure. By this is meant a structure in which the wiring density in the third dimension is comparable to that in the other two dimensions. This kind of structure actually is not as far-fetched as it sounds; the IBM thermal conduction module and today's best PWBs have horizontal and vertical pitches for horizontal wires, and horizontal pitches for vertical wires, that are within a factor of five. In the case of the thermal conduction module, the horizontal spacing between wires in one plane is 5 mils, the distance between planes is 10 mils (or 20 mils if a ground plane lies between for shielding), and the pitch of vias is 25 mils.

In contrast to the moderate clock rates described throughout this report, packaging intended for the fastest clock rate devices (both silicon and GaAs) must assume that the interchip signal connections will behave as transmission lines. It is difficult to overstate the impact on chip interconnect caused by the need for a transmission line environment on the substrate, which, as a rule of thumb, arises whenever the off-chip signals exhibit risetimes of 2 nsec or less. The risetimes of typical silicon ECL and GaAs components are all less than 1 nsec at present. It should be noted that the fastest risetimes are currently in the 200-psec range, with 100 psec achieved on a small subset of the very fastest GaAs devices intended for communications applications. Although even 30-psec risetimes have been demonstrated, these ultrashort risetimes will not be necessary until the mid-1990s, when clock rates exceeding 10 GHz are employed in communications and radar processors.
Driven by the operating parameters of high-clock-rate systems described above, all of the parameters of concern for silicon CMOS chips assume even greater importance for the fastest devices. The use of single-chip surface-mounted packages is already giving way to the method of placing bare chips nearly side by side on very dense metal-on-organic dielectric structures (e.g., copper-on-polyimide or the equivalent). The ability to fabricate very uniform transmission line structures on these dense "chip-on-board" substrates, with low DC and AC resistive losses in the lines, will be important. To minimize the high-frequency crosstalk between densely packed interconnect lines, very low values of dielectric constant (ε_r about 2.0) will be required for the materials that separate the signal planes from their ground reference or shield planes. Such low ε_r values not only increase wavefront propagation velocities, but also allow ground planes to have minimum separation from the signal planes for a given line impedance, thereby decreasing interline coupling effects. The interplane dielectrics must also not be lossy and must not become lossy at higher frequencies because of adsorbed or chemically bound water. For interconnection between the chips and the substrates, the frequency limits of wire bonding must be better assessed, and current TAB technology must be extended (e.g., with "flip-TAB" or modified "flip-chip" techniques) to provide improved high-frequency transmission line behavior and shorter total lengths of the TAB structures. Finally, the
ability to provide integral high-frequency local power plane decoupling adjacent to the active chips must be achieved in a more cost-effective manner than is done now.

The flip-chip technique provides an excellent ultra-high-frequency connection between chip and substrate. The technique permits the chip to be removed a limited number of times. However, providing signal integrity, with demountability between the MCM and the back panel, becomes increasingly difficult as signal risetimes approach 100 psec. When innovative approaches (e.g., fuzz buttons and elastomeric materials) are considered for solving connector problems of high-performance electronic systems, materials issues must be considered. (See Appendix D for two innovative approaches.)

SOME PACKAGE DESIGN CONSIDERATIONS

The single-chip and multichip modules are described in this section to point out the techniques used to handle the interconnects and the problems encountered in each type.

Single-Chip Modules

Single-chip modules (SCMs) can be divided into two categories: the first is the surface-mount module, and the second is the pin-through-hole module. The surface-mount package, of necessity, requires that the leads come out around the perimeter of the package. Consequently, the lead pitch must become finer as the number of leads supplying signals and power to the package increases with the number of circuits placed on the chip. To support 400 to 700 signal and power connections to a surface-mount package in the 1994 time frame would require about a 12-mil lead pitch. The leads would have to be staggered, since the card or board to which they are soldered or otherwise connected must have more than one plane of connections for signal and power. It is doubtful that vias that make connections from the surface to internal planes can be placed on 12-mil centers.
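To give a feel for what a 12-mil perimeter lead pitch implies, the sketch below estimates the side length of a surface-mount package for a given lead count. It assumes a square package with leads spread evenly over all four sides and ignores corner keep-outs; these simplifications and the function names are illustrative and are not taken from the report.

```python
import math

MILS_PER_INCH = 1000


def package_side_in(n_leads, pitch_mils):
    """Side length (inches) of a square package whose perimeter leads
    sit at the given pitch. ASSUMES four equally loaded sides and
    ignores corner clearance."""
    leads_per_side = math.ceil(n_leads / 4)
    return leads_per_side * pitch_mils / MILS_PER_INCH


def required_pitch_mils(n_leads, side_in):
    """Lead pitch (mils) needed to fit n_leads on a square package of
    the given side length, under the same assumptions."""
    leads_per_side = math.ceil(n_leads / 4)
    return side_in * MILS_PER_INCH / leads_per_side


if __name__ == "__main__":
    for leads in (400, 700):
        side = package_side_in(leads, 12)
        print(f"{leads} leads at 12-mil pitch: about {side:.1f} in per side")
```

Under these assumptions, 400 to 700 leads at a 12-mil pitch correspond to package sides of roughly 1.2 to 2.1 inches, which suggests why a coarser pitch is impractical for perimeter leads at this pinout.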
The problem is solved by staggering the length of the leads exiting from the surface-mount module. Handling 400 to 700 very fine leads that have a pitch of 12 mils is a difficult problem in production, as is the problem of removing 15 to 20 W of heat from the chip in this package.

The pin-through-hole single-chip module has the pins in an area array. The pins are inserted into holes in the card or board or the next level of package. It is conceivable that the single-chip module need not have pins, but it could have contacts that mate with contacts on the next level of package. In any case, it is necessary to have vias under the module that permit connection to the internal wiring planes in the next level of package. If pins do not need to be inserted in holes, then these holes can be smaller in diameter than holes that must contain pins. In any case, there is wiring congestion under the module. If, as is usually the case, there is a limit to the number of lines that can pass between two vias, then more wiring planes
are needed in the next package level to overcome the wiring congestion under the module.

Multichip Modules

The advantage of a multichip module (MCM) is greater density at both the module level and the system level, and with the greater density comes improved performance from the shorter transmission paths. This improved performance is attained in present MCMs even though the dielectric constant of the ceramic is approximately nine. Since the velocity of propagation is proportional to ε_r^(-1/2), the velocity of signals propagated on lines within the ceramic is one-third of the velocity of light. Future MCMs will require insulating materials whose dielectric constants are as close to 1 as possible. Naturally, the insulating materials used must be patternable; that is, it must be possible to make via holes in the material that can be filled with a conductor to provide a connection path from one level of conductor to the next. The insulator must permit a conductor to be attached to it by some means, such as evaporation, sputtering, or vapor deposition. The process of laying down the insulator and patterning should be a dry process, because dry processes cause fewer ecological problems and provide high resolution in a patterning process.

The chips, mounted on the MCM, must be attached in such a way as to provide electrical connection to the 600 signal and power supply connections that must be made to each chip. Since each chip may dissipate as much as 50 W of power, large thermal stresses can be set up in the connections made to the chip. When the system containing these MCMs is cycled on and off, the connections between the chip and the module are stressed cyclically and can and do fail from fatigue. This requires that the thermal expansion coefficient of the substrate material of the MCM match that of the silicon chip.
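The dependence of signal velocity on dielectric constant noted above can be made concrete with a few lines of arithmetic. The sketch assumes an ideal, non-magnetic, homogeneous dielectric surrounding the line, so that velocity is c divided by the square root of the relative dielectric constant; real substrates with mixed dielectrics will differ. The function names are illustrative.

```python
import math

C_CM_PER_NS = 29.9792458  # speed of light in vacuum, in cm per nsec


def velocity_fraction(eps_r):
    """Signal velocity as a fraction of c, ASSUMING a homogeneous,
    non-magnetic dielectric (v = c / sqrt(eps_r))."""
    return 1.0 / math.sqrt(eps_r)


def delay_ps_per_cm(eps_r):
    """Propagation delay per centimeter of line, in picoseconds."""
    return 1000.0 * math.sqrt(eps_r) / C_CM_PER_NS


if __name__ == "__main__":
    for label, eps_r in (("ceramic", 9.0), ("low-k organic", 2.0), ("air", 1.0)):
        print(f"{label:14s} eps_r = {eps_r:3.1f}  "
              f"v = {velocity_fraction(eps_r):.2f} c  "
              f"delay = {delay_ps_per_cm(eps_r):.0f} psec/cm")
```

For a dielectric constant of nine this reproduces the one-third-of-light-speed figure, or about 100 psec per centimeter, which is a useful yardstick against the sub-nanosecond inter-latch delays in Table 2-3.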
The problem is complicated by the fact that the substrate material dissipates much less I²R power than does the chip, which results in a short-term transient thermal mismatch between the chips and the MCM substrate. The use of Peltier-junction cooling close to the chip is worthy of consideration here. In addition, it should be noted that the thermal mismatch problem is not peculiar to MCMs, but is equally important with SCMs.

The mismatch in thermal expansion coefficient between the MCM substrate material and the chip material is aggravated when the distance between the farthest connections to the chip increases. Since there are 600 such connections, there will be an array of connections with about 25 connections on a side. If the chips are 1 cm on a side, then the array of connections must exist within the chip footprint, or within about 0.8 cm, which places connections on 0.32-mm (12.6-mil) centers. If the via holes in the insulating material, which contain the metal that connects to the wiring plane below, are 5 mils in diameter, then the distance between the via walls is 7.6 mils with no tolerance. If there is an allowed tolerance or a guard band of 1 mil around each via, only 5.6 mils are left in which to place lines that conduct signals within the wiring plane. Clearly, something must be done to decrease the wiring congestion immediately under the chip. Today, this is done by the
addition of more wiring planes to redistribute the congestion to lower wiring levels (Blodgett, 1983).

As more ECL circuits are crowded onto each chip, and since the system designers are reluctant for performance reasons to reduce the power needed per circuit, the power that must be supplied to and dissipated by each chip goes up. For reliability reasons, the maximum junction temperature on the chip must be limited to about 85 or 90°C. Assuming that the ECL circuits need 2 mW each and that there are 20,000 circuits per chip, the power required per chip is 40 W. If the MCM contains 100 chips, the total power required by the MCM is 4000 W. If the power supply voltage is assumed to be about 2 volts, then the current required by the module is 2000 A. The MCM of 1994 with 2,000,000 circuits will require about 5000 signal and power pins.

The idea of separating the pins that supply signals from those that supply power, by connecting bus bars to the MCM for the power and using a smaller number and size of pins for the signals, is not workable, because there must be ground returns for the signal pins. There is no requirement in 1989 for a ground pin for each signal pin; the ratio of signal to ground pins today is about 4 to 1. In the 1994 time frame, this ratio will probably decrease because of the shorter risetimes of the signals expected at that time. Separating the signal pins from the power pins is not a good idea because at least the same number of signal pins is needed in each case. Certainly, smaller pins can be used if they carry only signal currents and not power-supply currents. It must be noted that a significant portion of the circuit current flows from other sources and in other directions; however, the current and pin problems are major issues to be dealt with. Removing 40 W of power from a chip is a very challenging problem, as is removing 4000 W from the MCM.

Since MCMs are quite expensive, they must be repairable.
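Two pieces of back-of-envelope arithmetic in the multichip module discussion above, the wiring channel left between vias under a 600-connection chip and the power and current budget of a 100-chip ECL module, can be replayed in a short sketch. The 5-mil via diameter is inferred from the stated 12.6-mil pitch and 7.6-mil wall spacing, and the pitch follows the text in dividing the 0.8-cm array width by the number of connections per side. Function names are illustrative.

```python
import math

MILS_PER_MM = 1000 / 25.4


def via_channel(n_connections, array_width_mm, via_diameter_mils,
                guard_band_mils):
    """Return (connections per side, via pitch, wall-to-wall gap,
    usable wiring channel), the last three in mils.

    Follows the text's simplification: pitch = array width / per-side count.
    """
    per_side = math.ceil(math.sqrt(n_connections))
    pitch_mils = array_width_mm / per_side * MILS_PER_MM
    wall_gap_mils = pitch_mils - via_diameter_mils
    usable_mils = wall_gap_mils - 2 * guard_band_mils
    return per_side, pitch_mils, wall_gap_mils, usable_mils


def mcm_power_budget(power_per_circuit_mw, circuits_per_chip, chips, supply_v):
    """Return (chip power in W, module power in W, module current in A)."""
    chip_w = power_per_circuit_mw * circuits_per_chip / 1000.0
    module_w = chip_w * chips
    module_a = module_w / supply_v
    return chip_w, module_w, module_a


if __name__ == "__main__":
    # Via diameter of 5 mils is INFERRED (12.6-mil pitch minus 7.6-mil gap).
    per_side, pitch, gap, usable = via_channel(600, 8.0, 5.0, 1.0)
    print(f"{per_side} per side, pitch {pitch:.1f} mils, "
          f"gap {gap:.1f} mils, usable {usable:.1f} mils")

    chip_w, module_w, module_a = mcm_power_budget(2.0, 20_000, 100, 2.0)
    print(f"chip {chip_w:.0f} W, module {module_w:.0f} W, "
          f"current {module_a:.0f} A")
```

The sketch returns the figures quoted in the text: a 12.6-mil via pitch leaving 5.6 mils of usable channel, and 40 W per chip, 4000 W per module, and 2000 A at 2 volts.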
It must be possible to remove and replace chips on the MCM and to reroute signal paths, not only because of failure modes but because of the need for engineering changes. By 1994, engineering changes might be made by external electrical signals.

REFERENCES

Baccarani, G., M. R. Wordeman, and R. H. Dennard. 1984. Generalized scaling theory and its application to 1/4 micrometer MOSFET design. IEEE Trans. Electron Devices, vol. ED-31, no. 4, pp. 452-462.

Bakoglu, H. B. 1986. Circuit and System Performance Limits on VLSI: Interconnection and Packaging. Stanford Electronics Laboratories, Technical Report No. 541-4, Stanford University.

Boudon, G., P. Mollier, J. P. Nuez, F. Wallart, A. Bhattacharyya, and S. Ogura. 1988. A 20K CMOS array with 200-ps gate delay. IEEE J. Solid-State Circuits, vol. SC-23, no. 5, pp. 1176-1181.

Blodgett, A. J., Jr. 1983. Microelectronic packaging. Scientific American, vol. 249, no. 1, pp. 86-96.
Cong, H. I., J. M. Andrews, D. M. Boulin, S.-C. Fang, S. J. Hillenius, and J. A. Michejda. 1988. Multigigahertz CMOS dual modulus prescalar IC. IEEE J. Solid-State Circuits, vol. 23, no. 5, pp. 1189-1194.

Dennard, R. H. 1986. Scaling limits of silicon VLSI technology. Pp. 352-369 in The Physics and Fabrication of Microstructures and Microdevices, M. J. Kelly and C. Weisbuch, eds. New York: Springer-Verlag.

Elmore, W. C. 1947. The transient response of damped linear networks with particular regard to wide-band amplifiers. J. Appl. Phys., vol. 19, no. 1, pp. 55-63.

Ferry, D. K. 1985. Interconnection lengths and VLSI. IEEE Circuits and Devices Mag., vol. 1, no. 4, pp. 39-42.

Kern, D. P., T. F. Kuech, M. M. Oprysko, A. Wagner, and D. E. Eastman. 1988. Future beam-controlled processing technologies for microelectronics. Science, vol. 241, August.

Landman, B. S., and R. L. Russo. 1971. On a pin versus block relationship for partitions of logic graphs. IEEE Trans. Computers, vol. C-20, pp. 1469-1479.

Logue, J. C. 1966. Large-scale integration--Status and utilization. Electronica (Munich), October.

McGroddy, J. C., and P. M. Solomon. 1982. Device technology comparison in the context of large scale digital applications. IEDM Technical Digest, pp. 2-5.

Ning, T. H., and D. D. Tang. 1986. Bipolar trends. Proc. IEEE, vol. 74, no. 12, pp. 1669-1677.

Rubinstein, J., P. Penfield, Jr., and M. A. Horowitz. 1983. Signal delay in RC tree networks. IEEE Trans. Computer-Aided Design, vol. CAD-2, no. 3, pp. 202-211.

Sai-Halasz, G. A., M. R. Wordeman, D. P. Kern, E. Ganin, S. Rishton, H. Y. Ng, D. S. Zicherman, D. Moy, T. P. H. Chang, and R. H. Dennard. 1987. Experimental technique for characterizing of IEEE International Electron Device Meeting, Technical Digest, pp. 397-400.