Power systems are routinely designed and built to resist a variety of natural disruptions and continue to operate (NERC, 2006a,b). For example, they can often withstand, or rapidly recover from, events such as lightning strikes, wind and ice storms, fires, and various equipment malfunctions. Some of the features that have been designed into systems to enable them to withstand such “normal” events also offer protection against attacks of modest scale by terrorists. As the sophistication of various technologies grows, the evolving electric power system can be guided toward an even more resilient configuration.1
Simply adding generation and transmission capacity does not always make the system more robust. Furthermore, unless carefully planned, such additions can sometimes cause added congestion and decreased reliability in other parts of the system (Blumsack, 2006; Clark, 2004).
As described in Chapters 1 and 2, the nation’s electrical grid is highly stressed due to the growth in new generation and load without a concomitant increase in transmission capacity. An intelligently planned and well-coordinated terrorist attack could result in local or regional outages of significant duration and disrupt activities for a large segment of the population. The catastrophic failure caused by the 2005 hurricanes Katrina and Rita in several southern states resulted in widespread damage to system components, and it took several months to restore certain portions of the system.
If terrorist attacks targeted large critical components such as high-voltage transformers, for which spare parts are limited, restoration to pre-event levels of operation could take much longer (see Chapters 3 and 8). This chapter explores ways in which the electric system can be made resilient in the face of some attacks, and how any failures that do occur can be minimized. The reader also is referred to Chapter 3 for a discussion of physically protecting key facilities and Chapter 4 for cyber protection. Much of this chapter is necessarily technical, but the findings and recommendations at the end are intended to be understood without reading the entire chapter.
The chapter covers several technical topics:
1. Planning and operational design of the system to withstand simultaneous multiple outages;
2. Monitoring and protection systems, which play a critical role in mitigating the impact of an attack on the system;
3. Mechanisms to enhance the “graceful degradation” of the system in the event of an actual attack or disturbance; and
4. Measures to increase the robustness and resilience of the distribution system2 through networked distribution system architecture and other means such as distributed generation.
Together, these types of system design and operational approaches can help to mitigate the effects of an attack, and may in fact make it less attractive to attack the electric system.
Interconnected bulk power systems3 are planned and operated in accordance with reliability criteria designed to ensure survivability following a range of plausible disturbances. The criteria are currently developed by NERC (as ERO) and regional reliability council processes (NERC, 2006a). Until recently, they have been voluntary but are becoming mandatory as a consequence of energy legislation enacted in 2005.
1For example, new technologies for diagnosis and control of disruptions and the widespread use of distributed generation could considerably strengthen the ability of the system to continue to provide service to most customers in the face of even fairly large-scale attacks (Benner and Russell, 2004).
2In the United States, distribution voltage is typically 4-34.5 kV.
3The term “bulk power system” generally applies to large central generation stations and those portions of the transmission system operated at voltages of 100 kV or higher.
A key feature of the FERC-approved reliability standards is a performance table showing planning and operating criteria for normal operations (Category A) and three categories of disturbances.4 For single or multiple outages, the following apply:
1. Category B, events such as a short circuit causing loss of a single element or component in the system (i.e., an N-1 event with outage of a single generator, transmission line, or transformer).5 The power system must remain stable (no cascading) and within thermal and voltage limits. Loss of load or curtailment of firm transfer (i.e., sales of energy that have been agreed upon by contracts) is not allowed. For operations, the system must be readjusted within 30 minutes to withstand another outage.
2. Category C, certain related (non-independent) events causing outages of multiple elements. An outage of two circuits of a multiple circuit is one example. Similar performance to Category B is required except that planned/controlled load shedding and/or firm transfer curtailment are allowed. Cascading must be prevented.
3. Category D, extreme events resulting in multiple elements removed or cascading out of service. Selected events may be evaluated for risks and consequences.
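As a rough illustration of the Category B (N-1) test, the sketch below screens a single transfer corridor made up of parallel lines. It assumes the DC power-flow approximation, in which parallel lines share flow in proportion to the inverse of their reactance. The line names, reactances, ratings, and the 1,000-MW transfer are hypothetical values chosen for illustration; they do not come from the NERC standards.

```python
def split_flow(transfer_mw, reactances):
    """Share a transfer among parallel lines in proportion to 1/x (DC approximation)."""
    susceptance = {name: 1.0 / x for name, x in reactances.items()}
    total = sum(susceptance.values())
    return {name: transfer_mw * b / total for name, b in susceptance.items()}


def n_minus_1_screen(transfer_mw, lines):
    """Return {outaged line: [overloaded lines]} for every single-line outage.

    `lines` maps a line name to (reactance_pu, thermal_rating_mw).
    """
    violations = {}
    for outaged in lines:
        remaining = {n: x for n, (x, _rating) in lines.items() if n != outaged}
        if not remaining:
            violations[outaged] = ["corridor lost"]
            continue
        flows = split_flow(transfer_mw, remaining)
        overloaded = sorted(n for n, mw in flows.items() if mw > lines[n][1])
        if overloaded:
            violations[outaged] = overloaded
    return violations


# Hypothetical corridor: two strong lines and one weaker line.
corridor = {"A": (0.1, 600.0), "B": (0.1, 600.0), "C": (0.2, 400.0)}
result = n_minus_1_screen(1000.0, corridor)
```

With these assumed values, losing line A or line B overloads the other strong line, so the corridor would not meet an N-1 criterion at this transfer level, while the loss of line C is tolerated. A real planning study would use a full network model and would also check voltage and stability limits.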
To date, NERC standards have given little consideration to scenarios in which multiple facilities are destroyed by terrorists. In the future it may be prudent to design and operate bulk power systems to withstand multiple outages (Category D) that have some likelihood or history of occurring, or that are vulnerable to well-thought-out terrorist attacks. Such a standard would likely be expensive to implement and might reduce transfer capacity until additional facilities are added, but some movement in this direction is probably warranted.6
For Category D events, controls may be applied to prevent or mitigate cascading and massive loss of load. These are sometimes termed safety nets. For example, underfrequency load shedding is universally applied for controlled or uncontrolled separations (islanding). Undervoltage load shedding may be applied in areas where voltage collapse is a concern. These and other automatic controls attempt to restore equilibrium conditions within the electric power system or portions thereof. Loss of components due to malicious attacks would also cause imbalances and, if necessary, such controls would also be activated to mitigate the detrimental effects.
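The staged logic of underfrequency load shedding can be sketched in a few lines. The frequency thresholds and the share of load shed at each stage below are hypothetical, and the sketch ignores the time delays and rate-of-change elements that practical schemes may include.

```python
# Hypothetical stages: (frequency threshold in Hz, percent of load shed).
UFLS_STAGES = [(59.3, 10), (58.9, 10), (58.5, 10)]


def load_shed_percent(frequency_hz, stages=UFLS_STAGES):
    """Cumulative percent of load shed once frequency has fallen to frequency_hz."""
    return sum(pct for threshold, pct in stages if frequency_hz <= threshold)
```

Each stage trips a block of load as frequency falls through its threshold, so a deeper frequency decline (a larger generation deficit in the island) removes more load in an attempt to restore balance.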
According to the NERC performance table, actions such as reduced power transfers and canceling of planned outages (e.g., for maintenance) may occur during abnormal conditions such as storms or forest fires. Similar actions should be taken during elevated terrorist threats resulting in a DHS red alert status.
The U.S.-Canadian power system currently consists of four large regions (see Chapter 2) within which all connected generators operate synchronously. Asynchronous connections between the regions are accomplished with DC tie lines or back-to-back AC-DC-AC converters (asynchronous links). Large synchronous regions evolved for economic power transfers and for the mutual support inherent with AC transmission.7 Under some operating conditions, however, large synchronous interconnections are vulnerable to large cascading failures when certain faults occur.8 (For examples, see Table 1.1.) Upgrades of AC transmission capability to improve the strength of the existing interconnections, the selective addition of advanced controls and power electronics-based equipment, and other solutions such as prioritized modernization of power plant and substation equipment (including emergency control and protection) are urgently needed.9
A critical component of the bulk power system is the design and layout of transmission substations and switchyards. Substations are designed for reliability, flexibility of operation (including access), and cost. Substations provide the ability to safely switch equipment out of service during either scheduled or unscheduled outages while maintaining service. Several substation configurations have evolved to achieve reliability and flexibility. The configurations consist of different bus and circuit breaker schemes which, when switched, provide alternate network paths.10 The bus configurations could have a significant impact on maintaining reliability in the event of a malicious attack on the power system, especially if a transformer, circuit breaker, instrument transformer, or bus work fails violently. For example, a buswork or circuit breaker failure can cause complete substation outage with one bus configuration, but no loss of connectivity with another. Appendix F compares four common bus configurations and indicates their relative advantages and disadvantages. Older, usually lower-voltage, configurations and protection schemes tend to be less reliable.11
4See Table 1 at http://www.nerc.com/pub/sys/all_updl/standards/rs/TPL-001-0.pdf.
5Simulations for N-1 planning/operating criteria often involve a rarely occurring three-phase fault at a critical location with outage of a key line or transformer during peak load or transfer. The three-phase fault “umbrella” events are more severe than many multiple outages, especially those occurring during less stressed (off peak) operating conditions.
6Recall also that an N-2 event is defined as one in which the system would continue to operate reliably without two elements. Note, however, that there is no requirement for the frequent N-2 event of a short circuit with line outage, and with simultaneous outage of a parallel line or line with common termination because of a protective relay mis-operation. Storms, fires, airplanes, and terrorists may also cause loss of parallel lines on the same right-of-way. However, moving to N-n reliability standards in which n is larger than 2 should only be undertaken after a careful quantitative probabilistic assessment of costs and benefits.
7For example, for an outage in one line, power automatically shifts to other parallel lines in a fraction of a second. With DC links, special controls are needed.
8One theoretically possible approach to containing the extent of such outages would be to reduce the size of synchronous regions. For example, the large Eastern and Western interconnections could be restructured into regions similar in size to the Quebec and the Texas interconnections. This would require breaking up these two large interconnections into smaller ones connected by asynchronous links. Such a change would prevent the propagation of disturbances across very large areas. However, this approach would have serious limitations. It would undermine the kind of automatic support now provided by a large interconnected AC grid when large loads or generators are tripped. Further, asynchronous links are expensive.
9Such control equipment may include selective conversion to asynchronous links, such as a link proposed between Ontario and Michigan that might have reduced the extent of the August 14, 2003, blackout.
Whether it is caused by a terrorist attack or some natural cause, once a transmission or substation short circuit has occurred, circuit breakers must interrupt tens of thousands of amperes to isolate the faulty equipment and protect equipment that is not yet damaged. If the circuit breaker fails, additional breakers may be required to open, and, depending on bus configuration, may cause outage of multiple additional lines and transformers. Furthermore, a circuit breaker failure may be explosive, damaging nearby equipment and causing a fire. Breaker failure protective relaying is often nonredundant or may not be installed, potentially resulting in even larger disruption and possible cascading blackout. Breaker failures have initiated large-scale power interruptions.
Modern circuit breaker technologies are available to replace underrated or unreliable breakers.12 Prioritization of breaker replacements is relatively straightforward, and, as budgets permit, power companies replace underrated breakers. Prioritization is based on breaker type and reliability, interrupting rating relative to short circuit currents, bus configuration, and the potential system impact of a failure. Difficulties with cost recovery must be overcome in order for such modernization to occur.
For major new transmission line construction, it may be preferable to construct new substations rather than enlarging existing substations to a size that jeopardizes reliability if those substations completely shut down. Likewise, bypassing substations in a hopscotch fashion along a multi-line transmission path reduces the effect of a complete substation shutdown, and reduces choke points.
The electric power system consists of expensive generators, apparatus, and lines that can quickly be damaged or destroyed as a result of short circuits (faults), thermal overload, or other abnormal conditions. Protection systems are designed to automatically detect and isolate lines and apparatus following electrical faults or disturbances in order to protect equipment from damage due to voltage, current, or frequency excursions outside the design limits. Primary protection devices include relays, reclosers, fuses, circuit breakers, and switches. In response to short circuits, protective relays detect abnormal electrical signals and open circuit breakers to isolate faulty equipment.13
Protection systems are critical to ensuring safe and reliable operation of interconnected transmission networks and should have the characteristics shown in Figure 6.1. A protection system must be dependable and secure in all its operations. Dependability means that protection devices properly respond when changes in electrical conditions indicate an abnormal or dangerous condition. Security means that protection systems will not mis-operate under normal conditions or for conditions outside the operational design of the protection system. Usually an increase in system dependability means a decrease in security or vice versa. For example, protection system dependability can be enhanced by incorporating device redundancy. Increased redundancy through the use of multiple relays to monitor a transmission line for abnormal conditions improves the probability that an event will be detected and thus improves reliability. However, multiple relays acting in parallel can also decrease security through greater complexity and greater exposure to component failure and mis-operation. Consequently, reliability requires a fine balance between dependable operation and security against inappropriate operations.
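The trade-off between dependability and security can be made concrete with a small probability sketch. It assumes that relays fail independently, and the per-relay probabilities are hypothetical figures chosen only to show the direction of the effect.

```python
def one_out_of_n(p_single, n):
    """Probability that at least one of n independent relays acts."""
    return 1.0 - (1.0 - p_single) ** n


def n_out_of_n(p_single, n):
    """Probability that all n independent relays act (voting logic)."""
    return p_single ** n


P_DETECT = 0.99   # assumed chance that one relay trips for a real fault
P_FALSE = 0.001   # assumed chance that one relay trips with no fault present

# Parallel redundancy: any one relay may trip the breaker.
dependability_parallel = one_out_of_n(P_DETECT, 2)
false_trip_parallel = one_out_of_n(P_FALSE, 2)

# Voting: both relays must agree before the breaker trips.
dependability_voting = n_out_of_n(P_DETECT, 2)
false_trip_voting = n_out_of_n(P_FALSE, 2)
```

Under these assumptions, two relays in parallel raise the chance of detecting a fault but roughly double the chance of an unwanted trip, whereas requiring both relays to agree does the opposite. Neither arrangement improves dependability and security at the same time.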
Many design issues and approaches can affect the characteristics of protection and control systems, including the following:
10Most switchyards and substations have open-air bus work. At much higher cost, bus work may be placed in pipes insulated with SF6, rather than open air. Switchgear is incorporated in the gas-insulated equipment. The substation is then much more compact and can be installed indoors or underground. Gas-insulated substations are commonly used in urban areas, particularly in Europe and Japan, where land prices are high. Obviously, stations that are indoor or underground can be more secure against attacks.
11As an example, a bus fault at an old 400-kV substation led to a massive cascading power system blackout in Brazil on March 11, 1999. Lack of local bus protection and an unnecessary zone 3 relay operation at another station contributed to the failure. Following the blackout, potential system improvements were prioritized considering risk to the system, cost, and other factors. Many of the changes involved relatively low cost substation configuration improvements, and protection modernization.
12A recent Fitch report states that 60 percent of circuit breakers in the bulk power system are now more than 30 years old (Anderson et al., 2006). Many may be underrated or marginally rated for present day short-circuit currents. Modern circuit breakers are technically superior and much more reliable, and are available at about the same cost as old circuit breakers, despite general inflation.
13In the August 14, 2003, blackout event, lines sagging into trees caused a short-circuit current that was detected by relays and cleared by proper operation of breakers. The transmission line remained undamaged and capable of being placed into service. In other words, the protection devices correctly operated in response to faults caused by external factors (i.e., contact with trees). However, in that case, successive loss of multiple lines due to short-circuit or overload conditions resulted in instability and successive protection system operations that ultimately gave rise to a cascading failure and a blackout.
• Speed at which protection systems operate. A rapid decision to trip a breaker may prevent instability and permanent damage to lines or apparatus under fault conditions. However, disturbances and system dynamics may create electrical signals that emulate fault or overload conditions that can only be distinguished with sufficient analysis time. Consequently, a quick decision to trip may be required under certain conditions, but also may result in an improper decision under different dynamic conditions.
• Testing and maintenance practices. These can result in improper protection settings or inadvertent changes in protection logic. These have also caused large-scale blackouts. For example, even a cursory analysis of the August 2003 blackout shows several areas of concern with respect to protection system design, as integrated with system operations and communications. The loss of the first transmission line was caused by the correct operation of relays to clear a fault caused by the line sagging into trees. This resulted in heavier loading of parallel lines with the effect of subsequent loss of multiple lines due to faults and overload conditions. The lines associated with these events were properly protected and preserved and could have immediately been placed back in operation had operators had adequate knowledge and awareness of the dynamic events that were occurring.
• Systems to enhance awareness of operating conditions. New digital relays with advanced communications and information sharing capability coupled to control and information systems can decrease the probability of cascading failures as a consequence of multiple protection system operations.
• Proper settings of relays. Improper settings have resulted in cascading blackouts caused by the tripping of transmission lines under nonfault conditions. An improper setting of a “zone 3” impedance (distance) relay was a proximate cause of the November 9, 1965, Northeast blackout. The relay performed correctly based on its setting, but it had not been reset as system load grew. High load but nonfault electrical conditions caused the relay to operate. Emphasis should be given to remote monitoring of protective relay settings and improving maintenance and test procedures that mitigate the possibility of improper and insecure operation of relays.
• Addressing the “overreach” of protection systems. Overreaching distance protection, mainly in the form of zone 3 relays, has caused or contributed to many blackouts. Overreaching protection is generally applied as backup protection in the case of breaker failure in a distant substation. In other words, if a local protection system fails to detect a fault, surrounding substations “overreach” to detect the fault and eliminate fault current in-feeds to the local substation. Sensitive settings are required, and so the relays are prone to operate on nonfault conditions of overload, depressed voltage, or electromechanical swings among generators. There are several solutions to this problem, including redundant local relays, breaker failure relays, bus protection, and restrictions on the reach of impedance relays. NERC and the industry have addressed the backup relay problem in response to the 2003 blackout. Thousands of changes were made by North American power companies. (Reports are available at www.nerc.com.) Eternal vigilance, however, is required to ensure that relays respond only to short circuits.
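The way in which load growth alone can cause a zone 3 relay to operate can be sketched as follows. The example assumes a mho (circular) distance characteristic and a balanced lagging load; the 345-kV voltage, the 300-ohm reach, the 75-degree maximum torque angle, and the load levels are hypothetical values, not settings from any actual relay.

```python
import cmath
import math


def load_impedance(kv_line_to_line, load_mva, power_factor):
    """Apparent impedance (ohms) seen by a relay for a balanced lagging load."""
    magnitude = kv_line_to_line ** 2 / load_mva
    return cmath.rect(magnitude, math.acos(power_factor))


def inside_mho(z_ohms, reach_ohms, max_torque_angle_deg):
    """True if z lies inside a mho circle whose diameter runs from 0 to the reach point."""
    center = cmath.rect(reach_ohms / 2.0, math.radians(max_torque_angle_deg))
    return abs(z_ohms - center) < reach_ohms / 2.0


# Hypothetical 345-kV line with a zone 3 reach of 300 ohms at 75 degrees.
REACH_OHMS, MTA_DEG = 300.0, 75.0
for mva in (300.0, 700.0, 800.0):
    z = load_impedance(345.0, mva, 0.95)
    print(mva, round(abs(z), 1), inside_mho(z, REACH_OHMS, MTA_DEG))
```

As load rises, the apparent impedance shrinks toward the relay characteristic. With these assumed values, the 300-MVA and 700-MVA load points lie outside the circle, but the 800-MVA point falls inside it, so the relay would operate even though no fault is present.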
The above approaches do not address all of the protection issues that can cause or exacerbate a cascading blackout. With millions of protective relays and protection schemes in place, undesirable or unnecessary operations cannot be prevented. However, fruitful areas of investigation and improvement include the following:
• Improvement in intelligent, digital relays allowing for self-evaluation and remote evaluation of settings and relay health to ensure reliable operation.
• Integration of protection systems with other control and operation systems to ensure that operators have full operational awareness as conditions change and deteriorate during a cascading event.
• Improved control philosophies and strategies for multiple contingency events occurring in close time proximity. Such improvements could address situations in which the proper operations of relays in response to changing conditions, when taken as a whole, can create unrecoverable instability in the power system.
• Methods to prioritize modernization of protection relays and schemes, including communications such as by fiber optics between stations.
As noted in Chapter 4, supervisory control and data acquisition (SCADA) provides two-way communication and control capability for control centers, power plants, and substations. Every few seconds, control centers receive massive amounts of data, most of it reflecting electrical conditions across the grid. However, determining what it all means, and what exactly happened following a natural event or a terrorist attack, may be difficult.
There are various sensor-related strategies to improve the situation. One, for example, is to increase the amount of data by using and analyzing data from a large number of distributed sensors.14 These enable detection of potential intrusions and sabotage, and postmortem studies after failures. Although it is very difficult to avoid or predict terrorist acts, quick assessment of the situation can help operators take actions in order to avoid cascading events and the consequent partial or total blackouts.
The mechanical failures resulting from malicious attacks on a transmission line are similar to extreme natural events affecting a transmission line. Thus, work done in the latter area can also help to guide preventive and corrective action for acts of sabotage. A basic method to assess damage caused by any physical event on the transmission grid is visual inspection, but this is difficult for transmission lines dispersed over hundreds of kilometers.15
Various techniques can address this issue. For example, digital distance relays can report approximate fault location based on the impedance calculation for a fault. Transmission fault locator devices based on traveling wave propagation or other methods can more precisely determine fault location. Real-time determination of the fault location (e.g., as a percentage of line length), and then communication of this information to the control rooms and reliability coordinators, allows the operators to take appropriate control actions and, if terrorism is suspected, to quickly alert law enforcement about the exact location of the problem. The mapping of the fault location as a percentage of line length to a particular geographic location is usually straightforward, provided that geographic information system (GIS) models of the line are available. Single-phase switching or three-phase automatic reclosing attempts provide information on the type of fault and whether it is transient (e.g., lightning caused) or permanent. In situations where information is limited, operators in control centers may attempt manual (SCADA) line reclosure to determine if the fault is permanent. For permanent faults, crews are dispatched, possibly including aircraft for visual inspection.
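The mapping from an impedance-based fault estimate to a point on the ground can be sketched as follows. The sketch assumes that the line route is available as an ordered list of distinct tower positions in planar coordinates (kilometers) and that reactance is uniform along the line; the function names and the route are hypothetical.

```python
import math


def fault_fraction(x_measured_ohms, x_line_ohms):
    """Estimate fault position as a fraction of line length from measured reactance."""
    return max(0.0, min(1.0, x_measured_ohms / x_line_ohms))


def locate_on_route(fraction, towers):
    """Map a fraction of line length to an (x, y) point along the tower polyline."""
    segments = list(zip(towers, towers[1:]))
    spans = [math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in segments]
    target = fraction * sum(spans)
    for ((x1, y1), (x2, y2)), span in zip(segments, spans):
        if target <= span:
            t = target / span
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        target -= span
    return towers[-1]


# Hypothetical route with spans of 5 km and 6 km.
route = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
point = locate_on_route(fault_fraction(4.4, 8.8), route)
```

Here a measured reactance equal to half the total line reactance places the fault 5.5 km along the 11-km route, that is, 0.5 km into the second span. A production system would work with geographic coordinates and would account for the measurement error of the fault locator.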
Monitoring the structural integrity of transmission lines is helpful in assessing the effects of mechanical events. Equipped with adequate cryptographic and security features, wireless sensors for collecting structural information can provide a seamless sensing environment thanks to their main characteristics: ease of installation and replacement, low cost, networking, and small size.
Innovative technologies should be employed for detection of failures in power systems before they become catastrophes. Novel approaches that involve the implementation of a sensor network design for the national electric energy infrastructure combined with the use of nonconventional mechanical sensors may significantly improve the robustness of power systems against catastrophic failures. This would include wireless sensor network technology for detection of mechanical failures in transmission lines, such as conductor failure, tower collapses, hot spots, extreme mechanical conditions, and so on. It also involves the installation of mechanical sensors in predetermined towers of a transmission line, communicating via a wireless network. Sensors include accelerometers, tension/strain gauges, and tilt and temperature sensors. The main goal is to obtain a complete physical and electrical picture of the power system in real time and determine appropriate control measures that could be automatically taken and/or suggested to the system operators.
A variety of nontraditional sensors should also be considered and evaluated. These include sensors for mechanical motion; sound; visual spectrum (e.g., closed-circuit television and automatic processing of closed-circuit television signals); infrared; chemical, gas, ozone, nitrate, CO, and CO2 sensors; electromagnetic radiation, Poynting vector (based on electric and magnetic fields), partial discharge detectors; biological sensors; conduit continuity/resistance; incipient fault detection; and vibration. Also, the use of unmanned aerial vehicles (UAVs) could be considered. Sensor additions will require new software to process (filter and prioritize) the data for presentation to operators who may already be overwhelmed with data and alarms following events.
While there has been much discussion regarding the actions of operators, particularly after the August 14, 2003, failure, terrorist attacks and other disturbances can evolve into instability in a few seconds or tens of seconds, in many cases too fast for operators to determine what is happening and take appropriate corrective actions. During certain relatively familiar events in which alarms become activated, operators may act within a few minutes. In new situations, 15 to 30 minutes may be required for assessment and operator actions, especially if load shedding is required. Thus, various types of automatic controls are required.
14These might include nonconventional sensors and innovative instrumentation located in the power system by some prioritized strategy. Metrics include system observability, power usage, enhancement of communication capabilities, and size of data for operations and enhanced operational decision making.
15Problems occurring in concentrated environments (substations or generating plants) are not difficult to find and assess with a small crew, or through video surveillance. Recent blackouts in the United States and Italy have shown that failure to assess and understand the condition of the power system, and the delay in taking appropriate corrective actions after just a single outage, can lead to blackouts across large areas.
FIGURE 6.2 Power system stability controls. SPS, special protection systems; WACS, wide-area stability and voltage control system.
The following are some examples of automatic controls the committee has identified:
• Techniques for shedding load and generation to enhance power system dynamic response capabilities, including simple and low-cost approaches to avoiding voltage collapse;
• Controls for maintaining proper transmission network voltage profiles;
• Primary automatic controls to prevent cascading instability that are located mainly at power plants;
• Transmission-level power electronic devices and mechanical devices;
• Local load-shedding practices and techniques;
• A class of controls termed special protection systems (SPSs) or remedial action schemes;
• Wide-area feedback/response-based controls, either continuous or discontinuous; and
• Sophisticated control algorithms (using various techniques such as adaptive or “intelligent” control) as part of digital control and communication capabilities.
Appendix G provides further descriptive details concerning each of these types of controls. Figure 6.2 illustrates a possible configuration of power system stability controls. The special protection systems path is feedforward. The continuous feedback controls are normally local and mainly at generation facilities, but could be wide area. The feedback (response-based) discontinuous controls are often wide area, but could be local (e.g., underfrequency or undervoltage load shedding).
In summary, power system robustness, resilience, and survivability in the face of major disturbances, including modest terrorist attacks, can be increased significantly and economically through the use of automatic controls. What is required is implementation of industry best practices, prioritized upgrading of old analog controls, and development and implementation of wide-area controls.
In North America, the bulk power system is monitored and managed at energy control centers, also called SCADA-EMSs or simply energy management systems (EMSs). Data acquisition and remote control are performed by computer systems called SCADA systems. Figure 6.3 shows a schematic of a modern EMS. Note that a SCADA system communicates with generating plants, substations, and other remote devices.
Because of the historical evolution of the electric utilities in the different geographic regions, these EMSs are functionally similar but not identical. All these different EMSs result in significant additional complexity. Of the four synchronous interconnections in North America, the Quebec and Texas interconnections each constitute a “balancing area,” an organizational jurisdiction responsible for balancing its load and generation, and each requires its own EMS with automatic generation control. By contrast, the two other interconnections (the Western and Eastern) are too large to have only one balancing area each and, instead, have dozens of them.16 With so many EMSs in these two interconnections, it is difficult to monitor all that is happening in a large interconnection, and so reliability coordinators or independent system operators that coordinate large portions of the interconnection have been set up and sometimes have
16The Eastern Interconnection has about 100 and the Western about 40, with the numbers fluctuating over time as organizational jurisdictions change. Note that some balancing areas in these two interconnections are so large that the EMS is hierarchical, with some of the functions distributed over several control centers.
Thus, the control center EMSs that represent the balancing areas have the most control of the grid, but each can control only a small portion of the Western or the Eastern Interconnection. The reliability coordinators have a wider view of the grid but no coordinator covers the whole Western or Eastern Interconnection, and coordinators do not always have direct control of their portion of the grid. No single entity has the full real-time view of either the Western or the Eastern Interconnections, but some balancing authorities and reliability coordinators do exchange real-time data with their neighbors to increase their situational awareness beyond their own borders. More such data exchange will be needed and even a central monitoring center for these large interconnections has been suggested in the 2005 EPAct and elaborated further by USDOE and FERC (DOE/FERC, 2006).
Because the balancing area control centers have the ability to switch breakers and control other parameters, these could be main targets for cyber attacks.17 Historically, the communication systems between these EMSs and remote terminal units (RTUs), and between EMSs, have been dedicated redundant channels and are not paths for intrusion. However, connections between the EMSs and other information systems have increased in recent years, and such connections need to be secured and made trustworthy.
Although some automatic controls, like automatic generation control, are part of an EMS, the main function of the EMS is to allow the operator to monitor the present condition of the system (including alarming and analysis of the present conditions) and to take manual control actions as necessary to reliably operate the grid. Because the final cascading, like that in the 2003 Northeast blackout, can happen too fast for the operator to intervene, it is important for the operator (with the help of the EMS software) to recognize developing patterns that endanger the system. An operator in an EMS can observe the electrical performance of the system and take appropriate actions. However, neither the operator nor the
FIGURE 6.4 Balancing areas (also called control areas). For definitions of acronyms, see Appendix D. SOURCE: NERC. Available at http://www.nerc.com/regional/NERC_Regions_BA.jpg. Accessed October 2007.
automated control system can distinguish between a physical disruption in the system and an electrical disturbance. For example, if the base of a transmission tower is bombed and the line goes down, causing contact with the ground, the circuit breakers will operate to isolate the transmission line from the rest of the system. For an operator in the control center, the primary indication is that the circuit breakers operated to open and isolate the transmission line. The operator, however, cannot distinguish whether this is a temporary situation or a permanent one. If this information were available, then the operator in all probability would make decisions to maneuver the system to a more secure state. The ability to provide this additional information is the primary focus of the steps needed to protect, mitigate, and enhance graceful degradation. In order to facilitate these steps, various initiatives would be needed to harden the system against malicious disruptions. These steps are outlined and discussed below.
Especially after the 2003 U.S.-Canada blackout, the “situational awareness” of the operator has emerged as a major concern. Operators at the EMS where the power system conditions were deteriorating were not aware of these conditions. Although trending and alarming for limit violations and abnormal conditions of individual measurements are commonplace in control centers, the recognition that abnormal patterns are developing (e.g., the depression of voltage over a large region as opposed to voltage limit violations at individual buses) is dependent on the experience and alertness of the operator. Automatic capture of such disturbing trends by the EMS computers would be an enormous help in alerting the operator. Such alarm processing using advanced methods of pattern recognition is needed.18 It also would be valuable to coordinate, in real time, the display of line outage information across reliability coordinator boundaries. If a group of terrorists were to strike a number of electrical targets distributed across a large geographic region, the sooner the malicious nature of the event was uncovered, the more quickly protective actions could be taken. Currently there is only limited sharing of real-time information across reliability coordinator boundaries (Figure 6.5), with no one seeing the big picture for a grid such as the Eastern Interconnection. Hence, there would likely be a delay in determining that the near-simultaneous loss of multiple lines in multiple regions was likely due to malicious activity.
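As a rough illustration of the kind of cross-boundary alarm processing described above, the following Python sketch flags a group of line outages that occur close together in time but are reported by several different reliability coordinator areas. It is a minimal sketch, not an EMS feature: the `LineOutage` record, the function name, and the thresholds (a 60-second window, three lines, two regions) are all assumptions chosen for illustration, and a practical detector would also have to account for weather, known maintenance, and relay behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineOutage:
    time_s: float   # time the breakers opened, in seconds from a common reference
    line_id: str    # hypothetical line identifier
    region: str     # reliability coordinator area that reported the outage


def detect_coordinated_outages(outages, window_s=60.0, min_lines=3, min_regions=2):
    """Return the largest group of outages that falls inside one time window
    and spans several regions, or None if no such group exists.

    All thresholds are illustrative assumptions, not values from any standard.
    """
    events = sorted(outages, key=lambda o: o.time_s)
    best = None
    start = 0
    for end in range(len(events)):
        # Advance the left edge until the window spans at most window_s seconds.
        while events[end].time_s - events[start].time_s > window_s:
            start += 1
        group = events[start:end + 1]
        regions = {o.region for o in group}
        qualifies = len(group) >= min_lines and len(regions) >= min_regions
        if qualifies and (best is None or len(group) > len(best)):
            best = group
    return best


if __name__ == "__main__":
    reports = [
        LineOutage(0.0, "L1", "A"),
        LineOutage(12.0, "L2", "B"),
        LineOutage(40.0, "L3", "C"),
        LineOutage(5000.0, "L4", "A"),
    ]
    cluster = detect_coordinated_outages(reports)
    if cluster:
        print("Possible coordinated event:", [o.line_id for o in cluster])
```

The sketch only works if outage reports from all regions reach one place with a common time reference, which is exactly the data sharing that is limited today.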
Just as redundancies are needed in the design of the power grid to increase its reliability and its ability to withstand physical attacks, so also are redundancies needed in the EMS, in both the hardware and the software, to ensure reliability of this critical function. Redundancies in the communication channels to the RTUs and redundancies in the computer hardware (including automatic checkpointing and failover) have been common practice. Redundancies in software and its graceful degradation have been less common. The loss of the alarming system in a key EMS during
18For example, it is likely that multiple attacks on the transmission system will not occur precisely simultaneously even if planned that way. Even small differences in the time of failures could give important indications that an attack is occurring and allow remedial actions before the full effect of multiple failures would be felt.
the 2003 U.S.-Canada blackout was a critical element in the operator not being aware of the deteriorating conditions in the power system. Better design of software redundancy and degradation should be a critical part of EMS design, as discussed in Chapter 4.
In addition to technology improvements, it is necessary to ensure that the operators themselves have the training to understand and deal with rapidly deteriorating situations. High-quality system simulators are now available to train operators to understand and manage complex disruptions of the transmission system. Much greater and more uniform use should be made of such systems during the training of system operators.
Another area where there are design and operational strategies to mitigate the effect of attacks is the engineering of the distribution system. Once electric power has been transmitted in bulk over transmission lines, it is delivered to distribution or bulk power delivery substations where it is distributed to customers. Distribution substations consist of multiple step-down transformers that reduce the relatively high voltage of transmission lines to lower distribution voltages. Although some large industrial customers take electric power at higher voltages, more than 90 percent of all the electric power distributed in the United States is delivered at less than 15,000 volts.
The majority of distribution subsystems in the United States consist of overhead feeders typified by the common wood pole construction and pole-mounted transformers found in rural and most urban areas. A growing number of distribution customers are served by underground cables. Whether built as overhead lines or with underground cable, the majority of distribution is of a radial “single-feed” nature, meaning that the loss of the distribution feeder results in a customer interruption, since there is no alternative source of power.
Conventional overhead lines in a radial configuration usually are the least expensive way to distribute electric power to customers. However, overhead lines are vulnerable to natural hazards and to deliberate attack. While any one line can be repaired quickly, multiple outages, such as after a hurricane, can result in long periods of service interruption. The use of underground cable, multiple feeds to the customer with automatic switching, loop circuits whereby customers can be switched from one feeder to the next, and other forms of redundancy significantly improve reliability at additional expense. In the case of critical loads such as a manufacturing facility or a hospital, distribution designers often provide a twin or dual feed, namely, an alternative feeder that provides redundancy in case the primary feeder is lost. Obviously, the cost to provide such redundancy makes similar wholesale structural changes to the existing distribution systems unlikely.
Some use is made of “network” distribution, primarily in high-density urban areas. The low-voltage outputs of multiple distribution transformers are connected to create a network to which customers are attached. This inherently creates multiple feeds to customers. While these networks are more complex to operate than a simple radial distribution, they have certain advantages in both efficiency and reliability. The cost is greater than radial distribution but can be generally justified for serving the dense loads of a downtown area.
The loss of a distribution feeder results in the immediate loss of electric power to several hundred to several thousand customers—but such a disruption is often relatively small in the context of the entire utility system. Distribution will most likely be subject to physical attack when specific customers or critical industry are targeted. The distribution apparatus used today is operationally rugged and relatively easy to repair, but because the distribution system is rarely monitored, the only notice the utility receives that power has been lost to a customer is the customer calling to complain. Often distribution power outages last for several hours simply because the utility is initially unaware of the problem, and then it takes substantial time to dispatch the repair crew to locate a fault and identify and replace damaged equipment.
Through the use of automated distribution, significant opportunities exist to improve the reliability of electric power distribution without rebuilding the existing distribution system. In general, these include:
• Automation of distribution systems, including SCADA systems. This approach consists of the use of advanced sensors with communications infrastructure so that an electric utility can monitor and remotely control distribution. SCADA systems as part of distribution substations allow electric utility dispatchers to monitor feeder information, such as voltage level and feeder loading, with the coincident ability to open and close feeder breakers remotely. Systems for automated distribution and control can be incrementally introduced and are already in place in some parts of the country. Compelling arguments concerning economic development can be advanced for at least some such improvements, since distribution-system disturbances account for most of the power outages experienced by customers. State regulators should require local companies engaged in distribution to undertake studies that explore the potential benefits and costs of such upgrades, and then to mount programs of improvement that have clear positive net benefits.
• Use of RTUs scattered throughout the distribution system. Such systems would be installed at the feeder level, allowing a distribution dispatcher to sectionalize a feeder or perform switching operations to restore power by isolating faults. This action restores power to a large number of customers, minimizing the duration of an outage by quickly locating and isolating the faulted section. New developments include automated sectionalizing and restoration of healthy feeder sections, after a fault, using intelligent, distributed RTUs.
• Advanced communication systems. Advanced communications systems are being introduced into distribution systems, including radio and cell communications, to acquire data and to control remote devices. The distribution feeder itself is used as a communication medium in power-line communication systems. As communications improve, the functionality and the complexity of distribution automation grow.
• Other advances in distribution automation. These include the use of intelligent electronic devices, automatic meter reading, and continuous high-frequency monitoring of distribution feeders to identify the incipient failure of distribution equipment and to detect very-low-current, arcing faults. If failing equipment can be detected and repaired or replaced before catastrophic failure, the number and length of outages can be reduced. Computer-based intelligent electronic devices can be applied to monitor and protect distribution feeders, resulting in a wealth of information that supports system restoration and improved reliability (Benner and Russell, 2004; EPRI, 2005).
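The sectionalizing and restoration step described in the list above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the feeder is modeled as a simple chain of sections counted outward from the substation, every section boundary has a remotely operable switch, and an optional normally open tie at the far end can pick up the sections beyond the fault from a neighboring feeder with enough spare capacity. The function and parameter names are hypothetical.

```python
def restore_after_fault(customers_per_section, faulted, has_tie=True):
    """Estimate how many customers can be re-energized after isolating a fault.

    customers_per_section: customer counts, ordered from the substation outward.
    faulted: index of the faulted section.
    has_tie: True if a normally open tie switch at the feeder end can back-feed
             the sections beyond the fault (assumes adequate spare capacity).

    Returns (restored_customers, customers_still_out).
    """
    n = len(customers_per_section)
    if not 0 <= faulted < n:
        raise ValueError("faulted section index out of range")

    # Opening the switches on both sides of the faulted section isolates it.
    upstream = customers_per_section[:faulted]        # re-fed from the substation
    downstream = customers_per_section[faulted + 1:]  # re-fed through the tie, if any

    restored = sum(upstream) + (sum(downstream) if has_tie else 0)
    still_out = sum(customers_per_section) - restored
    return restored, still_out


if __name__ == "__main__":
    feeder = [400, 250, 300, 150]  # illustrative customer counts per section
    print(restore_after_fault(feeder, faulted=1, has_tie=True))
    print(restore_after_fault(feeder, faulted=1, has_tie=False))
```

Without any sectionalizing, a fault anywhere on this illustrative feeder would leave all 1,100 customers out until repairs were complete; with switching, only the customers on the faulted section (and, absent a tie, those beyond it) wait for the repair crew.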
It is obvious that the existing electric distribution system in the United States is vulnerable to attack because it is highly distributed geographically. But the huge investment already made in electric distribution makes significant structural changes both expensive and long term. Consequently, efforts must focus on maintaining the health and robustness of distribution with an emphasis on restoring power after outages and maintaining the continuity of electric service to critical customers.
Over the next decade, efforts can prudently be concentrated on the following areas:
1. Critical customers should be identified and specific attention given to ensuring service continuity and maintenance of critical functions during a terrorist attack. This level of protection can be accomplished by providing multiple power feeds to distribution customers and by providing onsite generation in case of the loss of bulk transmission. Recent experiences in large-scale blackouts have shown that many critical loads are vulnerable and do not have adequate auxiliary power backup.
2. Distribution automation can be applied at reasonable cost, significantly improving the reliability of distribution and making system restoration more deterministic and rapid. Emphasis should be given to applying improved SCADA, intelligent electronic devices, advanced communication, and sophisticated (broad-bandwidth) monitoring that provide continuous control and high-quality data concerning the operation of distribution. These devices can provide immediate notice of an outage, confirmation of the cause of the outage, and the specific information necessary to restore service as rapidly as possible.
3. Robust distribution is needed, which requires careful attention to system upgrades and maintenance. Distribution systems operating at close to design limits or systems operating with degraded equipment fail more easily and make restoration of service more difficult. Consideration should be given to the applications that monitor and diagnose the health and robustness of distribution, and to supporting condition-based maintenance and repair. Such continual maintenance also provides the opportunity for upgrading not just to new power equipment but also to the distribution automation technologies mentioned above.
One way to mitigate the effects of attacks on the electric power delivery system is to make end uses more resilient, as well as capable of operations when disconnected from the grid. Distributed generation refers to the use of relatively small generators spread throughout the electrical system, and typically connected at distribution primary voltages, or perhaps at the subtransmission level. The generators may be operated either by a utility or by other parties that have connected to the grid. Although widely used in some parts of Europe, such as the Netherlands, distributed generation has been slow to develop in the United States.
Because of the economics, regulatory barriers, and other factors, the technology has not really expanded yet, but there is a prospect for widespread use of distributed generation. Because there are now so many types of distributed generation systems,19 as their use becomes more widespread, they should be introduced in a way that aligns with, rather than undermines, key Institute of Electrical and Electronics Engineers (IEEE) standards (Standard 1159, Recommended Practice for Monitoring Electric Power Quality, and Standard 1547, Interconnecting Distributed Resources with the Electric Power System). Some of the key technical issues in integrating distributed generation systems into the grid are as follows.
• Distributed generation at substations. The placement of distributed generation at transmission/distribution substations has been used in the past to provide emergency power. There are proposals to increase the level of distributed generation at substations to take advantage of the space and facilities at many of these substations.
• IEEE standards, recommended practice, and guides for emergency power generation20 (Daley and Siciliano, 2003a,b; Davis and Stratford, 1988; IEEE, 1987), and certain other specialized systems.21 These standardized procedures are largely in place in commercial and industrial applications in the United States today. At the time this report was prepared, there were no recommended practices for residential systems.
• Back-up power installation. The technology of back-up power is well known and commercialized. The appropriate IEEE standards for emergency and standby power technology are IEEE 446 (IEEE Recommended Practice for Emergency and Standby Power Systems for Industrial and Commercial Applications) and 141 (Recommended Practice For Electric Power Distribution for Industrial Plants).
• Considerable volume of material on case studies for distributed generation. A sampling of the literature that relates to the potential of this technology, especially in the arena of emergency supply, includes Ault et al. (2000, 2003), Daly and Morrison (2001), and Golshan and Arefifar (2006). In Daley and Siciliano (2003b), the specific case is made for distributed generation for emergencies. In Dugan et al. (2001), some cautions are outlined for cases of high penetration (i.e., high installed power levels) of distributed generation.
• Safety. Perhaps the greatest fear in installing distributed generation is the safety issue of circuits being fed from the load end (Dugan et al., 2001). During restoration of power after large disturbances, this safety issue could be very important (Barker and De Mello, 2000; Caire et al., 2002).
• Interest in renewable energy sources to alleviate dependence on natural resources. Renewable sources appear to be well suited for low-power scenarios, and the public acceptance of these sources is high. If the economics can be made favorable, there is a real prospect for the increased use of renewable sources.
19Some distributed generation consists of 60-Hz synchronous generators with conventional controls. Other distributed generation may be interfaced with the distribution system through an electronic converter. Penetration levels in the time span 2006–2010 are not expected to exceed 10 percent of the total demand. However, localized high penetration levels may occur.
20Additional recent developments for emergency generation are discussed in Daley and Siciliano (2003a,b) and in Davis and Stratford (1988).
21IEEE Standard 141 (1986)—Recommended practice for electric power distribution for industrial plants; IEEE Standard 241 (1983)—Recommended practice for electric power systems in commercial buildings; IEEE Standard 493 (1980)—Recommended practice for the design of reliable industrial and commercial power systems; and IEEE Standard 602 (1986)—Recommended practice for electric systems in health care facilities.
The main nonhydroelectric renewable source is wind power. Photovoltaic panels coupled with battery storage have considerable potential for distributed generation as prices drop.
• Energy storage to allow for increased use of renewables and to improve resiliency of the entire grid. Improving the system load factor and utilizing renewable sources that are time and weather dependent require the use of energy storage. Prospects include batteries, pumped storage, compressed-air storage, and supercapacitors.
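The load factor mentioned in the last item is the ratio of average load to peak load over a period. The short Python sketch below shows, for a made-up six-hour load profile, how shifting energy from peak hours to low-load hours raises that ratio. It assumes an idealized store with no losses and no power limit, and the function names and numbers are purely illustrative.

```python
def load_factor(load_mw):
    """Average load divided by peak load over the period."""
    return (sum(load_mw) / len(load_mw)) / max(load_mw)


def shave_peak(load_mw, shift_mwh):
    """Move up to shift_mwh of energy from the highest-load hour to the
    lowest-load hour, 1 MWh at a time.

    Assumes hourly values (so 1 MW for one hour is 1 MWh) and an ideal,
    lossless store with no charge or discharge power limit.
    """
    profile = list(load_mw)
    for _ in range(shift_mwh):
        hi = profile.index(max(profile))
        lo = profile.index(min(profile))
        if profile[hi] - profile[lo] < 2:
            break  # the profile is already as flat as 1 MWh steps allow
        profile[hi] -= 1   # storage discharges during the peak hour
        profile[lo] += 1   # storage charges during the low-load hour
    return profile


if __name__ == "__main__":
    hourly_load = [40, 40, 60, 100, 80, 40]   # MW, illustrative values only
    print(load_factor(hourly_load))
    print(load_factor(shave_peak(hourly_load, 20)))
```

Total energy delivered is unchanged in this idealized case; only the peak that generation and delivery equipment must be sized for is reduced.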
Findings on the Transmission Network: Short to Medium Term
Finding 6.1 Any increase in the reliability of the power grid makes the system more capable of withstanding terrorist attacks, more able to mitigate the impacts of such, and less interesting as a target of terrorists.
Finding 6.2 In many cases, increased performance of the electric power system may be achieved through stronger ERO reliability criteria and additional controls such as special protection systems. For example, the ERO and FERC could require NERC Category C performance for the common N-2 event of a short circuit on a line with line outage, and with simultaneous outage of a parallel line or line with common termination because of protective relay misoperation. Meeting this requirement would improve system robustness and help protect against terrorist actions on lines on the same right-of-way. As an example of new operating procedures, the DHS red-alert condition could require more conservative system operation similar to storm-watch procedures.
Finding 6.3 The robustness and resilience of power systems can be significantly improved by prioritized modernization of power plant and transmission infrastructure and deployment of technological advancements. Many power plant and substation enhancements can be rapidly implemented at low cost compared to the construction of new transmission lines. Potential upgrades include modern circuit protection systems, communications, generator excitation equipment, and shunt capacitor banks to increase generator reactive power reserve.
Finding 6.4 The control center is the nerve center of the power system, and its resiliency is extremely important. The computer hardware and software in the EMS should be designed to withstand failures and to degrade gracefully when necessary. The control center as a whole must be protected from physical as well as cyber attacks, and a backup control center should be available. Adjacent control centers (e.g., PJM Interconnection and Midwest Independent Transmission System Operator [MISO]) should partially back each other up.
Finding 6.5 Much greater and more uniform use should be made of simulators during the training of electric power system operators.
Finding 6.6 Undesirable and unnecessary operations of protective relays during power system disturbances have contributed to many cascading power failures. These relays are intended to detect short circuits or other specific conditions in a protection zone, but can operate inappropriately during other conditions such as overload and/or voltage sag. While commendable industry-wide improvements were implemented following the August 14, 2003, blackout, continual vigilance and careful design are required. Coordination among various control and protection devices is essential to system reliability.
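The N-2 events discussed in Finding 6.2 can be illustrated with a simple topological screen. The Python sketch below removes every pair of lines from a small, made-up network and reports the pairs whose loss would cut some bus off from all generation. This is only a connectivity check under stated assumptions: it ignores line ratings, voltage, and stability, which a real contingency study would evaluate with power flow and dynamic simulation, and the bus names and function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def energized_buses(lines, sources):
    """Return the set of buses reachable from any source bus over the given lines."""
    adjacency = {}
    for a, b in lines:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        bus = queue.popleft()
        for neighbor in adjacency.get(bus, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def n_minus_2_screen(lines, sources):
    """List every pair of line outages that strands at least one bus.

    lines: list of (bus, bus) tuples; parallel circuits appear as repeated entries.
    sources: buses with generation.
    Returns a list of (line_1, line_2, stranded_buses).
    """
    all_buses = {bus for line in lines for bus in line} | set(sources)
    risky = []
    for i, j in combinations(range(len(lines)), 2):
        remaining = [ln for k, ln in enumerate(lines) if k not in (i, j)]
        stranded = all_buses - energized_buses(remaining, sources)
        if stranded:
            risky.append((lines[i], lines[j], sorted(stranded)))
    return risky


if __name__ == "__main__":
    # Illustrative network: bus C is fed only by a double circuit from bus B,
    # for example two lines sharing one right-of-way.
    network = [("G", "A"), ("A", "B"), ("G", "B"), ("B", "C"), ("B", "C")]
    for first, second, stranded in n_minus_2_screen(network, sources=["G"]):
        print(first, second, "->", stranded)
```

In this example no single line outage strands any bus, yet four double outages do, including the loss of both circuits to bus C. That is the kind of exposure a common right-of-way creates.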
Findings on Transmission Research and Other Long-term Needs
Finding 6.7 The electric power transmission system should move toward large-scale use of sensors that provide a complete physical and electrical picture of the power system in real time, and appropriate control measures that could be taken automatically and rapidly or suggested to system operators. Research needed to make such a system a reality is discussed in Chapter 9. With today’s digital control and communication capabilities, there are many opportunities for application of sophisticated local, distributed, and high-level control algorithms using various techniques such as adaptive or “intelligent” control coupled with wide-area measurements and adaptive islanding.
Finding 6.8 Improved intelligent, digital relays are needed that allow for self-evaluation and remote evaluation of settings and status to ensure reliable operation.
Finding 6.9 Improved control philosophies and strategies are needed for multiple contingency events occurring in close time proximity. The proper operations of relays in response to changing conditions, when taken as a whole, can create unrecoverable instability in the power system.
Finding 6.10 Consideration should be given to redesigning some critical substations using buswork in pipes insulated with SF6 with switchgear incorporated in the gas-insulated equipment. This approach allows more compact substation design, and the critical facility could then be relocated indoors or underground to provide more security against attacks.
Finding 6.11 As advanced storage technologies become available, strategies should be explored to use them to increase the performance and the resiliency of power systems.
Findings on the Distribution System
Finding 6.12 Being able to reduce load, and to focus on serving critical customers, can make the power delivery system far more robust in the face of natural disruption or terrorist attack. In many distribution systems, it is currently difficult or impossible to serve only a subset of customers on a distribution feeder. However, the technology is readily available to facilitate such selective service through distribution automation and intelligent load shedding.
Finding 6.13 Distribution systems operating at close to their design limits or systems operating with degraded equipment fail more easily and make restoration of service more difficult. State regulators should require distribution companies to assess the status of their systems and, where appropriate, require the installation of systems that monitor and diagnose the health and robustness of distribution, and support condition-based maintenance and repair. Systems that are operating with adequate capacity margins, and with all apparatus in good condition, are clearly more robust in the face of attacks or outages.
Finding 6.14 Greater use of automated distribution and load-shedding management holds the potential to reduce the vulnerability of the existing power system. Increased deployment of distributed generation and planning for the use of these facilities in the event of contingencies could greatly reduce the impact of an extended outage. Most of the needed technology for these concepts already exists.
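The selective service and intelligent load shedding described in Findings 6.12 and 6.14 amount to deciding which loads to keep energized when supply is limited. The Python sketch below shows one simple way to do that, a greedy pass in priority order. It is an assumption-laden illustration rather than an operating procedure: the priority numbers, load names, and demands are invented, and a real scheme would also respect feeder topology and switching constraints.

```python
def select_loads_to_serve(loads, available_mw):
    """Choose which loads to serve when capacity is limited.

    loads: list of (name, demand_mw, priority); a lower priority number means
           a more critical customer (hypothetical ranking).
    available_mw: capacity that can currently be delivered.

    Greedy rule: walk the loads from most to least critical (smaller loads
    first within a priority class) and serve each one that still fits.
    Returns (served_names, shed_names).
    """
    served, shed = [], []
    remaining = available_mw
    for name, demand_mw, _priority in sorted(loads, key=lambda ld: (ld[2], ld[1])):
        if demand_mw <= remaining:
            served.append(name)
            remaining -= demand_mw
        else:
            shed.append(name)
    return served, shed


if __name__ == "__main__":
    feeder_loads = [
        ("hospital", 4.0, 1),
        ("water_plant", 3.0, 1),
        ("police", 1.0, 2),
        ("mall", 6.0, 3),
        ("residential_block", 5.0, 3),
    ]
    served, shed = select_loads_to_serve(feeder_loads, available_mw=10.0)
    print("served:", served)
    print("shed:", shed)
```

A greedy pass is easy to explain and audit, which matters when the ranking of critical customers has been agreed on in advance with local public and private groups.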
Recommendation 6.1 The electric reliability organization (ERO) should require power companies to reexamine their critical substations to identify serious vulnerabilities to terrorist attack. Where such vulnerabilities are discovered, physical and cyber protection should be applied. In addition, the design of these substations should be modified with the goal of making them more flexible to allow for efficient reconfiguration in the event of a malicious attack on the power system. The bus configurations in these substations could have a significant impact on maintaining reliability in the event of a malicious attack on the power system. Bus layout or configuration could be a significant factor if a transformer, circuit breaker, instrument transformer, or bus work is blown up, possibly damaging nearby equipment.
Recommendation 6.2 The ERO and FERC should direct greater attention to vulnerability to multiple outages (e.g., N-2) planned by an intelligent adversary. In cases where major, long-term outages are possible, reinforcements should be considered as long as costs are commensurate with the reduction of vulnerability and other possible benefits.
Recommendation 6.3 The ERO and FERC should develop best practices and standards for improving system-wide instrumentation and the capability for near-real-time state estimation and security assessment, since otherwise operators are at a disadvantage in trying to understand and manage system disruptions as they unfold. System operators should be able to observe what is going on well beyond their own borders whenever necessary. Reliability coordinators can oversee larger areas, possibly comprising several balancing authorities, but new entities should be established to oversee the whole of the Western and Eastern Interconnections.
Recommendation 6.4 Local load-serving entities should work with local private and public sector groups to identify critical customers and plan a series of technical and organizational arrangements that can facilitate restricted service to critical customers during times of system stress. DHS could accelerate this process by initiating and partially funding a few local and regional demonstrations that could provide examples of best practice for other regions across the country.
Anderson, K.L., D. Furey, and K. Omar. 2006. “Frayed Wires: U.S. Transmission System Shows Its Age.” Available at www.fitchratings.com. Summary available at http://tdworld.com/news/fitch-electric-transmission-report/. Accessed August 2007.
Ault, G.W., A. Cruden, and J.R. McDonald. 2000. “Specification and Testing of a Comprehensive Strategic Analysis Framework for Distributed Generation.” Pp. 1817–1822 in Proceedings of the 2000 IEEE Summer Power Engineering Society Meeting, Vol. 3. New York: IEEE.
Ault, G.W., J.R. McDonald, and G.M. Burt. 2003. “Strategic Analysis Framework for Evaluating Distributed Generation and Utility Strategies.” IEEE Proceedings—Generation, Transmission and Distribution 150(4): 475–481.
Barker, P.P., and R.W. De Mello. 2000. “Determining the Impact of Distributed Generation on Power Systems. I. Radial Distribution Systems.” Pp. 1645–1656 in Proceedings of the 2000 IEEE Power Engineering Society Summer Meeting, Vol. 3. New York: IEEE.
Benner, C.L., and B.D. Russell. 2004. “Investigation of Incipient Conditions Leading to the Failure of Distribution System Apparatus.” Pp. 703–708 in Proceedings of the IEEE PES Power Systems Conference and Exposition, Vol. 2. New York: IEEE.
Blumsack, S.A. 2006. Network Topologies and Transmission Investment Under Electric-Industry Restructuring. Ph.D. Thesis, Department of Engineering and Public Policy, Carnegie Mellon University, Pittsburgh, Pa.
Caire, R., N. Retiere, S. Martino, C. Andrieu, and N. Hadjsaid. 2002. “Impact Assessment of LV Distributed Generation on MV Distribution Network.” Pp. 1423–1428 in Proceedings of the 2002 IEEE Power Engineering Society Summer Meeting, Vol. 3. New York: IEEE.
Clark, H.K. 2004. “It’s Time to Challenge Conventional Wisdom.” Transmission & Distribution World. October 1. Available at http://tdworld.com/mag/power_time_challenge_conventional/index.html. Accessed August 2007.
Daly, P.A., and J. Morrison. 2001. “Understanding the Potential Benefits of Distributed Generation on Power Delivery Systems.” Pp. A2/1-A213 in Proceedings of the Rural Electric Power Conference. New York: IEEE.
Daley, J.M., and R.L. Siciliano. 2003a. “Application of Emergency and Standby Generation for Distributed Generation. I. Concepts and Hypotheses.” IEEE Transactions on Industry Applications 39(4): 1214–1225.
Daley, J.M., and R.L. Siciliano. 2003b. “Application of Emergency and Standby Generation for Distributed Generation. II. Experimental Evaluations.” IEEE Transactions on Industry Applications 39(4): 1226–1233.
Davis, W.K., and R.P. Stratford. 1988. “Operation of UPS on Emergency Generation.” Pp. 11–14 in Proceedings of the Industrial and Commercial Power Systems Technical Conference. Piscataway, N.J.: IEEE.
Dugan, R.C., T.E. McDermott, and G.J. Ball. 2001. “Planning for Distributed Generation.” IEEE Industry Applications Magazine 7(2): 80–88.
EPRI (Electric Power Research Institute). 2005. Distribution Fault Anticipator: Phase II Algorithm Development and Second-Year Data Collection. Final report prepared for the Electric Power Research Institute. Publication 1010662. Palo Alto, Calif.: EPRI. November, 58 pp.
Golshan, M.E.H., and S.A. Arefifar. 2006. “Distributed Generation, Reactive Sources and Network-Configuration Planning for Power and Energy-Loss Reduction.” IEEE Proceedings—Generation, Transmission and Distribution 153(2): 127–136.
Hsu, S.-M., H.J. Holley, W.M. Smith, and D.G. Piatt. 2000. “Voltage Profile Improvement Project at Alabama Power Company: A Case Study.” Pp. 2039–2044 in Proceedings of the 2000 IEEE Power Engineering Society Summer Meeting, Vol. 4. New York: IEEE.
IEEE (Institute of Electrical and Electronics Engineers). 1987. IEEE Recommended Practice for Emergency and Standby Power Systems for Industrial and Commercial Applications. Std. 446. Piscataway, N.J.: IEEE.
Nedwick, P., A.F. Mistr Jr., and E.B. Croasdale. 1995. “Reactive Management: A Key to Survival in the 1990s.” IEEE Transactions on Power Systems 10(2): 1036–1043.
NERC (North American Electric Reliability Council). 2006a. Reliability Standards. Available at http://standards.nerc.net/. Accessed August 2007.
NERC. 2006b. Operating Manual. Available at http://www.nerc.com/~oc/operatingmanual.html. Accessed August 2007.
Taylor, C.W. 2001. “Power System Stability Controls.” Chapter 11.6 in The Electric Power Engineering Handbook. Boca Raton, Fla.: CRC Press/IEEE Press.
U.S.–Canada Power System Outage Task Force. 2004. Final Report on the August 14, 2003, Blackout in the United States and Canada: Causes and Recommendations. Natural Resources Canada and the U.S. Department of Energy. April. Available at http://www2.nrcan.gc.ca/es/erb/erb/english/View.asp?x=690oid=1221.
DOE/FERC (U.S. Department of Energy and Federal Energy Regulatory Commission). 2006. Steps to Establish a Real-time Transmission Monitoring System for Transmission Owners and Operators within the Eastern and Western Interconnection—A Report to Congress Pursuant to Section 1839 of the Energy Policy Act of 2005. February.
Yang, B., V. Vittal, and G.T. Heydt. 2006. “Slow-Coherency-Based Controlled Islanding—A Demonstration of the Approach on the August 14, 2003 Blackout Scenario.” IEEE Transactions on Power Systems 21(4): 1840–1847.
Zerriffi, H. 2004. Electric Power Systems Under Stress: An Evaluation of Centralized Versus Distributed System Architectures. Ph.D. Thesis, Department of Engineering and Public Policy, Carnegie Mellon University, Pittsburgh, Pa.