Trust in Cyberspace
Committee on Information Systems Trustworthiness, National Research Council (1999) 352 pages   6 x 9
    









2

Public Telephone Network and Internet Trustworthiness

The public telephone network (PTN) and the Internet are both large NISs. Studying their trustworthiness thus gives insight into the technical problems associated with supporting trustworthiness in an NIS. Identifying the vulnerabilities in these networks is also valuable—any NIS is likely to employ one or both of these networks for its communication and could inherit those vulnerabilities.

In some ways, the Internet and PTN are very similar. No single entity owns, manages, or can even have a complete picture of either.

• The PTN in the United States comprises five distinct regional Bell operating companies and a large number of independent local telephone companies, all interconnected by long-distance providers.1

• The U.S. portion of the Internet consists of a few major Internet service providers (ISPs) along with a much larger number of local or regional network providers, sometimes referred to as downstream service providers (DSPs). The ISPs are interconnected, either by direct links or by using network access points distributed around the country.

• Both networks involve large numbers of subsystems operated by different organizations. The number and intricate nature of the interfaces that exist at the boundaries of these subsystems are one source of complexity for these networks. The increasing popularity of advanced services is a second source.


1Additional consolidation among the regional operating companies remains a real possibility; at the same time, pressure for competition in the local telephone market will probably increase the number of major players.


The vulnerabilities of the PTN and Internet are exacerbated by the dependence of each network on the other. Much of the Internet uses leased telephone lines as its physical transport medium. Conversely, telephone companies rely on networked computers to manage their own facilities, increasingly employing Internet technology, although not necessarily the Internet itself. Thus, vulnerabilities in the PTN can affect the Internet, and vulnerabilities in Internet technology can affect the telephone network.

This chapter, a study of vulnerabilities in the PTN and the Internet, has three parts. The first discusses the design and operation of both networks. The second examines environmental disruption, operational errors, hardware and software design and implementation errors, and malicious attacks as they apply to the networks. Finally, the chapter concludes by analyzing two emerging issues: Internet telephony and the expanding use of the Internet by business.

Network Design

The Public Telephone Network

Network Services and Design

The PTN has evolved considerably over the past decades. It is no longer simply a network comprising a set of linked telephone switches, many of which are connected by copper wires to each and every telephone instrument in the country. There are now many telephone companies that provide advanced services, such as toll-free numbers, call forwarding, network-based programmable call distribution, conference calling, and message delivery. The result is a network that is perhaps more flexible and responsive to customer needs but also more complex. The flexibility and complexity are sources of vulnerability.

Some of the advanced services also have intrinsic vulnerabilities. With call forwarding, for example, a caller can unknowingly reach a number different from the one dialed. Consequently, a caller can no longer make assumptions about what number a call will reach, and the recipient no longer knows what number a caller is intending to reach. Havoc could result if an attacker modified the telephone network's database of forwarding destinations.2 As a second example, with network-based programmable call distribution, a voice menu greets callers and allows a company to direct its incoming calls according to capabilities in different offices, time zones, and so on. The menus and distribution criteria can be modified directly by the company and uploaded into a telephone network database. But, as with call forwarding, a database that can be modified by telephone network customers constitutes a potential vulnerability.

2In one recent case, a plumber call forwarded his competitor's telephone number to his own, thereby gaining the callers' business without their knowledge of the deception. Call forwarding could also subvert the purpose of dial-back modems used for security. Here, the presumption is that only authorized users have access to certain telephone numbers. When such users try to log in, the site calls them back. But the system has no way of knowing whether the person who answers the callback is really the authorized user, and call forwarding could cause the callback to be redirected.

The telephone network is made up of many different kinds of equipment that can be divided roughly into three major categories: signaling, transmission, and operations. Signaling equipment is used to set up and tear down calls. This category also includes databases and adjunct processors used for number translation and call routing. Transmission equipment carries the actual conversations. Operations equipment, including the operations support system (OSS), is used for provisioning, database updates, maintenance, billing, and the like.

All communication between modern central-office switches takes place over a dedicated data network using protocols, such as Signaling System 7 (SS7), which the switches use to set up calls, establish who pays for the call, return busy signals, and so on. Such out-of-band signaling helps prevent fraud (such as the deceptions of the 1960s and 1970s made possible by the infamous "blue boxes," which sent network control tones over the voice path) and helps conserve resources (i.e., no voice path need ever be allocated if the target number is busy). However, out-of-band signaling does introduce new vulnerabilities.3 Failure of the signaling path can prevent completion of a call, even if there is an available route for the call itself.

3SS7 messages are carried over a mix of private and public X.25 (data) networks, providing out-of-band signaling. However, such networks, especially public ones, are subject to various forms of attacks. There is even a curious semicircularity here, since the X.25 interswitch trunks usually are provisioned from telephone company long-distance circuits, although not from the switched circuits that SS7 manages. Owing to deregulation designed to foster competition, telephone companies must allow essentially anyone to connect into SS7 networks for a modest fee ($10,000). SS7 is a system that was designed for use by a closed community, and thus embodies minimal security safeguards. It is now employed by a much larger community, which makes the PTN subject to a broad range of "insider" attacks.

Authentication

Authentication is a key part of any scheme for preventing unauthorized activity. In a network containing programmable elements, authentication is an essential ingredient for protecting those elements from performing actions illicitly requested by attackers. Specifically, in the PTN, the OSSs must be able to authenticate requests in order to control changes in the configuration of the elements constituting the network. In addition, authentication is required to support certain advanced services, such as caller ID.4 To protect caller ID from subversion, all elements in the path from the caller to the recipient must be authenticated.

The need for authentication by OSSs is growing because interconnections among previously isolated networks have increased the risk of external intrusions. As the PTN's management networks convert to the Transmission Control Protocol/Internet Protocol (TCP/IP) and are connected to other TCP/IP-based networks, ignoring authentication may prove disastrous. Historically, proprietary protocols and dedicated networks were used for the network's management, so knowledge of these was restricted to insiders, and there was little need for authentication or authorization of requests.
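To make the idea of authenticating configuration requests concrete, the following is a minimal sketch of a challenge-response exchange. It does not describe how any actual OSS or network element works: the class `ManagedElement`, the function `sign`, the pre-shared key, and the message layout are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, nonce: bytes, name: str, value: str) -> bytes:
    """Compute a keyed hash over the nonce and the requested change."""
    encoded_name = name.encode()
    # Length-prefix the name so (name, value) pairs cannot be confused.
    message = nonce + len(encoded_name).to_bytes(2, "big") + encoded_name + value.encode()
    return hmac.new(key, message, hashlib.sha256).digest()


class ManagedElement:
    """Hypothetical network element that accepts configuration changes."""

    def __init__(self, shared_key: bytes):
        self._key = shared_key   # assumed to be provisioned out of band
        self._pending = set()    # nonces issued but not yet used
        self.config = {}

    def challenge(self) -> bytes:
        """Issue a fresh random nonce that the requester must sign."""
        nonce = secrets.token_bytes(16)
        self._pending.add(nonce)
        return nonce

    def apply(self, nonce: bytes, name: str, value: str, tag: bytes) -> bool:
        """Apply the change only if the tag authenticates this exact request."""
        if nonce not in self._pending:
            return False         # unknown or already used nonce (a replay)
        self._pending.discard(nonce)
        if not hmac.compare_digest(sign(self._key, nonce, name, value), tag):
            return False         # requester does not hold the shared key
        self.config[name] = value
        return True
```

A requester that knows the key obtains a nonce from `challenge()`, computes `sign(key, nonce, name, value)`, and submits both to `apply()`. A request replayed with an old nonce, or signed with the wrong key, is rejected. The sketch covers authentication only; deciding which authenticated requesters may change which parameters (authorization) would be a separate check.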

The Internet

Network Services and Design

The Internet, a successor to the ARPANET (McQuillan and Walden, 1977), is a worldwide packet-switched computer-communications network. It interconnects two types of processors: hosts and routers. Hosts are the source and destination for all communications; routers5 forward packets received on one communications line to another and thereby implement a communication. A shared set of protocols and service architecture was designed to provide support for various forms of robust communication (e.g., e-mail, remote terminal access, file transfer, the World Wide Web) despite outages and congestion. Little design effort was devoted to resisting attacks, although subsequent Department of Defense research has addressed that problem. And the designers elected to eschew service guarantees in favor of providing service on a "best effort" basis. For example, the Internet Protocol (IP), a datagram service used extensively by the Internet, does not guarantee delivery and can deliver duplicates of messages.6

4Caller ID is an advanced service that identifies the originator of a telephone call to a suitably equipped receiver. As this service becomes more pervasive, it will be used more and more for identification and authentication by systems employing the telephone network for communications. Here, then, is a vulnerability that can propagate from a communications fabric into an NIS that is built on top of that fabric.

5Routers sometimes act as hosts for purposes of network management and exchanging routing protocol messages.

6ISPs are now beginning to offer quality of service features (e.g., using RSVP), so the best-efforts notion of IP service may change over the next few years.
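Because a best-effort datagram service may lose or duplicate messages, the communicating hosts must cope with both. The sketch below shows one simple way a receiver could do so, assuming (hypothetically) that the sender numbers its datagrams consecutively from zero; the function name `reassemble` and the data layout are illustrative only and are not taken from any Internet protocol specification.

```python
def reassemble(datagrams):
    """Return (payloads in sequence order, missing sequence numbers).

    `datagrams` is an iterable of (sequence_number, payload) pairs in
    arrival order; duplicates and gaps are both allowed.
    """
    seen = {}
    for seq, payload in datagrams:
        seen.setdefault(seq, payload)   # ignore duplicates, keep the first copy
    if not seen:
        return [], []
    highest = max(seen)
    missing = [s for s in range(highest + 1) if s not in seen]
    ordered = [seen[s] for s in sorted(seen)]
    return ordered, missing


# Datagram 2 arrives twice and datagram 3 never arrives.
arrivals = [(0, "a"), (2, "c"), (2, "c"), (1, "b"), (4, "e")]
payloads, lost = reassemble(arrivals)
```

In this run `payloads` is `["a", "b", "c", "e"]` and `lost` is `[3]`. A reliable transport built on top of such a service would go on to request retransmission of the missing datagram.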


The Internet's protocols have proven remarkably tolerant to changes in the size of the network and to decades of order-of-magnitude improvements in communications bandwidth, communications speed, and processor capacity. By electing to provide "best effort" services, the Internet's designers made it easier for their protocols to tolerate outages of hosts, routers, and communications lines. Selecting the weaker service model also simplified dealing with router memory and processing capacity limitations. The Internet protocols were designed to operate over a range of network technologies being explored by the military in the 1970s, from 56-kbps ARPANET trunks to 10-Mbps Ethernets and a mix of satellite and low-speed tactical packet radio networks. Despite two decades of network technology evolution, these protocols perform relatively well in today's Internet, which has a backbone and other communications lines that are far faster.

Routing protocols in the Internet implement network-topology discovery, calculation of shortest routes, and recovery (i.e., alternate route selection) from link and router outages. Initially, all of the Internet's routers were owned and operated by a single entity, making it reasonable to assume that all routers were executing compatible protocols and none would behave maliciously. But as the Internet matured, ownership and control of the routers became dispersed. More robust but less cooperative routing protocols were developed, thereby limiting the Internet's vulnerability to malicious and faulty routers. The Exterior Gateway Protocol (Mills, 1984) was originally employed for communication with routers outside an originating domain; today, the Border Gateway Protocol (BGP) (Rekhter and Li, 1995; Rekhter and Gross, 1995; Traina, 1993, 1995) is used.

A routing protocol must resolve the tension between (1) performance gains possible given information about the far reaches of the network and (2) increased vulnerability that such dependence can bring. By trusting information received from other domains, a router can calculate near-optimal routes, but such routes are useless if based on inaccurate information provided by malicious or malfunctioning routers. Conversely, restricting the information that routers share allows routing tables to be smaller, hence cheaper to compute, but sacrifices control over route quality. Today's Internet routing protocols generally favor cost over route quality, but ISPs override this bias toward minimum hop routes in the context of interdomain routing.7
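The risk of trusting advertised routing information can be illustrated with a small shortest-path calculation. The sketch below is not BGP or any deployed routing protocol; it simply runs Dijkstra's algorithm over advertised link costs. The four-node topology, the router names, and the costs are invented for illustration.

```python
import heapq


def shortest_path(links, source, target):
    """Dijkstra's algorithm over advertised costs.

    `links` maps an undirected link (a, b) to its advertised cost.
    Returns the list of nodes on the cheapest path, or None.
    """
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    best = {source: 0}
    previous = {}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue                      # stale queue entry
        for neighbor, cost in graph.get(node, []):
            candidate = dist + cost
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


# Honest advertisements: the route through B is cheapest.
honest = {("A", "B"): 2, ("B", "D"): 2, ("A", "M"): 1, ("M", "D"): 9}
# Router M falsely advertises a cheap link to D and attracts the traffic.
forged = dict(honest)
forged[("M", "D")] = 1

honest_route = shortest_path(honest, "A", "D")
forged_route = shortest_path(forged, "A", "D")
```

With honest costs the route from A to D is A, B, D. After the single false advertisement it becomes A, M, D, so traffic now crosses the misbehaving router even though the calculation itself was carried out correctly.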

7ISPs use the local policy feature of the Border Gateway Protocol (BGP) to favor routes that might not be selected by BGP on a minimum-hop basis. This is necessary to balance traffic loads and to reduce vulnerability to configuration errors, or malicious attacks, on BGP.

Communication in the Internet depends not only on the calculation of routing tables but also on the operation of the Domain Name System (DNS) (Mockapetris, 1987a,b). The most important function of this service is to map host names, such as <www.nas.edu>, into numeric IP addresses. DNS also maps IP addresses into host names, defines inbound mail gateways, and so on. The name space implemented by DNS is tree structured. The top level has a handful of generic names (.COM, .NET, .GOV, and the like)8 as well as two-letter names corresponding to International Organization for Standardization (ISO) country codes (.US, .UK, .DE, .RU, and so forth). Definitive information for each level of the tree is maintained by a single master server; additional servers for a domain copy their information from it. Subtrees of the name space can be (and generally are) delegated to other servers. For example, .COM and .NET currently happen to reside on the same servers as the root of the name space; .US, though, is delegated. Individual sites or machines may cache recently retrieved DNS records; the intended lifetime of such cache entries is controlled by the source of the cached records.
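The combination of delegation and caching described above can be sketched in a few lines. This is a toy model and not the DNS wire protocol: the server names, the zone contents, the lifetime of 300 seconds, and the address 192.0.2.10 (taken from a range reserved for documentation) are all assumptions made for illustration.

```python
# Hypothetical servers: each holds some records and delegates some subtrees.
SERVERS = {
    "root": {"records": {}, "delegations": {"edu": "edu-server"}},
    "edu-server": {"records": {}, "delegations": {"nas.edu": "nas-server"}},
    "nas-server": {
        "records": {"www.nas.edu": ("192.0.2.10", 300)},  # (address, lifetime)
        "delegations": {},
    },
}


class CachingResolver:
    def __init__(self, servers, root="root"):
        self.servers = servers
        self.root = root
        self.cache = {}     # name -> (address, expiry time)
        self.queries = 0    # server lookups performed, for illustration

    def resolve(self, name, now):
        cached = self.cache.get(name)
        if cached and cached[1] > now:
            return cached[0]            # still within its intended lifetime
        server = self.root
        while True:
            self.queries += 1
            zone = self.servers[server]
            if name in zone["records"]:
                address, lifetime = zone["records"][name]
                # The lifetime is chosen by the source of the record.
                self.cache[name] = (address, now + lifetime)
                return address
            # Follow the most specific delegation that covers the name.
            matches = [suffix for suffix in zone["delegations"]
                       if name == suffix or name.endswith("." + suffix)]
            if not matches:
                return None
            server = zone["delegations"][max(matches, key=len)]
```

Resolving `www.nas.edu` for the first time walks from the root through two delegations (three lookups). A second request within the record's lifetime is answered from the cache with no further lookups, and a request after the lifetime has expired repeats the walk. The sketch also makes the dependency visible: every uncached lookup begins at the root.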

Network management tasks in the Internet are implemented using the Simple Network Management Protocol (SNMP) (Case et al., 1990). SNMP itself is quite elementary: it merely uses the User Datagram Protocol (UDP) to read and alter predefined parameters. These parameters, called management information bases (MIBs), are organized in a tree structure with branches representing MIB type, protocol structure, device type, and vendor. The hard task in managing a network is not the mechanics of changing values of parameters; it is knowing what MIB variables to set in order to effect some desired change in network behavior. SNMP provides no assistance here. Most of the deployed implementations of SNMP also lack good security features, so the protocol has been used primarily to retrieve data from MIBs in managed devices, not to make changes to these MIBs. Instead, Telnet, a protocol that can be used with a variety of user authentication technologies, is often used for modification of MIBs. The latest version (3) of SNMP promises to overcome these security limitations.

Perhaps the most visible Internet service is the World Wide Web.9 The Web is implemented by servers that communicate with Web browsers (clients) using the Hypertext Transfer Protocol (HTTP) (Berners-Lee et al., 1996) to retrieve documents represented in Hypertext Markup Language (HTML) (Berners-Lee and Connolly, 1995). HTML documents contain data (text, images, audio, video, and so on), as well as uniform resource locators (URLs) (Berners-Lee et al., 1994) to reference other HTML documents. An HTML document can be a file stored by a Web server or the output from a program, known as a common gateway interface (CGI) script, run by the Web server in response to a client request. CGI scripts, although not necessarily installed or managed by system administrators, are basically network servers accessible to Internet users. Bugs, therefore, can be a source of vulnerability.

8At this time, there is an active debate over how many new top-level names to add and who should make the decisions. The outcome of this debate may change some of the details presented here; the overall structure, however, is likely to remain the same. Several of the generic top-level domain names are decidedly U.S.-centric. .MIL and .GOV are restricted to U.S. military and government organizations, and most of the entries in the .EDU domain are from the United States.

9Indeed, many think that the Web is the Internet.

HTTP treats each client request as separate and independent. Thus, information about past interactions must be stored and retrieved explicitly by the server in processing each request, usually an unnatural style of programming. The information can be stored by the client, as "cookies" (Kristol and Montulli, 1997) or as hidden fields in URLs and forms, or it can be stored by the server, or it can be stored as part of a secure socket layer10 (SSL) session index (if the HTTP session is being cryptographically protected). Observe that with the first two schemes (cookies and hidden fields), the server's state becomes visible to the client, and the server must implement any security needed to protect that state.
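One way a server might protect state that it hands to the client is to attach a keyed hash, so that any modification is detected when the state comes back. The following is a minimal sketch under that assumption; the function names, the cookie layout, and the key handling are hypothetical and are not part of the cookie specification.

```python
import hashlib
import hmac


def make_cookie(key: bytes, state: str) -> str:
    """Return the state followed by a keyed hash that only the server can compute."""
    tag = hmac.new(key, state.encode(), hashlib.sha256).hexdigest()
    return state + "." + tag


def read_cookie(key: bytes, cookie: str):
    """Return the state if the tag is valid, otherwise None."""
    state, separator, tag = cookie.rpartition(".")
    if not separator:
        return None
    expected = hmac.new(key, state.encode(), hashlib.sha256).hexdigest()
    # Compare as bytes, in constant time, to avoid leaking partial matches.
    if hmac.compare_digest(expected.encode(), tag.encode()):
        return state
    return None
```

A cookie produced by `make_cookie(key, "cart=3")` is accepted by `read_cookie`, but if the client edits it to say `cart=9` while keeping the old tag, the check fails. Note that this protects integrity only: the state itself is still readable by the client, which is exactly the visibility the text warns about.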

HTTP uses TCP and makes large numbers of short-lived TCP connections (even between the same pairs of hosts). TCP, however, was designed to support comparatively long-lived connections. Web browsers thus cannot benefit from TCP's congestion-control algorithms (Stevens, 1997; Jacobson, 1988). That means that the load imposed by the Web on the Internet's routers and communications lines not only is disproportionately high but also reduces network throughput. Although HTTP 1.1 (Fielding et al., 1997) is mitigating this particular problem, it does exemplify a broader concern: Deploying an application that does not match assumptions made by the Internet's designers can have a serious global impact on Internet performance.
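A rough calculation suggests why many short connections fare poorly. The sketch below is a deliberately idealized model, not a simulation of TCP: it assumes one round trip to open each connection, a sending window that starts at one segment and doubles every round trip, no losses, no parallel connections, and no time spent on the requests themselves. The function names and the workload (ten objects of four segments each) are illustrative.

```python
def slow_start_rounds(segments: int, initial_window: int = 1) -> int:
    """Round trips needed to send `segments` when the window doubles each round."""
    rounds, window, sent = 0, initial_window, 0
    while sent < segments:
        sent += window
        window *= 2
        rounds += 1
    return rounds


def total_round_trips(objects: int, segments_each: int, persistent: bool) -> int:
    handshake = 1  # one round trip to open a connection (simplified)
    if persistent:
        # One connection carries everything, so the window keeps growing.
        return handshake + slow_start_rounds(objects * segments_each)
    # Each object pays for a new handshake and restarts with a small window.
    return objects * (handshake + slow_start_rounds(segments_each))


separate = total_round_trips(10, 4, persistent=False)
shared = total_round_trips(10, 4, persistent=True)
```

Under these assumptions the ten separate connections need 40 round trips in total, whereas a single long-lived connection needs 7, because its window never has to start over. Real behavior depends on many factors the model omits, but the direction of the effect is the one the text describes.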

For implementing a trustworthy NIS, the Internet's "best effort" service semantics is probably not good enough. Bandwidth, latency, route diversity, and other quality of service (QOS) guarantees are likely to be needed by an NIS. Efforts are under way to correct this Internet deficiency. But accommodating QOS guarantees seems to require revisiting a fundamental architectural tenet of the Internet—that intelligence and state exist only at the network's periphery. The problem is that, without adding state to routers (i.e., the "inside" of the network), the Internet's routers would lack a basis for processing some packets differently from others to enforce differing QOS guarantees.

The most ambitious scheme to provide QOS guarantees in the Internet relies on the new Resource Reservation Protocol (RSVP) (Braden et al., 1997). This protocol transmits bandwidth requests to the routers in a communications path on a hop-by-hop basis. The receiver makes a request of an adjacent router; that router, in turn, passes the request to its predecessor, and so on, until the sender is reached. (Special messages convey the proper path information to the receiver, and thence to each router.) The RSVP bandwidth requests feed the Internet's integrated services model (Shenker and Wroclawski, 1997) with parameters that include bandwidth, latency, and maximum packet size. With RSVP, bandwidth reservations in routers are not permanent. They may be relinquished explicitly or, if not periodically refreshed, they expire.

10Available on line at <http://home.netscape.com/eng/ssl3/ssl-toc.html>.

Note that RSVP reservations are not required for packets to flow. The term "soft state" has been coined for such saved information—information whose loss may impair performance but does not disrupt functional correctness (i.e., the Internet's "best effort" semantics). The use of soft state in RSVP means that changes in routings or the reboot of a router cannot cause a communications failure, and packets will continue to flow, albeit without performance guarantees. By periodically refreshing reservations, performance guarantees can be reactivated.
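The behavior of soft state can be sketched with a small table of reservations that expire unless refreshed. This is an illustration of the idea only and does not follow the RSVP message formats or timers; the class name, the 30-second lifetime, and the flow identifier are assumptions.

```python
class SoftStateRouter:
    """Toy router whose reservations disappear unless periodically refreshed."""

    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.reservations = {}          # flow identifier -> expiry time

    def refresh(self, flow, now):
        """Install a reservation, or extend one that already exists."""
        self.reservations[flow] = now + self.lifetime

    def release(self, flow):
        """Relinquish a reservation explicitly."""
        self.reservations.pop(flow, None)

    def reboot(self):
        """Lose all saved state, as a restarted router would."""
        self.reservations.clear()

    def forward(self, flow, now):
        """Packets are always forwarded; only the service level varies."""
        expiry = self.reservations.get(flow)
        if expiry is not None and expiry > now:
            return "reserved"
        self.reservations.pop(flow, None)   # discard an expired entry
        return "best-effort"


router = SoftStateRouter(lifetime=30)
router.refresh("video", now=0)
levels = [
    router.forward("video", now=10),    # within the lifetime
    router.forward("video", now=31),    # refresh was missed
]
router.refresh("video", now=40)
levels.append(router.forward("video", now=50))
router.reboot()
levels.append(router.forward("video", now=51))
```

Here `levels` ends up as `["reserved", "best-effort", "reserved", "best-effort"]`: the flow is never blocked, and the next refresh after a missed interval or a reboot restores the guarantee.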

Differentiated service, an alternative to RSVP for providing QOS in the Internet, employs bits in packet headers to indicate classes of service. Each class of service has associated service guarantees. The bits are inspected at network borders, and each network is responsible for taking appropriate measures in order to satisfy the guarantees.
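As an illustration of classifying packets by header bits, the sketch below assumes the layout standardized for differentiated services in IPv4, in which the upper six bits of the former type-of-service byte carry a codepoint. The three codepoints shown (46, 10, and 0) are commonly used values, but the class names and the mapping table are assumptions for this example; the text above does not specify them.

```python
# Assumed mapping from codepoint to a locally defined class of service.
CLASS_BY_CODEPOINT = {
    46: "expedited",   # commonly used for low-latency traffic
    10: "assured",     # one of the assured-forwarding codepoints
    0: "default",      # ordinary best-effort traffic
}


def classify(tos_byte: int) -> str:
    """Map the header byte to a class; unknown codepoints get default service."""
    codepoint = (tos_byte >> 2) & 0x3F   # upper six bits of the byte
    return CLASS_BY_CODEPOINT.get(codepoint, "default")
```

For example, `classify(0xB8)` yields `"expedited"` and `classify(0x28)` yields `"assured"`. Because the decision uses only bits carried in each packet, a router applying it needs no per-flow state, which is the main contrast with RSVP.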

Authentication (and other Security Protocols)

Concern about strong and useable authentication in the Internet is relatively new. The original Internet application protocols used plaintext passwords for authentication, a mechanism that was adequate for casual log-ins but was insufficient for more sophisticated uses of a network, especially in a local area network environment. Rather than build proper cryptographic mechanisms—which were little known in the civilian sector at that time—the developers of the early Internet software for UNIX resorted to network-based authentication for remote log-in and remote shell commands. The servers checked their clients' messages by converting the sender's IP address into a host name. User names in such messages are presumed to be authentic if the message comes from a host whose name is trusted by the server. Senders, however, can circumvent the check by misrepresenting their IP address11 (something that is more difficult with TCP).


11A number of different attacks are known. They can be accomplished in a number of ways, such as sequence number guessing (Morris, 1985) or route corruption (Bellovin, 1989). Alternatively, the attacker can target the address-to-name translation mechanism (Bellovin, 1995).

BOX 2.1

Open Systems Interconnection Network Layers

Physical link: Mechanical, electrical, and procedural interfaces to the transmission medium that convert it into a stream that appears to be free of undetected errors

Network: Routes from sender to receiver within a single network technology and deals with congestion (X.25, frame relay, and asynchronous transfer mode fall into this layer)

Internetwork: Sometimes combined with the network layer; provides routing and relay functions from the sender to the receiver and deals with congestion (Internet Protocol falls into this layer)

Transport: Responsible for end-to-end delivery of data (Transmission Control Protocol and User Datagram Protocol fall into this layer)

Session: Allows multiple transport-layer connections to be managed as a single unit; not used on the Internet

Presentation: Chooses common representations, typically application dependent, for data; rarely used on the Internet

Application: Deals with application-specific protocols

But cryptographic protocols—a sounder basis for network authentication and security—are now growing in prominence on the Internet. Link-layer encryption has been in use for many years. (See Box 2.1 for the names and descriptions of various network layers.) It is especially useful when just a few links in a network need protection. (In the latter days of the ARPANET, MILNET trunks outside the continental United States were protected by link encryptors.) Although link-layer encryption has the advantage of being completely transparent to all higher-layer devices and protocols, the scope of its protection is limited. Accordingly, attention is now being focused on network-layer encryption (see Box 2.2). Network-layer encryption requires no modification to applications, and it can be configured to protect host-to-host, host-to-network, or network-to-network traffic. Cost thus can be traded against granularity of protection.

BOX 2.2

A History of Network-level Encryption

Link-level encryption is an old idea. It first emerged in the form of Vernam's online teletype encryptor in 1917 (Kahn, 1976). Various forms were used by assorted combatants during World War II. But link encryption has a number of drawbacks, notably a very limited scope of protection. This is especially problematic for a multinode network like the ARPANET or the Internet, in which every single link must be protected and messages exist in plaintext at every intermediate hop. Encryption at this level is also a rather complex problem if the link level itself is a multiaccess network.

The military used link encryption with ARPANET technology to protect the communications lines connecting interface message processors (IMPs) in several Department of Defense packet networks. The difficulties of scaling this technology economically to some environments led to the development of the private line interface (PLI) encryptor (BBN, 1978), which operated at (for the ARPANET) the network layer. With the advent of the Internet and the presumed imminent arrival of Open Systems Interconnection (OSI) networks, it rapidly became obvious that a more flexible encryption strategy was necessary. The result was Blacker (Weissman, 1992), which sat between a host and an IMP and operated on X.25 packets. Blacker ignored Internet Protocol (IP) addresses (although these had been mapped algorithmically into X.25 addresses by the host); it did, though, look at the security labels in the IP header.

As IMPs fell out of favor as the preferred switches, a new hardware strategy was necessary. Furthermore, the National Security Agency wanted to use public-key technology, a success in the Secure Telephone Unit III (STU III) deployment, for data. Accordingly, the Secure Data Network System (SDNS) project devised a true network-layer encryption standard known as Security Protocol at Level 3 (SP3). SP3 could operate directly over X.25 networks; it also could (and generally did) operate with OSI or IP network-layer headers below it. It could handle host-to-host, host-to-network, and network-to-network encryption. Several SP3 devices, such as Caneware and the Network Encryption System (NES), were built and deployed.

This standard achieved a fundamental advance by enabling network managers or designers to trade cost for granularity of protection. The other fundamental advance in SP3 was the separation of the key-management protocol from the actual cryptographic layer. In effect, key management became just another application, tremendously simplifying the entire concept. SP3 served as the model for OSI's Network-Layer Security Protocol (NLSP), but the protocol was complicated by the need to work with both connection-oriented and connectionless network layers, and very few NLSP products were ever deployed.

Both SDNS and OSI also specified transport-level encryption protocols (SP4 and TLSP, respectively). These never caught on, and they appear to be an evolutionary dead end.

SP3 was the inspiration for swIPe (Ioannidis and Blaze, 1993), a simple host-based IP encryptor. This, in turn, gave rise to the Internet Engineering Task Force's working group on IPsec. Although IP Security (IPsec) is, in many ways, very similar to SP3, its overall model is more complete. Much more attention was paid to issues such as firewall integration, selective bypass (one need not encrypt traffic to all destinations), and so on. The initial deployment of IPsec appears to be in network-to-network mode; host-to-network mode, for telecommuters, appears to be following closely behind.

Network-layer encryption is instantiated in the Internet as the IP Security (IPsec) protocol, which is designed to run on the Internet's hosts and routers, or on hardware outboard to either.12 The initial deployment of IPsec has been in network-to-network mode. This mode allows virtual private networks to be created so that the otherwise insecure Internet can be incorporated into an existing secure network, such as a corporate network. The next phase of deployment for IPsec will most likely be the host-to-network mode, with individual hosts being laptops or home machines. That would provide a way for travelers to exploit the global reach of the Internet to access a secure corporate network.

12RFC 2401, Security Architecture for the Internet Protocol, and RFC 2411, IP Security Document Roadmap, are both forthcoming (<ftp://ftp.isi.edu/in-notes>).

It is unclear when general host-to-host IPsec will be widely deployed. Although transparent to applications, IPsec is not transparent to system administrators—the deployment of host-to-host IPsec requires outboard hardware or modifications to the host's protocol system software. Because of this impediment to deploying IPsec, the biggest use of encryption in the Internet is currently above the transport layer, as SSL embedded into popular Web browsers and servers. SSL, although quite visible to its applications, affects only those applications and not the kernel or the hardware. SSL can be deployed without supervision by a central authority, the approach used for almost all other successful elements of Internet technology.

Higher still in the protocol stack, encryption is found in fairly widespread use for the protection of electronic mail messages. In this manner, an e-mail message is protected during each Simple Mail Transfer Protocol (Postel, 1982) transfer, while spooled on intermediate mail relays, while residing in the user's mailbox, while being copied to the recipient's machine, and even in storage thereafter. However, no secure e-mail format has been both standardized by the Internet Engineering Task Force (IETF) and accepted by the community. Two formats that have gained widespread support are S/MIME (Dusse et al., 1998a,b) and PGP (Pretty Good Privacy) (Zimmermann, 1995). Both have been submitted to the IETF for review.

Findings

1. The PTN is becoming more vulnerable as network elements become dependent on complex software, as the reliance on call-translation databases and adjunct processors grows, and as individual telephone companies increasingly share facilities with the Internet.

2. As the PTN is increasingly managed by OSSs that are less proprietary in nature, information about controlling OSSs will become more widespread and OSSs will be vulnerable to larger numbers of attackers.

3. New user services, such as caller ID, are increasingly being used to provide authenticated information to customers of the PTN. However, the underlying telephone network is unable to provide this information with high assurance of authenticity.

4. The Internet is becoming more secure as its protocols are improved and as enhanced security measures are more widely deployed at higher levels of the protocol stack. However, the Internet's hosts remain vulnerable, and the Internet's protocols need further improvement.


5. The operation of the Internet depends critically on routing and name to address translation services. This list of critical services will likely expand to include directory services and public-key certificate servers, thereby adding other critical dependencies.

6. There is a tension between the capabilities and risks of routing protocols. The sharing of routing information facilitates route optimization, but such cooperation also increases the risk that malicious or malfunctioning routers can compromise routing.

Network Failures and Fixes

This section examines some causes for Internet and PTN failures. Protective measures that already exist or might be developed are also discussed. The discussion is structured around the four broad classes of vulnerabilities described in Chapter 1: environmental disruption, operational errors, hardware and software design and implementation errors, and malicious attacks.

Environmental Disruption

In this report, environmental disruption is defined to include natural phenomena, ranging from earthquakes to rodents chewing through cable insulation, as well as accidents caused by human carelessness. Environmental disruptions affect both the PTN and the Internet. However, the effects and, to some extent, the impact of different types of disruption differ across the two networks.

Link Failures

The single biggest cause of PTN outages is damage to buried cables (NRIC, 1997). And the single biggest cause of this damage is construction crews digging without proper clearance from telecommunications companies and other utilities. The phenomenon, jocularly known in the trade as "backhoe fading," is probably not amenable to a technological solution. Indeed, pursuant to the Network Reliability and Interoperability Council (NRIC) recommendation, the Federal Communications Commission (FCC) has requested legislation to address this problem.13

13Both the proposed text and the letter to Congress are available online at <http://www.fcc.gov/oet/nric>.

The impact of backhoe fading on network availability depends on the redundancy of the network. Calls can be routed around failed links, but only if other links form an equivalent path. Prior to the 1970s, most of the nation's telephone network was run by one company, AT&T. As a regulated monopoly, AT&T was free to build a network with spare capacity and geographically diverse, redundant routings. Multiple telephone companies compete in today's market, and cost pressures make it impractical for these telephone companies to build and maintain such capacious networks. Furthermore, technical innovations, such as fiber optics and wavelength division multiplexing, enable fewer physical links to carry current levels of traffic. The result is a telephone network in which failure of a single link can have serious repercussions.

One might have expected that having multiple telephone companies would contribute to increased capacity and diversity in the telephone network. It does not. Major telephone companies lease circuits from each other to lower their own costs. This practice means that backup capacity may not be available when needed. To limit outages, telephone companies have turned to newer technologies. Synchronous optical network (SONET) rings, for example, provide redundancy and switch-over at a level below the circuit layer, allowing calls to continue uninterrupted when a fiber is severed. Despite the increased robustness provided by SONET rings, the very high capacity of fiber optic cables results in a greater concentration of bandwidth over fewer paths because of economic considerations. This means that the failure, or sabotage, of a single link will likely disrupt service for many customers.

The Internet, unlike the PTN, was specifically designed to tolerate link outages. When a link outage is detected, the Internet routes packets over alternate paths. In theory, connections should continue uninterrupted. In practice, though, there may not be sufficient capacity to accommodate the additional traffic on alternate paths. The Internet's routing protocols also do not respond immediately to notifications of link outages. Having such a delay prevents routing instabilities and oscillations that would swamp routers and might otherwise arise in response to transient link outages. But these delays also mean that, although packets are not lost when a link fails, packet delivery can be delayed. In addition to the route damping noted here, there is a disturbing trend for ISPs to rely on static configuration of primary and backup routes in BGP border routers. This means that Internet routing is less dynamic than was originally envisioned. The primary motivations for this move away from less-constrained dynamic routing are a desire for increased route stability and reduced vulnerability to attacks or configuration errors by ISPs and DSPs.
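The delay described above can be illustrated with a minimal hold-down sketch. It is not taken from any particular routing protocol; the class name `HoldDown` and the `hold_time` parameter are assumptions made for illustration. A raw link-state change is acted on only after it has persisted for `hold_time` seconds, so a briefly flapping link never triggers a route change.

```python
class HoldDown:
    """Act on a link-state change only after it has persisted for hold_time seconds.

    Hypothetical illustration; real routing protocols use more elaborate timers.
    """

    def __init__(self, hold_time, initial_up=True):
        self.hold_time = hold_time
        self.advertised_up = initial_up  # state the routing process acts on
        self.pending_since = None        # time the raw state first disagreed

    def observe(self, link_up, now):
        """Feed one raw observation; return the state routing should use."""
        if link_up == self.advertised_up:
            # The disturbance went away before the timer expired: ignore it.
            self.pending_since = None
        elif self.pending_since is None:
            # First sign of a change: start the timer, but do not react yet.
            self.pending_since = now
        elif now - self.pending_since >= self.hold_time:
            # The change has persisted long enough to be believed.
            self.advertised_up = link_up
            self.pending_since = None
        return self.advertised_up
```

The cost of the stability is visible in the sketch: between the first failed observation and the expiry of the timer, traffic is still directed at a link that is actually down.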

Congestion

Congestion occurs when load exceeds capacity. Environmental disruptions cause increased loads in two ways. First, the load may come from outside the network: for example, from people checking by telephone with friends and relatives who live in the area of an earthquake. Second, the load may come from within the network, as existing load is redistributed in order to mask outages caused by the environmental disruption. In both scenarios, network elements saturate, and the consequence is an inability to deliver service, perhaps at a time when it is most needed.

The PTN is able to control congestion better than the Internet is. When a telephone switch or telephone transmission facility reaches saturation, new callers receive "reorder" (i.e., "fast" busy) signals and no further calls are accepted. This forestalls increased load and congestion. PTN operations staff can even block call attempts to a given destination at sources, thereby saving network resources from being wasted on calls that are unlikely to be completed. For example, when an earthquake occurs near San Francisco, the operations staff might decide to block almost all incoming calls to the affected area codes from throughout the entire PTN.
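The two controls just described, refusing new calls at saturation and blocking calls to a destination at the source, can be sketched as follows. This is a toy model under stated assumptions: the `Switch` class, its `trunk_capacity` parameter, and the returned strings are all hypothetical and do not correspond to any real switch interface.

```python
class Switch:
    """Toy model of PTN admission control (hypothetical, for illustration only)."""

    def __init__(self, trunk_capacity):
        self.trunk_capacity = trunk_capacity
        self.active_calls = 0
        self.blocked_area_codes = set()  # set by operations staff

    def block_destination(self, area_code):
        """Operations staff block call attempts to an affected area code."""
        self.blocked_area_codes.add(area_code)

    def offer_call(self, dest_area_code):
        """Return the outcome of a new call attempt."""
        if dest_area_code in self.blocked_area_codes:
            # Refused at the source, before any network resources are used.
            return "blocked-at-source"
        if self.active_calls >= self.trunk_capacity:
            # Saturated: the caller hears a "fast" busy signal.
            return "reorder"
        self.active_calls += 1
        return "connected"

    def release_call(self):
        """A call in progress ends and frees its circuit."""
        if self.active_calls > 0:
            self.active_calls -= 1
```

Because every call is admitted explicitly and torn down explicitly, the switch always knows its load, which is the property the Internet's routers lack.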

Congestion management in the Internet is problematic, in part, because no capabilities exist for managing traffic associated with specific users, connections, sources, or destinations, and it would be difficult to implement such capabilities. All that a simple router can do14 is discard packets when its buffers become full. To implement fairness, routers would have to store information about users and connections, something they are not built to do. Retaining such information would require large amounts of storage. Managing this storage would be difficult, because the Internet has no call-teardown messages that are visible to routers. Furthermore, the concept of a "user"—that is, an entity that originates or receives traffic—is not part of the network or transport layers of the Internet protocols.
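The behavior of a simple router, discarding packets once its buffer is full, is often called drop-tail queueing. A minimal sketch follows; the class name `DropTailQueue` and its interface are assumptions for illustration. Note that the queue keeps no record of which user or connection a packet belongs to.

```python
from collections import deque


class DropTailQueue:
    """Fixed-size FIFO buffer that discards arrivals when full (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()
        self.dropped = 0

    def enqueue(self, packet):
        """Accept the packet if there is room; otherwise drop it."""
        if len(self.buffer) >= self.capacity:
            self.dropped += 1  # no per-user or per-connection accounting
            return False
        self.buffer.append(packet)
        return True

    def dequeue(self):
        """Forward the oldest buffered packet, or return None if empty."""
        return self.buffer.popleft() if self.buffer else None
```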

14In fact, routers can transmit an ICMP (Internet Control Message Protocol) Source Quench message to advise a host of congestion, but there has never been a standard, accepted response to receipt of a Source Quench, and many hosts merely ignore such messages. In such circumstances the resources needed to construct and send the Source Quench may be wasted and may compound the problem!

Choking-back load offered by specific hosts (in analogy with PTN reorder signals) is also not an option for preventing Internet congestion, since an IP-capable host can have connections open to many destinations concurrently. Stopping all flows from the host is clearly inappropriate. More generally, avoiding congestion in the Internet is intrinsically hard because locales of congestion (i.e., routers and links) have no straightforward correspondence to the communications abstractions (i.e., connections) that end points see. This problem is particularly acute for the highly dynamic traffic flows between ISPs. Here, very high speed (e.g., OC-12) circuits are used to carry traffic between millions of destinations over short intervals, and the traffic mix can completely change over a few seconds.

Although congestion in the Internet is nominally an IP-layer phenomenon (routers have too many packets for a given link), measures for dealing successfully with congestion have resided in the TCP layer (Jacobson, 1988). Some newer algorithms work at the IP level (Floyd and Jacobson, 1993), but more research is needed, especially for defining and enforcing flexible and varied policies for congestion control. One suggestion involves retaining information about flows from which packets have been repeatedly dropped. Such flows are deemed uncooperative and, as such, are subjected to additional penalties (Floyd and Fall, 1998); cooperating flows respond to indications of congestion by slowing down their transmissions.
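One family of IP-level algorithms drops packets probabilistically before the buffer is full, based on a smoothed estimate of queue length. The sketch below is a simplified version of that idea, not a faithful reproduction of the cited algorithm: it omits the correction that spaces drops evenly, and the function names and the parameters `weight`, `min_th`, `max_th`, and `max_p` are illustrative assumptions.

```python
def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def drop_probability(avg, min_th, max_th, max_p):
    """Probability of dropping an arriving packet, given the average queue length.

    Below min_th nothing is dropped; at or above max_th everything is dropped;
    in between, the probability rises linearly from 0 to max_p.
    """
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)
```

Early, randomized drops give cooperating senders a signal to slow down before the buffer overflows; a sender that ignores the signal keeps losing packets, which is what makes uncooperative flows identifiable.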

More research is also needed to measure and understand current Internet traffic as well as expected future trends in that traffic. Some work has been done (e.g., Thompson et al., 1997), but far too little is known about usage patterns, flow characteristics, and other relevant parameters. Having such information is likely to enable better congestion control methods. However, usage patterns are dictated by the application designs and, as new applications arise and become popular, traffic characteristics change. Today, the use of the Web has changed packet sizes radically compared to a time when file transfer and e-mail were the principal applications. Even within the Web environment, when a very popular Web site arises, news of its location spreads quickly, and traffic flows shift noticeably!

Two further difficulties are associated with managing congestion in networks. First, there appears to be a tension between implementing congestion management and enforcing network security. A congestion control mechanism may need to inspect and even modify traffic being managed, but strong network security mechanisms will prohibit reading and modifying traffic en route. For example, congestion control in the Internet might be improved if IP and TCP headers were inspected and modified, but the use of IPsec will prevent such actions.

A second difficulty arises when a network comprises multiple independent but interconnected providers. In the Internet, no single party is either capable of or responsible for most end-to-end connections, and local optimizations performed by individual providers may lead to poor overall utilization of network resources or suboptimal global behavior. In the PTN, which was designed for a world with comparatively few telephone companies but in which switches can be trusted, competitive pressures are now forcing telephone companies to permit widespread interconnections between switches that may not be trustworthy. This opens telephone networks to both malicious and nonmalicious failures (NRIC, 1997).

Findings

1. Technical and market forces have reduced reserve capacity and the number of geographically diverse, redundant routings in the PTN. Failure of a single link can now have serious repercussions.

2. Current Internet routing algorithms are inadequate. They do not scale well, they require CPU (central processing unit)-intensive calculations, and they cannot implement diverse or flexible policies. Furthermore, little is known about how best to resolve the tension between the stability of routing algorithms and the delay that precedes a routing change in response to an outage.

3. A better understanding is needed of the Internet's current traffic profile and how it will evolve. In addition, fundamental research is needed into mechanisms for supporting congestion management in the Internet, especially congestion management schemes that do not conflict with enforcing network security.

4. Networks formed by interconnecting extant independent subnetworks present unique challenges for controlling congestion (because local provider optimizations may not lead to good overall behavior) and for implementing security (because trust relationships between network components are not homogeneous).

Operational Errors

"To err is human" the saying goes, and human operator errors are indeed responsible for network outages, as well as for unwittingly disabling protection mechanisms that then enable hostile attacks to succeed. Located in a network operations center (see Box 2.3), operators take actions based on their perceptions of what the network is doing and what it will do, but without direct knowledge of either. In these circumstances, the consequences of even the most carefully considered operator actions can be surprising—and devastating.

BOX 2.3

Network Operations Centers

Each public telephone network (PTN) or Internet constituent has some form of network operations center (NOC). For a small downstream service provider (DSP), the NOC may be a portion of a room in a home or office. For a local telephone company, long-distance carrier, or national-level Internet service provider (ISP), an NOC could occupy considerably more space and likely will involve substantial investments in equipment and infrastructure. A large network provider may have multiple, geographically dispersed NOCs in order to share the management load and provide backup.

The purpose of an NOC is to monitor and control the elements of a network: switches, transmission lines, access devices, and so on. Human operators monitor a variety of graphical images of network topology (physical and logical) that show the status of network elements. Ordinary computer monitors often serve as these display devices.1 A typical display could indicate which switch interfaces or switches appear to be malfunctioning, or which circuits are out of service. Some displays may even indicate which links are approaching saturation.

The displays rarely tell an operator how to solve a problem whose symptoms are being depicted. Human understanding of network operation (with help from automated tools) must be brought to bear. For example, PTN switches are configured with secondary and tertiary routes (selected through the use of offline network analysis tools) that can be used when a primary link fails or becomes saturated. And Internet routers execute algorithms to determine automatically the shortest routes to each destination. But there is also considerable manual configuration of constraints on routing, especially at the interfaces between ISPs.

Most NOC operators are trained to deal with common problems. If the operator does not know how to deal with a problem, then an operations manual usually is available for consultation. The manual is important because of the complexity of the systems and the difficulty of attracting, training, and retaining highly skilled operators to provide 24-hour, 7-day coverage in the NOC. However, operations manuals usually cover only a predetermined set of problems; combinations of failures can easily lead to symptoms and problems not covered by the manual. For problems not covered, the usual procedure is to contact an expert, who may be on call for such emergencies. In the Internet environment, the expert might be able to access the NOC (e.g., via a dial-up link) to assist in diagnosis and corrective action. (Note, though, that having facilities for remote access introduces new vulnerabilities.)

The set of controls available to NOC operators is network specific. In the PTN, there are controls for rerouting calls through switches and multiplexors, for blocking calls to a particular area code or exchange during natural disasters, and so on. In an ISP, there are controls for changing router tables and multiplexors, among other things. In both the PTN and an ISP, the NOC will have provisions for calling out physical maintenance teams when, for example, a cable breaks or a switching element fails. A telephone company often services its own equipment, but external maintenance must be ordered for the equipment of another provider; external maintenance in the Internet is common because ISPs typically rely on equipment provided by many vendors, including long-distance and local telephone companies. Consolidation in the Internet business may blur these distinctions, as most long-distance telephone companies are also major ISPs.

1Many NOCs also have one or more televisions, usually tuned to news channels such as CNN, to provide information about events such as natural disasters that may affect network traffic (e.g., earthquakes). Some events can cause disruption of service owing to equipment failures, or may create traffic surges because of breaking news (e.g., announcement of a toll-free number).

With regard to the PTN, the Network Reliability and Interoperability Council found that operational errors caused about one in every four telephone switch failures (NRIC, 1996). Mistakes by vendors, mistakes in installation and maintenance, and mistakes by system operators all contributed. For example, in 1997, an employee loading an incorrect set of translations into an SS7 processor led to a 90-minute network outage for toll-free telephone service (Perillo, 1997), and the recent outage of the AT&T frame relay network (Mills, 1998) was attributed in part to operational procedures.15

15Two independent software bugs also contributed to this frame relay network outage.

The Internet has also been a victim of operational errors, although the frequency and specific causes have not been analyzed as thoroughly as for the PTN. Examples abound, however. Perhaps the most serious incident occurred in July 1997, when a process intended to generate a major part of the DNS from a database failed. Automated mechanisms alerted operators that something was wrong, but a system administrator overrode the warning, causing the apparent deletion of most machines in that zone. There are also numerous instances of bogus information stored by misconfigured DNS servers propagating into name server caches and then confusing machines throughout the Internet. Similar problems have occurred with regard to Internet routing as well. For example, in April 1997, a small ISP claimed to be the best route to most of the Internet. Its upstream ISP believed the claim and passed it along. Routing in the Internet was then disrupted for several hours because of the traffic diverted to this small ISP.
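One safeguard an upstream provider could apply in a situation like this is to accept from a customer only announcements for address blocks the customer is known to hold. The sketch below assumes such a per-customer list exists; the function name `filter_announcements` and the prefixes used in the example are hypothetical, and the sketch is not a description of how any ISP of the period actually configured its routers.

```python
import ipaddress


def filter_announcements(announced, allowed):
    """Split announced prefixes into (accepted, rejected).

    A prefix is accepted only if it lies within one of the blocks the
    customer is known to hold. Both arguments are lists of CIDR strings.
    """
    allowed_nets = [ipaddress.ip_network(p) for p in allowed]
    accepted, rejected = [], []
    for prefix in announced:
        net = ipaddress.ip_network(prefix)
        # subnet_of() requires both networks to be the same IP version.
        ok = any(net.version == a.version and net.subnet_of(a)
                 for a in allowed_nets)
        (accepted if ok else rejected).append(prefix)
    return accepted, rejected
```

For example, a customer registered for `192.0.2.0/24` that announces `192.0.2.0/25`, `198.51.100.0/24`, and `0.0.0.0/0` would have only the first announcement accepted.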

Exactly what constitutes an operational error may depend on system capacity. A system operating with limited spare capacity can be especially sensitive to operational missteps. For example, injecting inappropriate, but not technically incorrect, routing information led to a day-long outage of the internal network of Netcom (a major ISP) in June 1996, as the sheer volume of resulting work overloaded the ISP's relatively small routers. And this incident may foreshadow problems to come: many routers in the Internet are operating near or at their memory or CPU capacity. It is unclear how well the essential infrastructure of the Internet could cope with a sudden spike in growth rates.

That operator errors are prevalent should not be a surprise. The PTN and the Internet are both complex systems. Large numbers of separate and controllable elements are involved in each, and the control parameters for these elements can affect network operation in subtle ways. Operator errors can be reduced when a system does the following:

• Presents its operators with a conceptual model that allows those operators to predict the effects of their actions and their inaction (Wickens et al., 1997; Parasuraman and Mouloua, 1996);

• Allows its operators to examine all of the system's abstractions, from the highest to the lowest level, whichever is relevant to the issue at hand.

The entire system must be designed—from the outset—with controllability and understandability as a goal. The reduction of operational errors is more than a matter of building flashy window-based interfaces. The graphics are the easy part. Moreover, with an NIS, there is the added problem of components with different management interfaces provided by multiple vendors. Rarely can the NIS developer change these components or their interfaces, which may make the support of a clean systemwide conceptual model especially difficult.

An obvious approach to reducing operational errors is simply to implement automated support and remove the human from the loop. The route-configuration aids used by PTNs are an example of such automation. More generally, better policy-based routing mechanisms and protocols will likely free human operators from low-level details associated with setting up network routes. In the Internet, ISPs currently have just one policy tool: their BGP configurations (Rekhter and Li, 1995; Rekhter and Gross, 1995; Traina, 1993, 1995). But even though BGP is a powerful hammer, the sorts of routing policies that are usually desired do not much resemble nails. Not surprisingly, getting BGP configurations right has proven to be quite difficult. Indeed, the internal network failure mentioned above was directly attributable to an error in use of the BGP policy control mechanisms.

Finally, operational errors are not only a matter of operators producing the right responses. Maintenance practices—setting up user accounts and access privileges, for example—can neutralize existing security safeguards. And poor maintenance is an oft-cited opening for launching a successful intrusion into a system. The network operations staff at the Massachusetts Institute of Technology, for example, reports that about 6 weeks after running vulnerability-scan software (e.g., COPS) on a public UNIX workstation, the workstation will again become vulnerable to intrusion as a result of misconfiguration. Managers of corporate or university networks often cite similar problems with firewall and router configuration which, if performed improperly, can lead to access control violations or denial of service.


Findings

1. Operational errors are a major source of outages for the PTN and Internet. Some of these errors would be prevented through improved operator training and contingency planning; others require that systems be designed with operator understandability and controllability as an initial design goal.

2. Improved routing management tools are needed for the Internet, because they will free human operators from an activity that is error prone.

3. Research and development is needed to devise conceptual models that will allow human operators to grasp the state of a network and understand the consequences of control that they may exert. Also, research is needed into ways in which the state of a network can be displayed to a human operator.

Software and Hardware Failures

The PTN and Internet both experience outages from errors in design and implementation of the hardware and software they employ. A survey by the NRIC (1996) found that software and hardware failures each accounted for about one-quarter of telephone switch outages. This finding is inconsistent with the commonly held belief that hardware is relatively bug free but software is notoriously buggy. A likely explanation comes from carefully considering the definition of an outage. Within telephone switches, software failures are prone to affect individual telephone calls and, therefore, might not always be counted as causing outages.

Comparable data about actual outages of Internet routers do not seem to be available. One can speculate that routers should be more reliable than telephone switches, because router hardware is generally newer and router software is much simpler. However, against that, one must ask whether routers are engineered and provisioned to the same high standards as telephone switches have been. Moreover, most failures in packet routing are comparatively transient; they are artifacts of the topology changes that routing protocols make to accommodate a failure, rather than being direct consequences of the failure itself.

One thing that is fairly clear is that the Internet's end points, including servers for such functions as the DNS, are its least robust components. These end points are generally ordinary computers running commercial operating systems and are heir to all of their attendant ills. (By contrast, telephony end points either tend to be very simple, as in the case of the ordinary telephone, or are built to telephone industry standards.) Two examples illustrate the fragility of the Internet's end points. First, many problems have been reported with BIND, the most common DNS server used on the Internet (e.g., CERT Advisories CA-98.05, April 1998, and CA-97.22, August 1997);16 some of these result in corrupted data or in DNS failures. Second, the so-called "ping of death" (CERT Advisory CA-96.26, December 1996) was capable of crashing most of the common end points on the Internet. Fortunately, Cisco routers were not vulnerable; if they had been, the entire infrastructure would have been at risk.

Even without detailed outage data, it can be instructive to compare the PTN and Internet; their designs differ in rather fundamental ways, and these differences affect how software and hardware failures are handled. The PTN is designed to have remarkably few switches, and it depends on them. That constraint makes it necessary to keep all its switches running virtually all the time. Consequently, switch hardware itself is replicated, and the switch software is tasked with detecting hardware and software errors. Upon detecting an error, the software recovers quickly without a serious outage of the switch itself. Individual calls in progress may be sacrificed, though, to restore the health of the switch.

This approach does not work for all hardware and software failures. That was forcefully illustrated by the January 1990 failure of the AT&T long-distance network. That outage was caused by a combination of hardware and software, and the interaction between them:17

The incident began when a piece of trunk equipment failed and notified a switch of the problem. Per its design, the switch took itself offline for a few seconds while it tried to reinitialize the failing equipment; it also notified its neighbors not to route calls to it. When the switch came back on-line, it started processing calls again; neighboring switches were programmed to interpret the receipt of new call setup messages as an indication that the switch had returned to service. Unfortunately, a timing bug in a new version of that process caused those neighboring switches to crash. This crash was detected and (correctly) resulted in a rapid restart—but the failure/restart process triggered the same problem in their neighbors.

16CERT advisories are available online at <http://www.cert.org>.

17Based on Cooper (1989).

The "switches" for the Internet (its routers) are also intended to be reliable, but they are not designed with the same level of redundancy or error detection as PTN switches. Rather, the Internet as a whole recovers and compensates for router (switch) failures. If a router fails, then its neighbors notice the lack of routing update messages and update their own route tables accordingly. As neighbors notify other neighbors, the failed router is dropped from possible packet routes. In the meantime, retransmissions by end points preserve ongoing conversations by causing packets that might have been lost to reenter the network and traverse these new routes.
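The first step of this recovery, noticing that a neighbor has gone silent, can be sketched with a simple timeout table. The class name `NeighborTable` and the `dead_interval` parameter are assumptions for illustration; real routing protocols differ in how they time out neighbors and in what they do next.

```python
class NeighborTable:
    """Track when each neighbor was last heard from (illustrative sketch)."""

    def __init__(self, dead_interval):
        self.dead_interval = dead_interval  # seconds of silence tolerated
        self.last_heard = {}

    def heard_from(self, neighbor, now):
        """Record receipt of a routing update from a neighbor."""
        self.last_heard[neighbor] = now

    def expire(self, now):
        """Remove and return neighbors that have been silent too long."""
        dead = [n for n, t in self.last_heard.items()
                if now - t > self.dead_interval]
        for n in dead:
            del self.last_heard[n]
        return sorted(dead)

    def live(self):
        """Neighbors still considered reachable."""
        return sorted(self.last_heard)
```

A router that finds a neighbor in the expired list would then withdraw routes through it and tell its remaining neighbors, which is how news of the failure spreads.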

Finding

Insufficient data exist about Internet outages and how the Internet's mechanisms are able to deal with them.

Malicious Attacks

Attacks on the PTN and Internet fall into two broad categories, according to the nature of the vulnerability being exploited. First, there are attacks related to authentication. This category includes everything from eavesdroppers' interception of plaintext passwords to designers' misplaced trust in the network to provide authentication. In theory, these attacks can be prevented by proper use of cryptography. The second category of attacks is harder to prevent. This category comprises attacks that exploit bugs in code. Cryptography cannot help here (Blaze, 1996), nor do other simple fixes appear likely. Software correctness (see Chapter 3) is a problem that does not seem amenable to easy solutions. Yet, as long as software does not behave as intended, attackers will have opportunities to subvert systems by exploiting unintended system behavior.

Attacks on the Telephone Network

Most attacks on the PTN perpetrate toll fraud. The cellular telephony industry provides the easiest target, with caller information being broadcast over unencrypted radio channels and thus easily intercepted (CSTB, 1997). But attacks have been launched against wireline telephone service as well. Toll fraud probably cannot be prevented altogether. Fortunately, it does not have to be, because it is easily detected with automated traffic analysis that flags abnormal patterns of calls, credit card authorizations, and other activities for investigation.
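As a rough illustration of what such traffic analysis might look like, the sketch below flags an account when its call count for the day is far above its own historical average. The function name `flag_abnormal`, the threshold multiplier `k`, and the floor `min_spread` are assumptions; fraud detection systems in actual use are certainly more sophisticated.

```python
import statistics


def flag_abnormal(history, today, k=3.0, min_spread=1.0):
    """Return accounts whose activity today is far above their own baseline.

    history: account -> list of past daily call counts
    today:   account -> call count for the current day
    An account is flagged if today's count exceeds mean + k * spread, where
    spread is the standard deviation of its history (floored at min_spread
    so that a perfectly steady account does not trigger on tiny changes).
    """
    flagged = []
    for account, counts in history.items():
        mean = statistics.mean(counts)
        spread = max(statistics.pstdev(counts), min_spread)
        if today.get(account, 0) > mean + k * spread:
            flagged.append(account)
    return sorted(flagged)
```

Flagged accounts are candidates for investigation, not proof of fraud; a legitimate change in calling habits would trip the same test.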

The NRIC (1997) reports that security incidents have not been a major problem in the PTN until recently. However, the council does warn that the threat is growing, for reasons that include interconnections (often indirect) of OSSs to the Internet, an increase in the number and skill level of attackers, and the increasing number of SS7 interconnections to new telephone companies. The report also notes that existing SS7 firewalls are neither adequate nor reliable in the face of the anticipated threat. As noted earlier, this threat has increased dramatically because of the substantially lower threshold now associated with connection into the SS7 system.

Routing Attacks. To a would-be eavesdropper, the ability to control call routing can be extremely useful. Installing wiretaps at the end points of a connection may be straightforward, but such taps are also the easiest to detect. Interoffice trunks can yield considerably more information to an eavesdropper and with a smaller risk of detection. To succeed here, the eavesdropper first must determine which trunks the target's calls will use, something that is facilitated by viewing or altering the routing tables used by the switches. Second, the eavesdropper must extract the calls of interest from all the calls traversing the trunk; access to the signaling channels can help here.

How easy is it for an eavesdropper to alter routing tables? As it turns out, apart from the usual sorts of automated algorithms, which calculate routes based on topology, failed links, or switches, the PTN does have facilities to exert manual control over routes. These facilities exist to allow improved utilization of PTN equipment. For example, there is generally a spike in business calls around 9:00 a.m. on weekdays when workers arrive in their offices. If telephone switches in, say, New York are configured to route other East Coast calls through St. Louis or points further west (where the workday has not yet started), then the 9:00 a.m. load spike can be attenuated. However, the existence of this interface for controlling call routing offers a point of entry for the eavesdropper, who can profit from exploiting that control.

Database Attacks. OSSs and the many databases they manage are employed to translate telephone numbers so that the number dialed by a subscriber is not necessarily the number that will be reached. If an attacker can compromise these databases, then various forms of abuse and deception become possible. The simplest such attack exploits network-based speed dialing, a feature that enables subscribers to enter a one- or two-digit abbreviation and have calls directed to a predefined destination. If the stored numbers are changed by an attacker, then speed-dialed calls could be routed to destinations of the attacker's choice. Beyond harassment, an attacker who can change speed dialing numbers can impersonate a destination or can redial to the intended destination while staying on the line and eavesdropping. Other advanced telephone services controlled by OSSs and databases include call forwarding, toll-free numbers, call distribution, conference calling, and message delivery. All could be affected by OSS and database vulnerabilities. In one successful attack, the database entry for the telephone number of the probation office in Delray Beach, Florida, was reconfigured. People who called the probation office when the line was busy had their calls forwarded to a telephone sex line in New York (Cooper, 1989).18
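The speed-dialing attack can be illustrated with a toy model of number translation. The sketch below is not a description of any real OSS; the table layout, the function name `translate`, and all telephone numbers are hypothetical and chosen only for illustration.

```python
# Toy model of network-based speed dialing. All names and numbers are hypothetical.
# Key: (subscriber line, dialed abbreviation) -> stored destination number.
speed_dial = {("555-0100", "2"): "555-0142"}


def translate(subscriber: str, dialed: str, table: dict) -> str:
    """Return the number the network will connect for this subscriber."""
    # Numbers with no stored abbreviation pass through unchanged.
    return table.get((subscriber, dialed), dialed)


before = translate("555-0100", "2", speed_dial)

# An attacker with write access to the database alters a single entry.
speed_dial[("555-0100", "2")] = "555-0199"
after = translate("555-0100", "2", speed_dial)

print(before, "->", after)
```

The subscriber dials the same abbreviation in both cases; nothing visible at the handset indicates that the stored destination has changed.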

Because a subscriber's chosen long-distance carrier is stored in a telephone network database, it too is vulnerable to change by attackers. Here the incentive is a financial one—namely, increased market share for a carrier. In a process that has come to be known as "slamming," customers' long-distance carriers are suddenly and unexpectedly changed. This problem has been pervasive enough that numerous procedural safeguards have been mandated by the FCC and various state regulatory bodies.

Looking to the future, more competition in the local telephone market will lead to the creation of a database that enables the routing of incoming calls to specific local telephone carriers. And, given the likely use of shared facilities in many markets, outgoing local calls will need to be checked to see what carrier is actually handling the call. In addition, growing demand for "local number portability," whereby a customer can retain a telephone number even when switching carriers, implies the existence of one more database (which would be run by a neutral party and consulted by all carriers for routing of local calls). Clearly, a successful attack on any of these databases could disrupt telephone service across a wide area.

In contrast to the Internet, the telephone system does not depend on having an automated process corresponding to the Internet's DNS translation from names to addresses.19 One does not call directory assistance before making every telephone call, and success in making a call does not depend critically on this service. Thus, in the PTN, a vulnerability of the Internet is avoided, but at the price of requiring subscribers to dial telephone numbers rather than subscriber names. Furthermore, unlike DNS, the telephone network's directory service is subject to a sanity test by its clients. If a human caller asks directory assistance for a neighbor's number and is given an area code for a town halfway across the country, the caller would probably doubt the accuracy of the number and conclude that the directory assistance service was malfunctioning. Still, tampering with directory assistance can cause telephone calls to be misdirected.

18There is even a historical precedent for such attacks. The original telephone switch was invented by an undertaker; his competitor's wife was a telephone operator who connected anyone who asked for a funeral home to her own husband's business.

19This is not strictly true; calls to certain classes of telephone numbers (e.g., 800, 888, and 900) do result in a directory lookup to translate the called number into a "real" destination telephone number. In these instances, the analogy between the PTN and the Internet is quite close.


Facilities. The nature of the telephone company physical plant leads to another class of vulnerabilities. Many central offices normally are unstaffed and, consequently, they are vulnerable to physical penetration, which may go entirely undetected. Apart from the obvious problems of intruders tampering with equipment, the documentation present in such facilities (including, of course, passwords written on scraps of yellow paper and stuck to terminals) is attractive to "phone phreaks."20 A similar vulnerability is present in less populated rural areas, which are served by so-called remote modules. These remote modules perform local switching but depend on a central office for some aspects of control. Remote modules are invariably deployed in unstaffed facilities, hence subject to physical penetration.

Findings

1. Attacks on the telephone network have, for the most part, been directed at perpetrating billing fraud. The frequency of attacks is increasing, and the potential for more disruptive attacks, with harassment and eavesdropping as goals, is growing.

2. Better protection is needed for the many number translation and other databases used in the PTN.

3. SS7 was designed for a closed community of telephone companies. Deregulation has changed the operational environment and created opportunities for insider attacks against this system, which is fundamental to the operation of the PTN.

4. Telephone companies need to enhance the firewalls between OSSs and the Internet and safeguard the physical security of their facilities.

Attacks on the Internet

The general accessibility of the Internet makes it a highly visible target that is within easy reach of attackers. The widespread availability of documentation and actual implementations for Internet protocols means that devising attacks for this system can be viewed as an intellectual puzzle (where launching the attacks validates the puzzle's solution). Internet vulnerabilities are documented extensively on CERT's Web site,21 and at least one Ph.D. thesis (Howard, 1997) is devoted to the subject.

20A phone phreak is a telephone network hacker.

21The Computer Emergency Response Team (CERT)/Coordination Center is an element of the Networked Systems Survivability Program in the Software Engineering Institute at Carnegie Mellon University. See <http://www.cert.org>.


This subsection concentrates on vulnerabilities in the Internet's infrastructure, since this is what is most relevant to NIS designers. Vulnerabilities in end systems are amply documented elsewhere. See, for example, Garfinkel and Spafford (1996).

Name Server Attacks. The Internet critically depends on the operation of the DNS. Outages or corruption of DNS root servers and other top-level DNS servers—whether owing to failure or successful attacks—can lead to denial of service. Specifically, if a top-level server cannot furnish accurate information about delegations of zones to other servers, then clients making DNS lookup requests are prevented from making progress. The client requests might go unanswered, or the server could reply in a way that causes the client to address requests to DNS server machines that cannot or do not provide the information being sought. Cache contamination is a second way to corrupt the DNS. An attacker who introduces false information into the DNS cache can intercept all traffic to a specific targeted machine (Bellovin, 1989). One highly visible example of this occurred in July 1997, when somebody used this technique to divert requests for a major Web server to his own machines (Wall Street Journal, 1997).
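Cache contamination can be sketched with a toy caching resolver. The mechanism shown, a reply that carries an extra record nobody asked for, is an assumption made for illustration; the text above does not specify how the false information is introduced. The class `CachingResolver`, its `strict` flag, and all names and addresses are hypothetical.

```python
class CachingResolver:
    """Toy DNS cache. With strict=False it stores every record in a reply."""

    def __init__(self, strict: bool):
        self.strict = strict
        self.cache = {}

    def accept_reply(self, question: str, records: dict) -> None:
        for name, address in records.items():
            if self.strict and name != question:
                continue  # drop records that were not asked for
            self.cache[name] = address

    def lookup(self, name: str):
        return self.cache.get(name)


# Reply from a server the attacker controls: one legitimate answer plus
# one unrequested record for the targeted machine.
reply = {
    "attacker.example": "203.0.113.9",
    "www.victim.example": "203.0.113.9",
}

naive = CachingResolver(strict=False)
naive.accept_reply("attacker.example", reply)

careful = CachingResolver(strict=True)
careful.accept_reply("attacker.example", reply)

print(naive.lookup("www.victim.example"), careful.lookup("www.victim.example"))
```

Once the false entry is cached, every client that relies on the naive resolver is directed to the attacker's address until the entry expires. The strict check shown here is only one narrow defense and does not authenticate the data itself.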

In principle, attacks on DNS servers are easily dealt with by extending the DNS protocols. One such set of extensions, Secure DNS, is based on public-key cryptography (Eastlake and Kaufman, 1997) and can be deployed selectively in individual zones.22 Perhaps because this solution requires the installation of new software on client machines, it has not been widely deployed. The obstacle is no longer merely the complexity of support software: the Internet has grown so large that even simple solutions, such as Secure DNS, are precluded by other operational criteria. A scheme that involved changing only the relatively small number of DNS servers would be quite attractive. But lacking that, techniques must be developed to institute changes in large-scale and heterogeneous networks.

22However, configuration management does become much harder when there is partial deployment of Secure DNS.

Routing System Attacks. Routing in the Internet is highly decentralized. This avoids the vulnerabilities associated with dependence on a small number of servers that can fail or be compromised. But it leads to other vulnerabilities. With all sites playing some role in routing, there are many more sites whose failure or compromise must be tolerated. The damage inflicted by any single site must somehow be contained, even though each site necessarily serves as the authoritative source for some aspect of routing. Decentralization is not a panacea for avoiding the vulnerabilities intrinsic in centralized services. Moreover, the trustworthiness of most NISs will, like the Internet, be critically dependent both on services that are more sensibly implemented in a centralized fashion (e.g., DNS) and on services more sensibly implemented in a decentralized way (e.g., routing). Understanding how either type of service can be made trustworthy is thus instructive.

The basis for routing in the Internet is each router periodically informing neighbors about what networks it knows how to reach. This information is direct when a router advertises the addresses of the networks to which it is directly connected. More often, though, the information is indirect, with the router relaying to neighbors what it has learned from others. Unfortunately, recipients of information from a router rarely can verify its accuracy23 because, by design, a router's knowledge about network topology is minimal. Virtually any router can represent itself as a best path to any destination as a way of intercepting, blocking, or modifying traffic to that destination (Bellovin, 1989).
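The weakness can be illustrated with a toy distance-vector update rule. This is a minimal sketch, not any deployed routing protocol; the `Router` class, the neighbor names, and the costs are hypothetical. The router simply keeps the cheapest path it has been told about, with no means of checking the claim.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    name: str
    # destination network -> (total cost, next hop)
    table: dict = field(default_factory=dict)

    def receive(self, neighbor: str, advertised: dict, link_cost: int = 1) -> None:
        """Accept a neighbor's advertisement, keeping the cheapest path seen."""
        for net, cost in advertised.items():
            total = cost + link_cost
            if net not in self.table or total < self.table[net][0]:
                self.table[net] = (total, neighbor)


r = Router("R1")

# An honest neighbor relays a route it learned from others.
r.receive("honest", {"192.0.2.0/24": 3})
honest_route = r.table["192.0.2.0/24"]

# A rogue neighbor claims to be directly attached to the same network.
r.receive("rogue", {"192.0.2.0/24": 0})
hijacked_route = r.table["192.0.2.0/24"]

print(honest_route, hijacked_route)
```

Because the rogue advertisement is cheaper, traffic for the destination is now forwarded to the rogue neighbor, which can intercept, block, or modify it.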

Most vulnerable are the interconnection points between major ISPs, where there are no grounds at all for rejecting route advertisements. Even an ISP that serves a customer's networks cannot reject an advertisement for a route to those networks via one of its competitors—many larger sites are connected to more than one ISP.24 Such multihoming becomes a mixed blessing: the need to check accuracy, which causes traffic from a subscriber network that arrives via a different path to be treated as suspect and rejected, is pitted against the increased availability that multihoming promises. Some ISPs are now installing BGP policy entries that define which parts of the Internet's address space neighbors can provide information about (with secondary route choices). However, this approach undermines the Internet's adaptive routing and affects overall survivability.
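A policy entry of this kind can be sketched as a per-neighbor list of address blocks. The sketch below is not the configuration syntax of any router; the `ALLOWED` table, the neighbor names, and the prefixes (drawn from documentation address ranges) are hypothetical. It uses Python's standard `ipaddress` module.

```python
import ipaddress

# Hypothetical policy: the address blocks each neighbor may advertise.
ALLOWED = {
    "customer-a": [ipaddress.ip_network("198.51.100.0/24")],
    "peer-b": [ipaddress.ip_network("203.0.113.0/24")],
}


def accept(neighbor: str, prefix: str) -> bool:
    """Accept an advertisement only if it falls inside the neighbor's blocks."""
    net = ipaddress.ip_network(prefix)
    return any(net.subnet_of(block) for block in ALLOWED.get(neighbor, []))


print(accept("customer-a", "198.51.100.0/25"))  # inside the customer's block
print(accept("customer-a", "203.0.113.0/24"))   # someone else's address space
print(accept("peer-b", "198.51.100.0/24"))      # backup path via another ISP
```

The last call shows the cost of the approach: a legitimate backup route to a multihomed customer through a neighbor not listed for that block is rejected along with the false ones.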

23In a few cases it actually is possible to reject inaccurate information. For example, an ISP will know what network addresses belong to its clients, and neighbors of such a router generally will believe that and start routing traffic to the ISP.

24The percentage of such multihomed sites in the Internet is currently low but appears to be rising, largely as a reliability measure by sites that cannot afford to be offline.

Somehow, the routing system must be secured against false advertisements. One approach is to authenticate messages a hop at a time. A number of such schemes have been proposed (Badger and Murphy, 1996; Hauser et al., 1997; Sirois and Kent, 1997; Smith et al., 1997), and a major router vendor (Cisco) has selected and deployed one in products. Unfortunately, the hop-at-a-time approach is limited to ensuring that an authorized peer has sent a given message; nothing ensures that the message is accurate. The peer might have received an inaccurate message (from an authorized peer) or might itself be compromised. Thus, some attacks are prevented but others remain viable.
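The limitation can be made concrete with a keyed message authentication code shared by two neighboring routers. The use of HMAC-SHA-256 here is an assumption made for illustration and is not a description of any of the cited schemes or of any vendor's product; the key and the message contents are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, update: bytes) -> bytes:
    """Compute an authentication tag over a routing update."""
    return hmac.new(key, update, hashlib.sha256).digest()


def verify(key: bytes, update: bytes, tag: bytes) -> bool:
    """Check that the update was sent by a holder of the shared key."""
    return hmac.compare_digest(sign(key, update), tag)


key = b"secret shared by two adjacent routers"

# An authorized but mistaken (or compromised) peer sends a false claim.
false_update = b"192.0.2.0/24 reachable via me at cost 0"
tag = sign(key, false_update)

accepted = verify(key, false_update, tag)          # sender is authorized
forged = verify(b"outsider key", false_update, tag)  # sender is not

print(accepted, forged)
```

The check separates authorized senders from outsiders, but it says nothing about whether the content of the update is true, which is exactly the gap described above.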

The alternative approach for securing the routing system against false advertisements is, somehow, for routers to employ global information about the Internet's topology. Advertisements that are inconsistent with that information are thus rejected. Schemes have been proposed (e.g., Perlman, 1988), but these do not appear to be practical for the Internet. Perlman's scheme, for example, requires source-controlled routing over the entire path. Routing protocol security is an active research area, and appropriately so.

Routing in the Internet is actually performed at two levels. Inside an autonomous system (AS)—a routing domain under the control of one organization—an interior routing protocol is executed by routers. Attacking these routers can affect large numbers of users, but wiretapping of these systems appears to be rare and therefore of limited concern.25 Of potentially greater concern are attacks on BGP, the protocol used to distribute routing information among the autonomous ISPs around the world. Because BGP provides the basis for all Internet connectivity, a successful attack can have wide-ranging effects. As above, it is easy to secure BGP against false advertisements on a hop-at-a-time basis and difficult to employ global information about topology. Moreover, even if false advertisements could be discarded, successful attacks against BGP routers or against the workstations used to download configuration information into the BGP routers could still have devastating effects on Internet connectivity.

25Attacks against an interior routing protocol or against an organization's routers can deny or disrupt service to all of the hosts within that AS. If the AS is operated by an ISP, then the affected population can be substantial in size. Countermeasures to protect link state intradomain routing protocols have been developed (Murphy and Hofacker, 1996) but have not been deployed, primarily because of concerns about the computational overhead associated with the signing and verification of routing traffic (specifically, link state advertisements). Countermeasures for use with distance vector algorithms (e.g., DVRP) are even less well developed, although several proposals for such countermeasures have been published recently. Because all of the routers within an AS are under the control of the same administrative entity, and because there is little evidence of active wiretapping of intra-AS links, there may be a perception that the proposed cryptographic countermeasures are too expensive relative to the protection afforded.

To secure BGP against a full range of attacks, a combination of security features involving both the routers and a supporting infrastructure needs to be developed and deployed. Each BGP router must be able to verify whether a routing update it receives is authentic and not a replay of a previous, authentic update, where an authentic routing updat