Scaling Up the Internet and Making It
More Reliable and Robust
BUILDING A BETTER INTERNET
The Internet has become a place where many live, work, and play. It
is a critical resource for many businesses that depend on e-commerce.
Indeed, when attacks are made on Internet infrastructure or commonly
used Web sites like CNN, Yahoo! and the like, they become front-page
news.1 As a consequence, the Internet must become and remain more
robust and reliable. Reflecting demand for its capabilities, the Internet is
expected to grow substantially worldwide in terms of users, devices, and
applications. A dramatic increase in the number of users and networked
devices gives rise to questions of whether the Internet's present address-
ing scheme can accommodate the demand and whether the Internet
community's proposed solution, IPv6, could, in fact, be deployed to remedy
the situation. The 1990s saw widespread deployment of telephony
and streaming audio and video. These new applications and protocols
have had significant impacts on the infrastructure, both quantitatively in
terms of a growing level of traffic and qualitatively in terms of new types
of traffic. The future is likely to see new applications that place new
demands on the Internet's robustness and scalability. In short, to meet
the potential demand for infrastructure, the Internet will have to support
1For example, Matt Richtel. 2000. "Several Web Sites Attacked Following Assault on
Yahoo." New York Times, February 9, p. A1; and Matt Richtel. 2000. "Spread of Attacks on
Web Sites Is Slowing Traffic on the Internet," New York Times, February 10, p. A1.
THE INTERNET'S COMING OF AGE
a dramatically increasing number of users and devices, meet a growing
demand for network capacity (scale), and provide greater robustness at
any scale.
SCALING
"Scaling" refers to the process of adapting to various kinds of Internet
growth, including the following:
· The increasing number of users and devices connected to the
Internet,
· The increasing volume of communications per device and total
volume of communication across the Internet, and
· The continual emergence of new applications and ways in which
users employ the Internet.
While details of the growth in Internet usage are subject to interpre-
tation and change over time, reflecting the dynamic nature of Internet
adoption, it is only the overall trends that concern us here. In the United
States, a substantial fraction of homes have access to the Internet, and
that number is likely to eventually approach the fraction of homes that
have a personal computer (a fraction that itself is still growing). Over
100 million people report that they are Internet users in the United States.2
Overseas, while the current level of Internet penetration differs widely
from country to country, many countries show rates of growth compa-
rable to or exceeding the rapid growth seen in the United States,3 so it is
reasonable to anticipate that similar growth curves will be seen in other
less-penetrated countries, shifted in time, reflecting when the early adop-
tion phase began.
Perhaps a more important future driver for overall growth is the
trend toward a growing number and variety of devices being attached to
the Internet. Some of those devices will be embedded in other kinds of
equipment or systems, and some will serve specific purposes for a given
user. This trend could change the number of devices per user from the
current number, slightly less than 1 in developed countries, to much more
than this: 10 or even 100.
2Data from Computer Industry Almanac, available online at .
3For an analysis based on OECD data, see Gonzaolo Diez-Picazo Figuera. 1999. An
Analysis of International Internet Diffusion. Masters Thesis, MIT, June, p. 83.
SCALING UP THE INTERNET AND MAKING IT MORE RELIABLE AND ROBUST
Scaling of Capacity
The basic design of the Internet, characterized by the elements dis-
cussed in Chapter 1, has proved remarkably scalable in the face of such
growth. Perhaps the most obvious component of growth is the demand
for greater speed in the communications lines that make up the Internet.
As was noted in Chapter 1, the first major scaling hurdle was seen about
a decade ago when, in response to growing demands, many of the
56-kbps lines in the NSFNET backbone were replaced with higher capac-
ity 1.5-Mbps lines (also known as T1 lines).4 Doing so required develop-
ing higher performance Internet routers and some retuning of protocols
and software. Since then, the Internet has passed many scaling hurdles
and increased its capacity many times over. The fastest lines in the
Internet were 2.5 Gbps (OC-48) in 1999, almost 50,000 times faster than
the original lines, and the deployment of 10-Gbps lines (OC-192) is under
way.
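The ratios quoted above can be checked with a few lines of arithmetic. The sketch below uses only the nominal line rates given in the text (56 kbps, 1.5 Mbps, 2.5 Gbps, and 10 Gbps); the dictionary and function names are illustrative and not part of the report.

```python
# Nominal line rates quoted in the text, in bits per second.
LINE_RATES_BPS = {
    "56-kbps line": 56e3,
    "T1 (1.5 Mbps)": 1.5e6,
    "OC-48 (2.5 Gbps)": 2.5e9,
    "OC-192 (10 Gbps)": 10e9,
}


def speedup(new_bps, old_bps):
    """Return how many times faster the new line is than the old one."""
    return new_bps / old_bps


if __name__ == "__main__":
    base = LINE_RATES_BPS["56-kbps line"]
    for name, rate in LINE_RATES_BPS.items():
        print(f"{name}: {speedup(rate, base):,.0f}x the 56-kbps rate")
```

With these nominal figures the OC-48 ratio comes out near 45,000, which is consistent with the text's "almost 50,000" as an order-of-magnitude statement.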
All expectations are that more such growth will be seen in the coming
decade. There is a persistent and reasonable fear that demand for capac-
ity will outstrip the ability of the providers to expand owing to a lack of
technology or capital. The 1990s were characterized by periodic scram-
bling by ISPs, equipment providers, and researchers to develop and
deploy new technologies that would provide the needed capacity in ad-
vance of demand. The success of those efforts does not, however, guaran-
tee continued success into the future. Furthermore, efforts to expand
capacity may not be uniformly successful. Regional variations in the
availability of rights of way, industry strategies, and regulation could
slow deployments in particular areas.
Better use of existing bandwidth also plays a role in enhancing
scalability. A recent trend has been to compensate for the lack of network
capacity (or other functionality, such as mechanisms for assuring a par-
ticular quality of service) by deploying servers throughout the Internet.
Cache servers keep local copies of frequently used content, and locally
placed streaming servers compensate for the lack of guarantees against
delay. In some cases, innovative routing is used to capture requests and
direct them to the closest servers. Each of these approaches has side
effects that can cause new problems, however. Their implications for
robustness and transparency are discussed elsewhere in this report.
4An abbreviation for bits per second is bps; kbps means thousands of bits per second,
Mbps means millions of bits per second, Gbps means billions of bits per second, and Tbps
means trillions of bits per second. The use of a capital B in place of lower case b means the
unit of measurement is bytes (8 bits) rather than bits.
Scaling of Protocols and Algorithms
A more difficult aspect of growth is in design of new or improved
protocols and algorithms for the Internet. The ever-present risk is that
solutions will be deployed that work for the moment but fail as the number
of users and applications continues to grow; today's significant improvement
may be tomorrow's impediment to progress. Scaling must
thus be considered in every design. This lesson is increasingly important
as there are many pressures driving innovations that may not scale well
or at all.
The IETF processes through which lower-level network protocols are
developed involve extensive community review. This means that the
protocols undergo considerable scrutiny with regard to scaling before
they are widely deployed. However, particularly at the applications
layer, protocol proposals are sometimes introduced that, while adequate
in such settings as a local area network, have been designed without
sufficient understanding of their implications for the wider Internet.
Market pressures can then lead to their deployment before scaling has
been completely addressed. When a standard is developed through a
forum such as the IETF, public discussion of it in working groups helps.
However, a protocol can nonetheless reach the status of a "proposed
standard," and thus begin to be widely deployed, with obvious scalability
problems only partially fixed.
The Web itself is a good example of scaling challenges arising from
particular application protocols. It is not widely appreciated that the
"World Wide Wait" phenomenon is due in part to suboptimal design
choices in the specialized protocol used by the Web (HTTP), not to the
core Internet protocols. Early versions of HTTP relied on a large number
of short TCP sessions, adding considerable overhead to the retrieval of a
page containing many elements and preventing TCP's congestion control
mechanisms from working.5 An update to the protocol, HTTP 1.1,
adopted as an Internet standard by the IETF in 1999,6 finally fixed enough
of the problem to reduce the pressure on the network infrastructure, but
the protocol still lacks many of the right properties for use at massive
5Though it took some time to launch an update, the shortcomings of HTTP 1.0 were
recognized early on. See, for example, Simon E. Spero. 1994. Analysis of HTTP Performance
Problems, Technical Report. Cambridge, Mass.: World Wide Web Consortium, July. Available
online at .
6R. Fielding et al. 1999. Hypertext Transfer Protocol -- HTTP/1.1. RFC 2616. Network
Working Group, Internet Engineering Task Force, June. Available online at
scale. The challenge posed by this lack of scalability has been significant
given HTTP's large share of Internet backbone traffic.7
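The overhead of using many short TCP sessions can be illustrated with a deliberately crude round-trip count. The sketch below assumes that each new TCP connection costs one round trip for connection setup before a request can be sent and that each request and response costs one further round trip; it ignores parallel connections, pipelining, and TCP slow start. The function names are hypothetical and the model is an illustration, not a measurement of any real browser or server.

```python
def round_trips_separate_connections(num_objects):
    """One TCP connection per object: a setup round trip plus a
    request/response round trip for every object on the page."""
    return num_objects * 2


def round_trips_persistent_connection(num_objects):
    """One connection reused for all objects: a single setup round trip,
    then one request/response round trip per object."""
    if num_objects == 0:
        return 0
    return 1 + num_objects


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n,
              round_trips_separate_connections(n),
              round_trips_persistent_connection(n))
```

Under these assumptions a page with 20 elements needs 40 round trips with a connection per element and 21 with a reused connection. The short sessions also end before TCP's congestion control has had time to adapt, which this simple count does not capture.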
The case of IP multicast demonstrates the interplay between protocol
design, the Internet's routing system, and scaling considerations. Multi-
cast is a significant example because it allows applications to simulta-
neously and inexpensively deliver a single data stream to multiple deliv-
ery points, which would alleviate Internet scaling challenges. Multicast
can be used in numerous applications where the same data are to be sent
to multiple users, such as audio and audiovisual conferencing, entertain-
ment broadcasting, and various other forms of broad information dis-
semination (the delivery of stock quotes to a set of brokers is one ex-
ample). All of these applications are capable of running over today's
Internet, either in the backbone or within corporate networks, but many
operate via a set of individual, simultaneous (unicast) transmissions,
which means that they use much more bandwidth than they might oth-
erwise.
Despite its promise of reducing bandwidth requirements for one-to-
many communications, multicast itself presents scaling challenges. By
definition, an Internet-wide multicast group needs to be visible throughout
the Internet, or at least everywhere that the group has a member.
The techniques available today require that routers track participation in
each active group, and in some cases for each group's active senders. Such
participation tracking requires complex databases and supporting proto-
col exchanges. One might reasonably assume that the number of groups
grows with the size of the Internet or with the growth of applications such
as Internet radio broadcast, and that the footprint of each group (the
fraction of the Internet over which the group information must be trans-
mitted) will grow as the size of the Internet. However, the two factors
multiply, meaning that under these assumptions, the challenges posed to
providers will grow as the square of the Internet's size. Resolving this
situation requires not merely defining an appropriate protocol but also
researching a hard routing question: how to coalesce routing information
of multiple groups into manageable aggregates without generating
too much inefficiency.
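The quadratic growth argument above can be made concrete with a small sketch. It assumes, as the text does, that both the number of groups and the footprint of each group grow linearly with the size of the Internet; the proportionality constants and the function name are hypothetical placeholders.

```python
def multicast_state_load(internet_size, groups_per_unit=1.0, footprint_per_unit=1.0):
    """Total group-tracking burden under the text's assumptions.

    Both the number of groups and the footprint of each group are taken to be
    proportional to internet_size, so their product grows as its square.
    """
    groups = groups_per_unit * internet_size
    footprint = footprint_per_unit * internet_size
    return groups * footprint


if __name__ == "__main__":
    for size in (100, 200, 400):
        print(size, multicast_state_load(size))
```

Doubling the size of the network in this model quadruples the load, which is why aggregation of group routing information matters more than the choice of constants.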
7For example, Internet traffic statistics for the vBNS, a research backbone, show that
about two-thirds of TCP flows were HTTP. See MCI vBNS Engineering. 2000. NSF Very
High Speed Backbone Network Service: Management and Operations Monthly Report, January.
Available online at .
Scaling of the Internet's Naming Systems
Growth in the number of names and an increasing volume of name
resolution requests, both of which reflect Internet growth, are placing
scaling pressures on the Internet's name-to-address translation service,
the Domain Name System (DNS).8 There is broad consensus as well as a
strong technical argument that a common naming service is needed on
the Internet.9 People throughout the world need to be able to name ob-
jects (systems, files, and facilities) correctly in their own languages and
have them unambiguously accessible to authorized people under those
names, which requires a common naming infrastructure. People also
need naming services to allow them to identify applications and services
provided by particular companies and organizations.
The DNS is instrumental in hiding the Internet's internal complexity
from users and application developers. In the DNS, network objects such
as the host computers that provide Web pages or e-mail boxes are desig-
nated by symbolic names that are independent of the location of the re-
source. The name provides an indirect reference to the network object,
which allows the use of names instead of less mnemonic numbers and
also allows the actual address to which the name points to be changed
without disrupting access via the name. Because the computer associated
with a particular named service can be changed without changing the IP
addresses of that machine (only the address associated with the name in
the DNS needs changing), indirection provides users with portability if
they wish to switch Internet providers. While most users receive IP ad-
dress allocations from their ISP and thus have to change address if they
change ISP, DNS names are controlled by the user; a change of provider
requires only that the address pointed to by the DNS entry be changed.
The significance of DNS names was greatly increased as a result of the
decision by the original developers of the World Wide Web to use them
directly to identify information locations. The importance attached to
DNS names is reflected in the contention surrounding the system's management
(Box 2.1).
The DNS is organized as a hierarchy. At the very top of the hierarchy,
the "root servers" record the address of the top-level domain servers,
such as the .com or .uk servers (Figure 2.1). The addresses of these root
8The DNS was first introduced in P. Mockapetris. 1983. Domain Names - Concepts and
Facilities, RFC 882. November. Available online at .
9Internet Architecture Board. 2000. IAB Technical Comment on the Unique DNS Root, RFC
2826. May. Available online at .
FIGURE 2.1 DNS hierarchy. [Figure not reproduced. Level 1: generic top-level domains (COM, ORG, EDU, NET, GOV, MIL) and country top-level domains (US, UK, ...). Level 2: STANFORD, MIT. Level 3: CS.]
servers are known locally to every name server of the Internet, using
information provided by ICANN (in practice, coded into the DNS soft-
ware by the vendor). Each top-level domain server records the addresses
of the domain name servers for the second-level domains, such as
example.com. These secondary servers are responsible for providing in-
formation on name-to-address mappings for names in the example.com
domain. The hierarchical design permits the secondary servers to point
themselves to third-level servers, and so forth. To access named objects,
Internet sessions start with a transaction with a name server, known as
name resolution, to find the IP address at which the resource is located, in
which a domain name such as www.example.com is translated into a
numerical address such as 128.9.176.32. Assuming that the local name
server has not previously stored the requisite information locally (see the
discussion of caching, below), three successive transactions are generally
required in order to find the address of a target server such as www.
example.com: (1) to learn the address of the .com server from the root
server, (2) to learn the address of the example.com server from the .com
server, and (3) to learn the address of the target web server, www.
example.com, from the example.com name server.
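The three-step lookup described above can be sketched with a toy, in-memory model. The server names and table layout below are hypothetical stand-ins for real name servers, and the code does no network I/O; it only shows how a resolver follows referrals from the root downward and how many transactions that takes when nothing is cached.

```python
# Hypothetical toy data: each "server" maps a name suffix to either a
# referral (the next server to ask) or a final address.
ROOT = "root-server"
SERVERS = {
    "root-server": {"com": ("referral", "com-server")},
    "com-server": {"example.com": ("referral", "example-com-server")},
    "example-com-server": {"www.example.com": ("address", "128.9.176.32")},
}


def query(server, name):
    """Return the record for the longest suffix of name that this server knows."""
    table = SERVERS[server]
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    raise LookupError(f"{server} has no data for {name}")


def resolve(name, max_transactions=10):
    """Follow referrals from the root; return (address, transactions used)."""
    server = ROOT
    for transactions in range(1, max_transactions + 1):
        kind, value = query(server, name)
        if kind == "address":
            return value, transactions
        server = value  # referral: ask the next server down the hierarchy
    raise LookupError(f"too many referrals while resolving {name}")


if __name__ == "__main__":
    print(resolve("www.example.com"))
```

In this toy model, resolving www.example.com takes three transactions: one with the root server, one with the .com server, and one with the example.com server.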
The situation in practice may, in fact, be more complicated. If
example.com is a very popular service, it is useful to be able to distribute
the load among multiple servers and/or to direct a user to the server that
is closest to him. To do either of these, the name servers run by example.
com may make use of a clever trick: requests for the address correspond-
ing to www.example.com, for example, may produce replies pointing to
one of a number of different servers that, presumably, contain copies of
the same information.10
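One simple way a name server could spread requests across several copies of a service is to rotate through a list of addresses. The sketch below is an illustration under that assumption only; the class name is hypothetical, the addresses are drawn from a range reserved for documentation, and real name servers may use other policies, such as choosing the server closest to the requester.

```python
from itertools import cycle


class RotatingNameServer:
    """Toy name server that answers each request with the next address in turn."""

    def __init__(self, addresses):
        # cycle() repeats the address list indefinitely, in order.
        self._next_address = cycle(list(addresses))

    def answer(self):
        """Return the address to hand out for the next request."""
        return next(self._next_address)


if __name__ == "__main__":
    server = RotatingNameServer(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    for _ in range(5):
        print(server.answer())
```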
The rules governing DNS names would seem to permit millions of
naming domains each containing billions of names,11 which would seem
adequate to support scaling demands. However, with the number of top-level
domains currently limited to one national domain per country (e.g.,
.fr for France), plus a limited number of global domains (e.g., .com and
.org), many domains are organized with a very large number of names
contained at the next level rather than by distributing names further down
in the hierarchy (e.g., using product.example.com instead of product.com).
This can cause scaling problems, and there are concerns that the
performance of the DNS will worsen over time.
10Another non-DNS trick for load distribution makes use of so-called transparent proxies
or interception proxies. These intercept and divert data packets going to a particular address
to one of a number of servers that contain the same content. Because it interposes
information processing outside the control of either the user's computer or the server he is
connecting to, this technique runs counter to the end-to-end principle and can sometimes
have the side effect of delivering inconsistent information to the user.
11Each DNS name can be composed of up to 256 characters and up to 64 naming elements,
each of which can be made of up to 64 characters (letters, digits, and hyphen).
The multistage process required to find the address of a target, re-
peated for many Web page accesses by millions of Internet users, can
result in a heavy load on the servers one level down from the top of the
tree. If the name servers were to be overwhelmed on a persistent basis, all
Internet transactions that make use of domain names (i.e., virtually all
Internet transactions) would be slowed down, and the whole Internet
would suffer.
Today's DNS design relies on two mechanisms to cope with this
load: caching and replication. These mechanisms have been effective in
alleviating scaling pressures, but there are signs that they may not be
sufficient to cope with the continuing rapid growth of the network. DNS
caching is a technique whereby the responses to common queries are
stored on local DNS servers. Applications such as Web browsers also
may perform DNS caching.12 Using caching, a local DNS server need
only request the addresses of the .com server from the root servers infre-
quently rather than repeatedly. Similarly, once a request has been made
for the address of the example.com server, the local name server need not
ask for this information again for a period of time known as the "time to
live." Because of the dynamic nature of DNS information, name servers
return not only an address but a time-to-live parameter selected by the
administrator of the name server for the relevant domain, usually on the
order of days or hours, which indicates how long the name-to-address
mapping can be considered valid, helping ensure that servers do not
retain outdated information.
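The caching behavior described above can be sketched as a small cache whose entries expire after their time to live. This is a minimal illustration, not the design of any particular DNS implementation; the class and method names are hypothetical, and the clock is injectable only so that expiry can be demonstrated without waiting.

```python
import time


class TtlCache:
    """Toy name-to-address cache in which each entry expires after its TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        """Store a mapping together with the time at which it stops being valid."""
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached address, or None if it is absent or has expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # outdated: the name must be resolved again
            return None
        return address


if __name__ == "__main__":
    cache = TtlCache()
    cache.put("www.example.com", "128.9.176.32", ttl_seconds=3600)
    print(cache.get("www.example.com"))
```

A lookup that returns None stands for a cache miss, in which case the local server would have to query the DNS hierarchy again.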
Caching works well when the same request is repeated many times.
This is the case for high-level queries, such as requesting the address of
the .com name servers, and also for the most popular Web servers, the
search engines, and the very large sites. (It works even better for very
frequently accessed services like a file server on a local area network.)
However, the efficiency of caching decreases as the number of names that
12Why do applications also need to cache DNS names? Good DNS performance depends
on having local access to DNS information. Because the target platform was a tiny, diskless
machine, the earliest implementations of TCP/IP software for the IBM PC lacked DNS
cache functionality and depended on local LAN access to a DNS server for all name resolu-
tion requests. This resolver-only design has persisted in a number of machines today. Not
only does this force the application designer to implement DNS caching, but there are
performance costs as well. Since an application cannot determine whether the host it is
running on supports a caching server, application-layer caching makes it possible for cach-
ing to be carried out twice, potentially yielding inconsistent results.
are kept active by a given user or domain name server increases. When
millions of names are registered and accessed in the DNS, only a small
fraction can be present in any given cache. Requests for the names of less
frequently requested sites, which in total will represent a significant frac-
tion of all requests, will have to be forwarded to the DNS. Even if user
queries are concentrated mostly on large sites or queries from the same
local group of hosts are concentrated on the same group of sites, which
may or may not be the case, the remaining fraction still constitutes an
important and growing burden for the DNS. The effect of cache misses is
made even worse by the concentration of names in a small number of
popular top-level domains, such as .com, .net, and .org. Consequently, an
inordinate fraction of the load is sent to these domains' servers (a load
that could be alleviated if the hierarchical design of the DNS were used to
limit the number of highest-level names). These servers need to scale in
two ways. They must support an ever-growing name population, which
means that the size of their database keeps increasing very quickly, and
they must serve ever more frequent queries. The growth of the database
implies increased memory requirements and an increased management
load.
Replication, whereby name databases are distributed to multiple
name servers, is a way of sharing the load and increasing reliability. With
replication, the root server is able, for example, to provide the addresses
of several .com servers instead of one. The volume of name resolution
inquiries could be met by splitting the load across a sufficiently large
number of replicated servers. Unfortunately, current DNS technology
limits this approach because the list of the names and addresses of all the
servers for a given domain must fit into a single 512-byte packet. (Even
after efforts were made to shorten host names, the number of root servers
remains limited to 13.) Once the maximum number of servers that will fit
within the single-packet constraint has been deployed, increased load in
that domain can only be dealt with by increasing the capacity and pro-
cessing power of each of the individual .com name servers. While the
performance of the most widely used DNS software, BIND, lags that of
modern high-performance database systems and root servers' software
can almost certainly be improved to handle much higher loads, Internet
growth rates suggest that the demand on the root servers is likely to be
growing faster than their processing speed is increasing and that in a few
years the root servers could nonetheless be heavily overloaded.
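The single-packet constraint can be explored with a rough size estimate. The sketch below assumes the standard DNS message layout (a 12-byte header, a question for the root name, one name server record and one IPv4 address record per server) with name compression, and it assumes that server names share a suffix in the style of a.root-servers.net so that every name after the first compresses to a one-letter label plus a pointer. These are simplifying assumptions introduced for illustration; the function names are hypothetical, and the model ignores any other records or options a real response might carry.

```python
HEADER_BYTES = 12
QUESTION_BYTES = 1 + 2 + 2      # root name (one zero byte) + query type + query class
RR_FIXED_BYTES = 2 + 2 + 4 + 2  # type + class + TTL + data length


def encoded_name_length(name):
    """Length of an uncompressed DNS name: a length byte per label plus a final zero."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def response_size(num_servers, first_name="a.root-servers.net"):
    """Estimated bytes needed to list num_servers name servers and their addresses."""
    if num_servers < 1:
        raise ValueError("need at least one server")
    owner = 1  # the owner of each name server record is the root name
    first_ns = owner + RR_FIXED_BYTES + encoded_name_length(first_name)
    other_ns = owner + RR_FIXED_BYTES + (1 + 1 + 2)  # one-letter label + pointer
    address_rr = 2 + RR_FIXED_BYTES + 4              # pointer + fixed part + IPv4 address
    return (HEADER_BYTES + QUESTION_BYTES + first_ns
            + (num_servers - 1) * other_ns + num_servers * address_rr)


def max_servers(limit_bytes=512):
    """Largest server count whose estimated response fits within limit_bytes."""
    count = 0
    while response_size(count + 1) <= limit_bytes:
        count += 1
    return count


if __name__ == "__main__":
    print(response_size(13), max_servers())
```

Under these assumptions a list of 13 servers needs an estimated 436 bytes, and the limit is reached within a few more servers. The estimate shows why the server count cannot grow much within a 512-byte packet; it is not a statement of how the figure of 13 was actually chosen.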
One proposal for addressing issues ranging from scaling to DNS
name-trademark conflicts is to move toward a solution that makes use of
directories as an intermediate layer between applications and the DNS. A
directory might help resolve conflicts between DNS names and registered
trademarks because a particular keyword could be associated with mul-
[...]
Changes in the telecommunications industry led the FCC, in 1998, to
ask the Network Reliability and Interoperability Council (NRIC) IV to
explore reliability concerns in the wider set of networks (e.g., telephone,
cable, satellite, and data, including the Internet) that the PSTN is part of.
The report of the NRIC IV subcommittee looking at needs for data on
service outages54 called for a trial period of outage reporting. NRIC V,
chartered in 2000, has initiated a 1-year voluntary trial starting in Septem-
ber 2000 and will monitor the process, analyze the data obtained from the
trial, and report on how well the process works.55
ISPs are not, at present, mandated to release such information. In-
deed, the release of this type of information is frequently subject to the
terms of private agreements between providers. This situation is not
surprising, given the absence of regulation of the Internet and the high
degree of regulation of the telephone industry. As the Internet becomes
an increasingly important component of our society, there will probably
be calls to require reporting on overall reliability and specific disruptions.
It is not now clear what metrics should be used and what events should
be reported, what the balance between costs and benefits would be for
different types of reporting, or what the least burdensome approach to
this matter would be. One response to rising expectations would be for
Internet providers to work among themselves to define an industry approach
to reporting. Doing so could have two benefits: it might provide
information useful to the industry and it might avoid government impo-
sition of an even-less-welcome plan.
As noted above, one important reason for gathering information on
disruptions is to provide researchers with the means to discover the root
causes of such problems. For this to be effective, outage data must be
available to researchers outside the ISPs; ISPs do not generally have re-
search laboratories and are not necessarily well placed to carry out much
of the needed analysis of the data, much less design new protocols or
build new technologies to improve robustness. Also, data should not be
anonymized before they are provided to researchers; the anonymity hides
information (e.g., on the particular network topology or equipment used)
54See Network Reliability and Interoperability Council (NRIC). 2000. Network Reliability
Interoperability Council IV, Focus Group 3, Subcommittee 2, Data Analysis and Future Considerations
Team. Washington, D.C.: NRIC, Federal Communications Commission federal advisory
committee, p. 4. Available online at .
55See Network Reliability and Interoperability Council (NRIC). 2000. Revised Network
Reliability and Interoperability Council - V Charter. Washington, D.C.: NRIC, Office of Engineering
and Technology, Federal Communications Commission. Available online at
from the researcher. However, in light of proprietary concerns attached
to the release of detailed information, researchers must agree not to dis-
close proprietary information (and must live up to those agreements).
Disclosure control in published reports is not simply a matter of
anonymizing the results; particular details may be sufficient to permit the
reader of a research report, including an ISP's competitors, to identify the
ISP in question. Attention must, therefore, also be paid to protecting
against inadvertent disclosure of proprietary information.
Looking to the future, the committee can see other reasons why ISPs
would benefit from sorting out what types of reliability metrics should
be reported. For example, it is not hard to imagine that at some point
there would be calls from high-end users for a more reliable service that
spans the networks of multiple ISPs and that some of the ISPs would
decide to work together to define an "industrial-strength" Internet ser-
vice to meet this customer demand. When they interconnect their net-
works, how would they define the service that they offer? Since the
performance experienced by an ISP's customer depends on the perfor-
mance of all the networks between the customer and the application or
service the customer is using, each ISP would have an interest in ensur-
ing that the other ISPs live up to reliability standards. Absent a good
source of data on failures (and a standardized framework for collecting
and reporting on failures), how would the ISPs keep tabs on each other?
In the process of defining a higher-grade service, ISPs will want to un-
derstand what sort of failure would degrade the service, and it is this sort
of failure that they ought to be reporting on. From this perspective,
outage reporting shifts from being a mandated burden to an enabler of
new business opportunities.
It is unlikely that simple, unidimensional measures that summarize
ISP performance would prove adequate. Creating standard reporting or
rating models for the robustness and quality of ISPs would tend to limit
the range of services offered in the marketplace. What form might such
user choices take? Consider, as an example, that an ISP that experiences
the failure of a piece of equipment might face a tough trade-off. It could
continue to operate its network at reduced performance in this condition
or undergo a short outage to fix the problem: a choice between an
extended period of uptime at much reduced performance and a short
outage that restores performance to normal. Some of the ISP's customers
(e.g., those who depend on having a connection rather than on the
particular quality of that connection) will prefer the first option, while
others will prefer the second. Indeed, some may be willing to pay extra to
get a service that aims to provide a particular style of degraded service.
(Such a "guaranteed style of degradation" is an interesting variation on
QOS and does not impose much overhead.) These considerations suggest
that, more generally, there is a need for many different rating scales or,
put another way, a need for measuring several different things that might
be perceived as "quality" or reliability. Combining them into a single
metric does not serve the interests of different groups (user or vendor or
both) that are likely to prefer different weighting factors or functions for
combining the various measures.
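The point about weighting can be illustrated with made-up numbers. In the sketch below, the two ISPs, their scores, and the two sets of weights are entirely hypothetical; they are chosen only to show that the same underlying measures can produce opposite rankings once different groups combine them with different weights.

```python
# Hypothetical scores on a 0-to-1 scale; illustrative values, not measurements.
ISP_SCORES = {
    "isp_a": {"availability": 0.99, "performance": 0.60},  # stays up, but degraded
    "isp_b": {"availability": 0.90, "performance": 0.95},  # short outages, full speed
}


def combined_score(scores, weights):
    """Collapse several measures into one number using the given weights."""
    return sum(weights[measure] * scores[measure] for measure in weights)


def preferred_isp(weights):
    """Return the ISP that ranks highest under a particular set of weights."""
    return max(ISP_SCORES, key=lambda isp: combined_score(ISP_SCORES[isp], weights))


if __name__ == "__main__":
    connection_first = {"availability": 0.9, "performance": 0.1}
    performance_first = {"availability": 0.1, "performance": 0.9}
    print(preferred_isp(connection_first), preferred_isp(performance_first))
```

With these illustrative values, users who weight availability heavily prefer the first ISP and users who weight performance heavily prefer the second, so no single published composite would serve both groups.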
QUALITY OF SERVICE
The Internet's best-effort quality of service (QOS) makes no guaran-
tees about when, or whether, data will be delivered by the network. To-
gether with the use of end-to-end mechanisms such as the Transmission
Control Protocol (TCP), which provides capabilities for reassembling information
in proper order, retransmitting lost packets, and ensuring complete
delivery, best effort has been successful in supporting a wide range of
applications running over the Internet. However, unlike Web browsing,
e-mail transmission, and the like, some applications such as voice and
video are very time-sensitive and degrade when the network is congested
or when transmission delays (latency) or variations in those delays (jitter)
are excessive. Some performance issues, of course, are due to overloaded
servers and the like, but others are due to congestion within the Internet.
Interest in adding new QOS mechanisms to the Internet that would tailor
network performance for different classes of application as well as inter-
est in deploying mechanisms that would allow ISPs to serve different
groups of customers in different ways for different prices have led to the
continued development of a range of quality-of-service technologies.
While QOS is seeing limited use in particular circumstances, it is not
widely employed.
The technical community has been grappling with the merits and
particulars of QOS for some time; QOS deployment has also been the
subject of interest and speculation by outside observers. For example,
some ask whether failure to deploy QOS mechanisms represents a missed
opportunity to establish network capabilities that would foster new ap-
plications and business models. Others ask whether introducing QOS
capabilities into the Internet would threaten to undermine the egalitarian
quality of the Internet (whereby all content and communications across
the network receive the same treatment, regardless of source or
destination) that has been the consequence of best-effort service.
Beyond the baseline delay due to the speed of light and other irreduc-
ible factors, delays in the Internet are caused by queues, which are an
intrinsic part of congestion control and sharing of capacity. Congestion
occurs in the Internet whenever the combined traffic that needs to be
forwarded onto a particular outgoing link exceeds the capacity of that
link, a condition that may be either transient or sustained. When conges-
tion occurs in the Internet, a packet may be delayed, sitting in a router's
queue while waiting its turn to be sent on, and will arrive later than a
packet not subjected to queuing, resulting in latency. Jitter results from
variations in the queue length. If the queue fills up, packets will be
dropped.
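The relationship among queuing, latency, jitter, and loss can be sketched with a toy model. This is a simplified illustration, not a model of any real router: time is divided into ticks, the link forwards a fixed number of packets per tick, and the function name, buffer limit, and the definition of jitter as the spread between the largest and smallest delay are all assumptions made for the example.

```python
from collections import deque


def simulate_fifo(arrivals, capacity_per_tick=1, buffer_limit=3):
    """Toy FIFO router queue.

    arrivals[t] is the number of packets arriving at tick t.
    Returns (delays, dropped): the queuing delay in ticks of each
    forwarded packet, and the number of packets dropped because the
    buffer was full on arrival.
    """
    queue = deque()  # holds the arrival tick of each waiting packet
    delays = []
    dropped = 0
    t = 0
    while t < len(arrivals) or queue:
        incoming = arrivals[t] if t < len(arrivals) else 0
        for _ in range(incoming):
            if len(queue) < buffer_limit:
                queue.append(t)
            else:
                dropped += 1  # queue is full, so the packet is lost
        for _ in range(capacity_per_tick):
            if queue:
                delays.append(t - queue.popleft())
        t += 1
    return delays, dropped


smooth = simulate_fifo([1, 1, 1, 1])  # load matches link capacity
burst = simulate_fifo([5, 0, 0, 0])   # same link, transient overload

for label, (delays, dropped) in (("smooth", smooth), ("burst", burst)):
    jitter = max(delays) - min(delays)  # assumed definition: delay spread
    print(label, delays, "jitter:", jitter, "dropped:", dropped)
```

With smooth arrivals every packet is forwarded in the tick it arrives; with the burst, delays grow as the queue builds (latency), differ from packet to packet (jitter), and two packets are dropped once the assumed three-packet buffer fills.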
In today's Internet, which uses TCP for much of its data transport,
systems sending data are supposed to slow down when congestion oc-
curs (e.g., the transfer of a Web page will take longer under congested
conditions). When the Internet appears to be less congested, transfers
speed up and applications complete their transactions more quickly.
Because the adaptation mechanisms are based on reactions to packet loss,
the congestion level of a given link translates into a sufficiently large
packet loss rate to signal the presence of congestion to the applications
that share the link. Congestion in many cases only lasts for the transient
period during which applications adapt to the available capacity, and it
reaches drastic levels only when the capacity available to each application
is less than the minimum provided by the adaptation mechanism.
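The adaptation behavior described above can be sketched as an additive-increase, multiplicative-decrease loop, the general approach used by TCP congestion avoidance. This is a deliberately simplified illustration rather than an implementation of TCP: the function name, the parameter values, the unit of "rate," and the rule that loss occurs exactly when the rate exceeds capacity are assumptions.

```python
def aimd(capacity, rounds, start=1.0, increase=1.0, decrease=0.5, min_rate=1.0):
    """Simplified additive-increase/multiplicative-decrease sender.

    Each round the sender transmits at `rate`. If the rate exceeds the
    capacity available to it, a loss is assumed and the rate is cut by
    the factor `decrease` (but never below `min_rate`); otherwise the
    rate grows by `increase`. Returns the rate used in each round.
    """
    rate = start
    history = []
    for _ in range(rounds):
        history.append(rate)
        if rate > capacity:  # packet loss signals congestion
            rate = max(rate * decrease, min_rate)
        else:                # no loss, so probe for more capacity
            rate += increase
    return history


print(aimd(capacity=4, rounds=8))
print(aimd(capacity=0.5, rounds=4))
```

In the first call the rate climbs until it overshoots the capacity, backs off, and climbs again, so congestion is transient. In the second call the capacity is below the sender's assumed minimum rate, so the sender can never back off far enough and the overload persists, which corresponds to the drastic case noted in the text.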
Congestion is generally understood to be rare within the backbone
networks of major North American providers, although it was feared
otherwise in the mid-1990s, when the Internet was commercialized.
Instead, it is more likely to occur at particular network bottlenecks. For
example, links between providers are generally more congested than those
within a provider's network, some very much so. Persistent congestion
is also observed on several international links, where long and variable
queuing delays, as well as very high packet loss rates, have been mea-
sured.56 Congestion is also frequent on the links between customers' local
area networks (or residences) and their ISPs; sometimes it is feasible to
increase the capacity of this connection, while in other cases a higher
capacity link may be hard to obtain or too costly. Where wireless links are
used, the services available today are limited in capacity, and wireless
bandwidths are fundamentally limited by the scarcity of radio spectrum
assigned to these services as well as vulnerable to a number of impair-
ments inherent in over-the-air communication.
At least some congestion problems can be eliminated by increasing
the capacity of the network by adding bandwidth, especially at known
56See V. Paxson. 1999. "End-to-End Internet Packet Dynamics," IEEE/ACM Transactions
on Networking 7(3):277-292, June. Logs of trans-Atlantic traffic available online at
bottlenecks. Adding bandwidth does not, however, guarantee that con-
gestion will be eliminated. First, the TCP rate-adaptation mechanisms
described above may mask pent-up demand for transmission, which will
manifest itself as soon as new capacity is added. Second, on a slightly
longer timescale, both content providers and users will adjust their usage
habits if things go faster, adding more images to Web pages or being more
casual about following links to see what is there and so on. Third, on a
longer timescale (on the order of months but not years), new applications
can emerge when there is enough bandwidth to enough of the users to
make them popular. This has occurred with streaming audio and is likely
to occur with streaming video in the near future.
Also, certain applications (notably, real-time voice and video)
require controlled delays and predictable transfer rates to operate
acceptably. (Streaming audio and video are much less sensitive to brief periods
of congestion because they make use of local buffers.) Broadly speaking,
applications may be restricted in their usefulness unless bandwidth is
available in sufficient quantity that congestion is experienced very rarely
or new mechanisms are added to ensure acceptable performance levels.
A straightforward way to reduce jitter is to have short queue lengths, but
this comes at the risk of high loss rates when buffers overflow. QOS
mechanisms can counteract this by managing the load placed on the queue
so that buffers do not overflow. However, the situation in the Internet,
with many types of traffic competing in multiple queues, is complex.
Better characterization of network behavior under load may provide in-
sights into how networks might be engineered to improve performance.
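The effect of queue length on delay variation can be made concrete with a little arithmetic. The sketch below assumes a single FIFO queue and fixed-size packets; the 1,500-byte packet size, the 1.5 Mbps link rate, and the function name are illustrative assumptions, not figures from this report.

```python
def max_queuing_delay_ms(buffer_packets, packet_bytes, link_bps):
    """Worst-case wait, in milliseconds, for a packet that arrives at a
    FIFO queue already holding `buffer_packets` packets of equal size."""
    return buffer_packets * packet_bytes * 8 * 1000 / link_bps


# Assumed example: 1,500-byte packets on a 1.5 Mbps access link.
for buffer_packets in (10, 100):
    delay = max_queuing_delay_ms(buffer_packets, 1500, 1_500_000)
    print(buffer_packets, "packets ->", delay, "ms")
```

Because a packet may find the queue anywhere between empty and full, the worst-case wait also bounds the delay variation the queue can introduce. Under these assumptions a shorter buffer limits that variation to a fraction of what a longer one allows, but it has less room to absorb bursts before packets are dropped.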
Concerns in the past about being able to support multimedia appli-
cations over the Internet led to the development of a variety of explicit
mechanisms for providing different qualities of service to different
applications (e.g., best effort for Web access and specified real-time service
quality for audio and video).57 Today, two major classes of QOS support
different kinds of delay and delivery guarantees (see Box 2.4). They are
based on the assumption that applications do not all have the same re-
quirements for network performance (e.g., latency, jitter, or priority) and
57In essence, these proposed QOS technologies resemble those that have proven effective
in ATM and Frame Relay networks, with the exception that they are applied to individual
application sessions or to aggregates of traffic connecting sets of systems running sets of
applications rather than to individual circuits connecting pairs of systems. The mathemati-
cal difference between IP QOS and ATM QOS is that ATM sends variable-length bursts of
cells, while IP sends variable-length messages. The biggest operational difference is that
ATM QOS is generally used in ATM networks carrying real-time traffic, while QOS is
generally not configured in IP networks today.
that the network should provide classes of service that reflect these dif-
ferences.58
There is significant disagreement among experts (including the ex-
perts on this committee) as to how effective quality-of-service mecha-
nisms would be and which would be more efficient, investing in addi-
tional bandwidth or deploying QOS mechanisms. One school of thought,
which sees a rising tide of quality, argues that increasing bandwidth in
the Internet will provide adequate performance in many if not most cir-
cumstances. As higher capacity links are deployed, the argument goes,
Internet delays will tend to approach the theoretical limit imposed by the
propagation of light in optical fibers, and the average bandwidth avail-
able on any given connection will increase. As the overall quality in-
creases, it will enable more and more applications to run safely over the
Internet, without requiring specific treatment, in the same way that a
rising tide as it fills a harbor can lift ever-larger boats. Voice transmission,
for example, is enabled if the average bandwidth available over a given
connection exceeds a few tens of kilobits per second and if the delays are
less than one-tenth of a second, conditions that are in fact already true for
large business users; interactive video is enabled if the average band-
width exceeds a few hundred kilobits per second, a performance level
that is already obtained on the networks dedicated to connecting univer-
sities and research facilities. If these conditions were obtainable on the
public Internet (e.g., if the packet loss rate or jitter requirements for tele-
phony were met 99 percent of the time), business incentives to deploy
QOS for multimedia applications would disappear and QOS mechanisms
might never be deployed.
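The rising tide argument can be illustrated with a rough calculation. The sketch below assumes that light travels through optical fiber at roughly 200,000 km per second (about two-thirds of its speed in a vacuum), takes 32 kbps as a stand-in for "a few tens of kilobits per second," and treats the one-tenth-of-a-second limit as a one-way delay. These values and the function names are assumptions made for illustration.

```python
FIBER_SPEED_KM_PER_S = 200_000  # assumed: roughly two-thirds of c


def propagation_delay_ms(distance_km):
    """One-way propagation delay over a fiber path of the given length."""
    return distance_km * 1000 / FIBER_SPEED_KM_PER_S


def voice_feasible(avg_kbps, one_way_delay_ms, min_kbps=32, max_delay_ms=100):
    """Check the two voice conditions from the text against assumed thresholds."""
    return avg_kbps >= min_kbps and one_way_delay_ms < max_delay_ms


floor = propagation_delay_ms(6000)   # hypothetical 6,000 km path
print(floor)                          # irreducible part of the delay, in ms
print(voice_feasible(64, floor))      # ample bandwidth, no queuing delay
print(voice_feasible(64, floor + 120))  # same path with 120 ms of queuing
```

On this hypothetical path the propagation floor is well inside the voice budget, so whether voice works depends on how much queuing delay is added on top, which is the quantity that the rising tide view expects added capacity to shrink.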
Proponents of the rising tide view further observe that the causes of
jitter within today's Internet are poorly understood, and that investment
in better understanding the reasons for this behavior might lead to an
understanding of what improvements might be made in the network as
well as what QOS mechanisms would best cope with network congestion
and jitter if tweaking the network is not a sufficient response.
There are, however, at least some places within the network where
there is no tide of rising bandwidth, and capacity is intrinsically scarce.
One example is the more expensive and limited links between local area
networks (or residences) and the public network. Even here, however,
58This presumes, of course, that one should meet the full range of requirements in a
single infrastructure with a single switching environment. This is not necessarily an opti-
mal outcome; while the Internet has been able to support a growing set of service classes
within a single network architecture, it is an open question what network models would
best support the broad range of communications service profiles.
some will argue that it is better to invest in increased capacity of the
gateway link than in mechanisms to allocate scarce bandwidth. As noted
above, wireless links are inherently limited in capacity and are therefore
candidates for QOS. Prospects for the use of Internet QOS technologies in
this context depend in part on whether QOS services are provided at the
Internet protocol layer or through specialized mechanisms incorporated
into the lower-level wireless link technology. Current plans for third-
generation wireless services favor the latter approach, suggesting that
this may not be a driver of Internet QOS.
Service quality, like security, is a weak-link phenomenon. Because
the quality experienced over a path through the Internet will be at least as
bad as the quality of the worst link in that path, quality of service may be
most effective when deployed end to end, on all of the links between
source and destination, including across the networks of multiple ISPs. It
may be the case that localized deployment of QOS, such as on the links
between a customer's local area network and its ISP, would be a useful
alternative to end-to-end QOS, but the effectiveness of this approach and
the circumstances under which it would prove useful are open questions.
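The weak-link property can be sketched by composing per-link measurements into a path-level figure. The link names and numbers below are hypothetical, and the calculation assumes that losses on different links are independent, which is a simplification.

```python
from math import prod


def path_quality(links):
    """Combine per-link figures into end-to-end figures for a path.

    Bandwidth is set by the slowest link, delays add up, and a packet
    is delivered only if every link delivers it (losses are assumed to
    be independent).
    """
    delivered = prod(1 - link["loss"] for link in links)
    return {
        "mbps": min(link["mbps"] for link in links),
        "delay_ms": sum(link["delay_ms"] for link in links),
        "loss": 1 - delivered,
    }


# Hypothetical path: customer access link, ISP backbone, interprovider link.
path = [
    {"name": "access", "mbps": 1.5, "delay_ms": 10, "loss": 0.02},
    {"name": "backbone", "mbps": 1000, "delay_ms": 20, "loss": 0.0},
    {"name": "interconnect", "mbps": 45, "delay_ms": 15, "loss": 0.05},
]

print(path_quality(path))
```

In this example the end-to-end loss rate is higher than that of the worst single link and the bandwidth is that of the slowest link, so improving only the well-provisioned backbone segment would leave the path quality essentially unchanged.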
The reality of today's Internet is that end-to-end enhancement of QOS
is a dim prospect. QOS has not been placed into production for end-to-
end service across commercial ISP networks. Providing end-to-end QOS
requires ISPs to agree as a group on multiple technical and economic
parameters, including on technical standards for signaling, on the
semantics of how to classify traffic and what priority each class should be assigned,
and on the addition of complex QOS considerations to their interconnec-
tion business contracts. Perhaps more significantly, the absence of com-
mon definitions complicates the process of negotiating QOS across all of
the providers involved end to end. ISP interest in differentiating their
service quality from that of their competitors is another potential disin-
centive to interprovider QOS deployment.
There are also several technical obstacles to deployment of end-to-
end QOS across the Internet. One challenge is associated with the routing
protocols used between network providers (e.g., Border Gateway Proto-
col, or BGP). While people have negotiated the use of particular methods
for particular interconnects, there are no standardized ways of passing
QOS information, which is needed for reliable voice (or other latency-
sensitive traffic) transport between provider domains. Also, today's rout-
ing technology provides limited control over which peering points inter-
provider traffic passes through, owing to a lack of symmetric routing and
the complexities involved in managing the global routing space.
Exchanging latency-sensitive traffic (such as voice) will, at a minimum,
require careful attention to interconnect traffic growth and routing
configurations.
While the original motivation for developing quality-of-service
mechanisms was support of multimedia, another factor has been respon-
sible for a sizable portion of recent interest in quality of service: ISPs that
wish to value-stratify their users, that is, to offer those customers who
place a higher value on better service a premium-priced service, need
mechanisms to allow them to do so. In practice, this may be achieved by
mechanisms to allocate relative customer dissatisfaction, degrading the
service of some to increase that of others. (Anyone who has flown on a
commercial airliner understands the basic principle: lower-fare-paying
customers in coach have fewer physical comforts than their fellow travel-
ers in first class, but they all make the same trip.) Value stratification may
be of particular interest in situations where there is a scarcity of band-
width and thus an interest in being able to charge customers more for
increased use, but value stratification may also find use under circum-
stances where ISPs are able to provision sufficient capacity to meet the
demands of their customers and customers perceive enough value in a
premium service to pay more for it.
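One way to picture "degrading the service of some to increase that of others" is weighted sharing of a scarce link. The sketch below uses weighted max-min fair sharing as a stand-in; it is not a description of any particular ISP's or vendor's mechanism, and the class names, weights, and demands are hypothetical.

```python
def allocate(capacity, demands, weights):
    """Weighted max-min fair allocation of link capacity.

    If total demand fits, everyone gets what they ask for. Otherwise,
    capacity is divided in proportion to the weights, and any share a
    user cannot use is redistributed among the remaining users.
    """
    if sum(demands.values()) <= capacity:
        return dict(demands)
    alloc = {}
    remaining = capacity
    active = set(demands)
    while active:
        total_weight = sum(weights[u] for u in active)
        # Users whose demand fits within their weighted share are satisfied.
        satisfied = {
            u for u in active
            if demands[u] <= remaining * weights[u] / total_weight
        }
        if not satisfied:
            for u in active:
                alloc[u] = remaining * weights[u] / total_weight
            break
        for u in satisfied:
            alloc[u] = demands[u]
            remaining -= demands[u]
        active -= satisfied
    return alloc


demands = {"premium": 8, "basic": 8}   # hypothetical demands, in Mbps
weights = {"premium": 3, "basic": 1}   # hypothetical service weights

print(allocate(20, demands, weights))  # ample capacity
print(allocate(10, demands, weights))  # scarce capacity
```

With ample capacity the weights have no effect and both classes receive their full demand. Under scarcity the premium class receives three times the share of the basic class, so the mechanism does not create capacity; it decides who bears the shortfall.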
There is a central tension in the debate over QOS. If the providers, in
order to make their customers happy, add enough capacity to carry the
imposed load, why would one need more complex allocation schemes?
Put another way, if there is no overall shortage of capacity, all that can be
achieved by establishing allocation mechanisms is to allocate relative dis-
satisfaction. Would providers intentionally underprovision certain classes
of users? As indicated above, the answer may be yes under certain mar-
keting and business plans. Such differentiation of service packages and
pricing is sustainable inasmuch as customers perceive differences and
are willing to pay the prices charged.
One consequence of the development of mechanisms that enable dis-
parate treatment of customer Internet traffic has been concern that they
could be used to provide preferential support for both particular custom-
ers and certain content providers (e.g., those with business relationships
with the ISP).59 What, for instance, would better service in delivery of
content from preferred providers imply for access to content from provid-
ers without such status? What people actually experience will depend
not only on capabilities possible from the technology and the design of
marketing plans but also on what customers want from their access to the
Internet and what capabilities ISPs opt to implement in their networks.
59See, for example, Center for Media Education. 2000. What the Market Will Bear: Cisco's
Vision for Broadband Internet. Washington, D.C.: Center for Media Education. Available
online at .
The debate over quality of service has been a long-standing one within
the Internet community. Over time, it has shifted from its original focus
on mechanisms that would support multimedia applications over the
Internet to mechanisms that would support a broader spectrum of poten-
tial uses. These uses range from efficiently enhancing the performance of
particular classes of applications over constrained links to providing ISPs
with mechanisms for value-stratifying their customers. The committee's
present understanding of the technology and economics of the Internet
does not support its reaching a consensus on whether QOS is, in fact, an
important enabling technology. Nor can it be concluded at this time
whether QOS will see significant deployment in the Internet, either over
local links, within the networks of individual ISPs, or more widely, in-
cluding across ISPs.
Research aimed at better understanding network performance, the
limits to the performance that can be obtained using best-effort service,
and the potential benefits that different QOS approaches could provide in
particular circumstances is one avenue for obtaining a better indication of
the prospects for QOS in the Internet. Another avenue is to accumulate
more experience with the effectiveness of QOS in operational settings;
here the challenge is that deployment may not occur without demon-
strable benefits, while demonstrating those benefits would depend at least
in part on testing the effectiveness of QOS under realistic conditions.