Ongoing efforts to develop and deploy improved networking technologies promise to greatly enhance the capabilities of the Internet. Protocols for quality of service (QOS) could enable vendors to offer guarantees on available bandwidth and latencies across the network. Advanced security protocols may better protect the confidentiality of messages sent across the network and ensure that data are not corrupted during transmission or storage. Broadband technologies, such as cable modems and digital subscriber line (DSL) services, have the potential to make high-speed Internet connectivity more affordable to residential users and small businesses. In combination, these capabilities will enable the Internet to support an ever-increasing range of applications in domains as disparate as national security, entertainment, electronic commerce, and health care.
The health community stands to benefit directly from improvements in QOS, security, and broadband technologies, even though it will not necessarily drive many of these advances. Health applications, whether supporting consumer health, clinical care, financial and administrative transactions, public health, professional education, or biomedical research, are not unique in terms of the technical demands they place on the Internet: nearly all sectors have some applications that demand enhanced QOS, security, and broadband technologies. Nevertheless, particular health applications require specific capabilities that might not otherwise receive much attention. Use of the Internet for video-based consultations with patients in their homes, for example, would call for two-way, high-bandwidth connections into and out of individual residences, whereas
video-on-demand applications require high bandwidth in only one direction. Human life may be at risk if control signals sent to medical monitoring or dosage equipment are corrupted or degraded, or if electronic medical records cannot be accessed in a timely fashion. Even when no lives are at stake, the extreme sensitivity of personal health information could complicate security considerations, and the provision of health care at the point of need, whether in the hospital, home, or hotel room, could increase demand for provider and consumer access to Internet resources via a variety of media.
This chapter reviews current efforts to improve the capabilities of the Internet and evaluates them on the basis of the needs of the health sector outlined in Chapter 2. Particular attention is paid to the need for QOS, for security (including confidentiality of communications, system access controls, and network availability), and for broadband technologies to provide end users with high-speed connectivity to the Internet. Also discussed are privacy-enhancing technologies, which are seen by many as a prerequisite for more extensive use of the Internet by consumers. The chapter identifies ways in which the Internet's likely evolution will support health applications and ways in which it may not. It gives examples of challenges that real-world health applications can pose for networking research and information technology research more generally. In this way, it attempts to inform the networking research community about the challenges posed by health applications and to educate the health community about the ways in which ongoing efforts to develop and deploy Internet technologies may not satisfy all their needs.
Quality of Service
Quality of service is a requirement of many health-related applications of the Internet. Health organizations cannot rely on the Internet for critical functions unless they receive assurances that information will be delivered to its destination quickly and accurately. For example, care providers must be able to retrieve medical records easily and reliably when needed for patient care; providers and patients must be able to obtain sustained access to high-bandwidth services for remote consultations if video-based telemedicine is to become viable. In emergency care situations, both bandwidth and latency may be critical factors because providers may need rapid access to large medical records and images from disparate sources connected to the Internet. Other applications, such as Internet-based telephony and business teleconferencing, demand similar technical capabilities, but the failure to obtain needed QOS in a health application might put human life at risk.
Compounding the QOS challenge in health care is the variability of a
health care organization's needs over the course of a single day. The information objects that support health care vary substantially in size and complexity. While simple text effectively represents the content of a care provider's notes, consultation reports, and the name-value pairs of common laboratory test results, many health problems require the acquisition and communication of clinical images such as X rays, computed tomography (CT), and magnetic resonance imaging (MRI). The electronic forms of these images, which often must be compared with one another in multiple image sets, comprise tens to hundreds of megabytes of information that may need to be communicated to the end user within several seconds or less. Medical information demands on digital networks are thus notable for their irregularity and the tremendous variation in the size of transmitted files. When such files need to be transmitted in short times, very high bandwidths may be required and the traffic load may be extremely bursty.
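The bandwidth implied by these file sizes and delivery deadlines can be sketched with a back-of-the-envelope calculation. The 200 MB study size and 5-second deadline below are illustrative assumptions, not figures taken from this chapter:

```python
def required_mbps(file_megabytes: float, deadline_seconds: float) -> float:
    """Sustained throughput (in Mbps) needed to deliver a file by a deadline."""
    return file_megabytes * 8 / deadline_seconds

# A hypothetical 200 MB image set delivered to a clinician within 5 seconds
# demands a sustained 320 Mbps, far beyond a typical access link.
print(required_mbps(200, 5))
```

Even modest-sounding deadlines thus translate into peak bandwidth demands orders of magnitude above a site's average load, which is precisely the burstiness described above.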
No capabilities have yet been deployed across the Internet to ensure QOS. Virtually all Internet service providers (ISPs) offer only best-effort service, in which they make every effort to deliver packets to their correct destination in a timely way but with no guarantees on latency or rates of packet loss. Round-trip times (or latencies) for sending messages across the Internet between the East and West Coasts of the United States are generally about 100 milliseconds, but latencies of about 1 second do occur, and variations in latency between 100 milliseconds and 1 second can be observed even during a single connection.1 Such variability is not detrimental to asynchronous applications such as e-mail, but it can render interactive applications such as videoconferencing unusable. Similarly, the rates of packet loss across the Internet range from less than 1 percent to more than 10 percent; high loss rates degrade transmission quality and increase latencies as lost packets are retransmitted. Furthermore, because many applications attempt to reduce congestion by slowing their transmission rates, packet loss directly affects the time taken to complete a transaction, such as an image transfer, over the network.
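The effect of loss on transfer times can be made concrete with the well-known Mathis et al. approximation for steady-state TCP throughput, rate ≈ (MSS/RTT) × (C/√p) with C ≈ 1.22; the segment size and round-trip time below are illustrative:

```python
import math

def tcp_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Mathis et al. approximation of steady-state TCP throughput:
    rate ~= (MSS / RTT) * (C / sqrt(p)), with C ~= 1.22 for standard TCP."""
    return (mss_bytes * 8 / rtt_s) * (1.22 / math.sqrt(loss_rate))

# With a 1,460-byte segment and a 100 ms round-trip time:
low_loss = tcp_throughput_bps(1460, 0.1, 0.01)   # ~1.4 Mbps at 1% loss
high_loss = tcp_throughput_bps(1460, 0.1, 0.10)  # ~0.45 Mbps at 10% loss
print(low_loss, high_loss)
```

Moving from 1 percent to 10 percent loss cuts achievable throughput by roughly a factor of three, which is why high loss rates so directly lengthen large transfers such as medical images.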
Several approaches can be taken to improve QOS across the Internet, with varied levels of effectiveness. For example, Internet users can upgrade their access lines to overcome bottlenecks in their links to ISPs, but such efforts affect bandwidth and latency into and out of their own site only. They provide no means for assuring a given level of QOS over any distance. Similarly, ISPs can attempt to improve service by expanding the capacity of their backbone links. However, as described below, such efforts provide no guarantees that bandwidth will be available when needed and contain no mechanisms for prioritizing message traffic in the face of congestion. To overcome these limitations, efforts are under way to develop specific protocols for providing QOS guarantees across the
Internet. These protocols promise to greatly expand the availability of guaranteed services across the Internet, but their utility in particular applications may be limited, as described below.
One approach taken by ISPs to improve their data-carrying capacity and relieve congestion across the Internet has been to dramatically increase the bandwidth of the backbones connecting points of presence (POPs).2 Today's backbone speeds are typically on the order of 600 megabits per second (Mbps) to 2.5 gigabits per second (Gbps), but some ISPs have considerably more bandwidth in place. A number of ISPs today have tens of strands of fiber-optic cable between their major POPs, with each strand capable of carrying 100 wavelengths using current wavelength division multiplexing (WDM) technology. Each wavelength can support 2.5 to 10 Gbps using current opto-electronics and termination equipment. Thus, an ISP with 30 strands of fiber between two POPs theoretically could support 30 terabits per second (Tbps) on a single inter-POP trunk line.3 This is enough capacity to support approximately 450 million simultaneous phone calls or to transmit the 40 gigabyte (GB) contents of the complete MEDLARS collection of databases in one-hundredth of a second.
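The arithmetic behind these figures can be checked directly; the only added assumption is the conventional 64 kbps channel per uncompressed telephone call:

```python
# 30 fiber strands x 100 wavelengths x 10 Gbps per wavelength = 30 Tbps.
trunk_bps = 30 * 100 * 10 * 1e9

calls = trunk_bps / 64e3                # 64 kbps per uncompressed phone call
medlars_s = (40 * 8 * 1e9) / trunk_bps  # 40 GB of MEDLARS data, in bits

print(f"{calls / 1e6:.0f} million simultaneous calls")  # roughly 469 million
print(f"{medlars_s:.4f} s to move 40 GB")               # roughly 0.0107 s
```

Both results are consistent with the chapter's figures of approximately 450 million calls and one-hundredth of a second for the MEDLARS collection.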
Even with this fiber capacity in the ground, most ISPs currently interconnect their POPs at speeds significantly lower than 1 Tbps, a situation that is likely to persist for the next few years. The limiting factors are the cost and availability of the equipment that needs to be connected to the fiber inside the POP. This equipment includes Synchronous Optical NETwork (SONET)4 termination equipment and the routers or switches that are required to forward packets between POPs. The SONET equipment is expensive, as are the routers and switches that connect to it, so ISPs have an incentive to deploy only enough to carry the expected traffic load. More importantly, routers are limited in terms of the amount of traffic they can support. As of late 1999, the leading commercial routers available for deployment could support 16 OC-48 (2.5 Gbps) interfaces, with a fourfold increase (e.g., to 16 × OC-192) expected to be deployable in the next 1 to 2 years. Terabit and multiterabit routers with a capacity at least six times greater than a 16 × OC-192 router are under development. Despite these increases in capability, routers most likely will continue to limit the bandwidth available between POPs for the foreseeable future. The commercial sector understands the need for faster routers and is addressing it, at least to meet near-term demands for higher link speeds. Additional research on very high speed routers may be justified to provide longer-term improvements in data-carrying capacity.
Increases in the bandwidth of the Internet backbone alleviate some of the concerns about QOS but may not completely eliminate congestion. Demand for bandwidth is growing quickly, and it appears that ISPs are deploying additional bandwidth just fast enough to keep up. Current traffic measurements indicate that some Internet backbone links are at or near capacity. Factors driving the growth in demand for bandwidth include the increasing number of Internet users, the increasing amount of time the average user spends connected to the Internet, and new applications that are inherently bandwidth-intensive (and that demand other capabilities, such as low latency and enhanced security). Nielsen ratings for June 1999 put the number of active Web users at 65 million for the month and average monthly online time per user at 7.5 hours, up from 57 million users and 7 hours per user just 3 months earlier.5 As an example of increasing bandwidth demands, medical image files that now contain about 250 megabytes (MB) of data are expected to top several gigabytes in the near future as the resolution of digital imaging technology improves.
Internet protocols further limit the capability of ISPs to provide QOS by simply increasing bandwidth. The Transmission Control Protocol (TCP), which underlies most popular Internet applications today, is designed to determine the bandwidth of the slowest or most congested link in the path traversed by a particular message and to attempt to use a fair share of that bottleneck bandwidth. This trait is important to the success of the Internet because it allows many connections to share a congested link in a reasonably fair way. However, it also means that TCP connections always attempt to use as much bandwidth as is available in the network. Thus, if one bottleneck is alleviated by the addition of more bandwidth, TCP will attempt to use more bandwidth, possibly causing congestion on another link. As a result, some congested links are almost always found in a network carrying a large amount of TCP traffic. Adding more capacity in one place causes the congestion to move somewhere else. In many cases, top-tier ISPs attempt to ensure that they have enough capacity so that the congestion occurs in other backbone providers' networks. The only way out of this quandary, apparently, is to provide so much bandwidth throughout the network that applications are unable to use it fully.
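TCP's bandwidth-seeking behavior is a consequence of its additive-increase/multiplicative-decrease (AIMD) control loop. The toy model below is a deliberately simplified caricature, not a faithful TCP implementation; it shows the congestion window perpetually probing up to whatever capacity the bottleneck offers, however large:

```python
def aimd_trace(bottleneck_pkts: float, rounds: int) -> list[float]:
    """Toy AIMD: grow the window by one packet per round-trip until the
    bottleneck is exceeded (taken as a loss signal), then halve it."""
    cwnd, trace = 1.0, []
    for _ in range(rounds):
        cwnd = cwnd / 2 if cwnd > bottleneck_pkts else cwnd + 1
        trace.append(cwnd)
    return trace

# Whatever capacity is provisioned, the sawtooth climbs to fill it:
print(max(aimd_trace(50, 200)), max(aimd_trace(500, 2000)))
```

Provisioning ten times the capacity simply raises the ceiling that the sawtooth climbs to, which is the mechanism behind congestion "moving somewhere else" when one link is upgraded.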
Applications that do not use TCP are not the solution, either, because they also tend to consume considerable bandwidth. Such nonadaptive applications are typically those involving real-time interaction, for which TCP is not well suited. Internet telephony is a good example of such an application. Although an individual call might use only a few kilobits per second, many current Internet telephony applications transmit data at a constant rate, regardless of any congestion along their path. Because these applications do not respond to congestion, large-scale deployment can lead to a situation called congestive collapse, in which links are so
overloaded that they become effectively useless. Furthermore, when these applications share links with TCP-based applications, the latter will respond to congestion to the point where they may become unusable. Short of deploying additional bandwidth in the Internet and replacing nonadaptive applications with adaptive ones, the primary approach to addressing this problem is either to equip routers with new mechanisms to prevent congestive collapse or to provide suitable incentives to encourage the development of adaptive applications.
More fundamental factors also limit the utility of increased bandwidth as a means of solving the QOS problem. Adequate bandwidth is a necessary but not sufficient condition for providing QOS. No user can expect to obtain guaranteed bandwidth of 100 Mbps across a 50-Mbps link; similarly, it is not possible to guarantee 10 Mbps each to 1,000 applications that share a common link unless that link has a capacity of at least 10 Gbps.6 The simple fact that Internet backbones are shared resources that carry traffic from a large number of users means that no single user can be guaranteed a particular amount of bandwidth unless dedicated allocation mechanisms are in place. In the absence of QOS mechanisms, it is impossible to ensure that delay-sensitive applications are protected from excessive time lags.7
In theory, ISPs could attempt to provide so much extra bandwidth to the Internet that peak demand could almost always be met and service quality would improve (a technique referred to as overprovisioning, used with some success in local area networks, or LANs). However, research indicates that overprovisioning is an inefficient solution to the QOS problem, especially when bandwidth demands vary widely among different applications, as is the case in health care. Overprovisioning tends not to be cost-effective for leading-edge, high-bandwidth applications, even those that can adapt to delays in the network. If the objective is to make efficient use of networking resources and provide superior overall service, then mechanisms that enable the network to handle heterogeneous data types appear preferable to the separation of different types of data streams (e.g., real-time video, text, and images) into discrete networks (Shenker, 1995).
A number of efforts are under way in the networking community to develop mechanisms for providing QOS across the Internet. The two main approaches are differentiated services (diff-serv) and integrated services (int-serv). Although they are very different, both attempt to manage available bandwidth to meet customer-specific needs for QOS. Both diff-serv and int-serv will enable greater use of the Internet in some health applications, but it is not clear that these programs will meet all the needs posed by the most challenging health applications.
Recent efforts in the Internet Engineering Task Force (IETF) have resulted in a set of proposed standards for diff-serv across the Internet (Blake et al., 1998). As the name implies, diff-serv allows ISPs to offer users a range of qualities of service beyond the typical best effort. The ISPs were active in the definition of these standards, and several are expected to deploy some variant of diff-serv in 2000.
Differentiated services do not currently define any mechanisms by which QOS levels could be determined for different communications sessions on demand; rather, initial deployment is likely to be for provisioned QOS that is agreed upon a priori. As a simple example, a customer of an ISP might sign up for premium service at a certain rate, say 128 kilobits per second (kbps). Such a service would allow the customer to send packets into the network at a rate of up to 128 kbps and expect them to receive better service than a best-effort packet would receive. Exactly how much better would be determined by the ISP. If the service were priced appropriately, then the provider might provision enough bandwidth for premium traffic to ensure that loss of a premium packet occurred very rarely, say once per 1 million packets sent. This would provide customers with high assurance that they could send at 128 kbps at any time to any destination within the ISP's network.
Many variations of this basic service are possible. The service description above applies to traffic sent by the customer; it is also possible to provide high assurance of delivery for a customer's inbound traffic. Similarly, an ISP could offer a service that provides low latency. It is likely that providers would offer several grades of service, ranging from the basic best-effort through premium to superpremium, analogous to coach, business, and first class in airline travel. A customer might sign up for several of these services and then choose which packets need which service. For example, e-mail might be marked for best-effort delivery, whereas a video stream might be marked for premium. Customers then would need to develop their own policies to determine which types of traffic flows would be transmitted at different QOS levels.
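Such a customer marking policy might look like the sketch below. The application-to-class mapping is purely illustrative, though the DSCP values shown for best effort (0), assured forwarding class AF41 (34), and expedited forwarding (46) are the standard IETF code points:

```python
# Standard differentiated-services code points (best effort, AF41, EF).
DSCP = {"best-effort": 0, "assured": 34, "expedited": 46}

# Hypothetical site policy: which traffic gets which grade of service.
SITE_POLICY = {
    "email": "best-effort",
    "image-transfer": "assured",
    "video-consult": "expedited",
}

def mark(application: str) -> int:
    """Return the DSCP value to stamp on this application's packets;
    anything not covered by the policy falls back to best effort."""
    return DSCP[SITE_POLICY.get(application, "best-effort")]

print(mark("email"), mark("video-consult"), mark("file-backup"))
```

The interesting design work is not in the lookup itself but in deciding, institution by institution, which traffic deserves the more expensive classes.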
Although diff-serv is an improvement over best-effort services, it has several limitations that might preclude its use for some health-related applications. First, research has shown that simple diff-serv mechanisms (e.g., those that classify QOS levels at the edge of the network and provide differential loss probabilities in the core) can be used to provide a high probability of meeting users' QOS preferences for point-to-point communications (Clark and Wroclawski, 1997). However, in the absence of significant overprovisioning and explicit signaling to reserve resources, such guarantees are probabilistic, which virtually precludes absolute, quantifiable service guarantees. The QOS provided by diff-serv depends largely on provisioning of the network to ensure that the resources available for premium services are sufficient to meet the offered load. The provision of sufficient resources to make hard guarantees may be economically feasible only if premium services are significantly more expensive than today's best-effort service. Indeed, ISPs need to have in place some incentive mechanism (such as increased charges for higher-quality service) to ensure that customers attempt to distinguish between their more important and less important traffic.
A second limitation is that diff-serv can be offered most easily across a single ISP's network. Current standards do not define end-to-end services, focusing instead on individual hops between routers in the network. There are no defined mechanisms for providing service guarantees for packets that must traverse the networks of several ISPs. For many service providers, offering diff-serv across their own networks is likely to be a valuable first step, especially for providers with a national presence that will be able to provide end-to-end service to large customers with sites in major metropolitan areas. Services like these are also valuable over especially congested links, such as the transoceanic links; again, these types of services could be offered by a single provider. Nevertheless, there obviously would be great value in obtaining end-to-end QOS assurance even when the two ends are not connected to the same provider. To some extent, the diff-serv standards have laid the groundwork for interprovider QOS, because packets can be marked in standard ways before crossing a provider boundary. A provider connecting to another provider is in some sense just a customer of that provider. Provider A can buy premium service from provider B and resell that service to the customers of provider A. However, the services that providers offer are not likely to be identical, so the prospect of obtaining predictable end-to-end service from many providers seems considerably less certain than does single-provider QOS.
Third, diff-serv does not currently allow users to signal a request for a particular level of QOS on an as-needed basis (as is possible with the integrated services model, described below). Health care organizations have widely varying needs for bandwidth over time. For example, a small medical center occasionally might need to transmit a mammography study of 100 MB in a short time interval, creating a need for high bandwidth over that interval, but it is unlikely to need even close to that amount of bandwidth on average. Thus, a dynamic model of QOS would be preferable. Diff-serv does not preclude such a model; it simply provides a number of QOS building blocks, which could be used to build a dynamic model in the future. A variety of means for dynamically signaling diff-serv QOS are under investigation by networking researchers.
Finally, the diff-serv approach may not provide a means of differentiating among service levels with sufficient granularity to meet the QOS needs of critical applications, such as remote control of medical monitoring or drug delivery devices. In the interests of scalability, diff-serv sorts traffic into a small number of classes; as a result, the packets from many applications and sites share the same class and can interfere with each other. For example, a physician downloading a medical image could inadvertently disrupt data flows from in-house monitoring equipment if they are on the same network and share a diff-serv class. Although policing of traffic at the edges of the network helps to ensure that applications of the same class do not interfere with each other, it does not completely isolate applications. Stronger isolation, and thus a larger number of classes, may be required for some demanding applications.
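Edge policing of the kind mentioned above is commonly implemented with a token bucket, which admits traffic up to a contracted rate and burst size and remarks or drops the excess. The sketch below is a minimal illustration; the rate and burst parameters are invented for the example:

```python
class TokenBucket:
    """Minimal token-bucket policer: tokens accrue at rate_bps up to
    burst_bits; a packet is in-profile only if enough tokens are on hand."""

    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate, self.burst = rate_bps, burst_bits
        self.tokens, self.last = burst_bits, 0.0

    def conforms(self, now_s: float, packet_bits: int) -> bool:
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now_s - self.last) * self.rate)
        self.last = now_s
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits  # in-profile: keep the premium marking
            return True
        return False                    # out-of-profile: remark or drop

tb = TokenBucket(rate_bps=128_000, burst_bits=12_000)
print([tb.conforms(t, 12_000) for t in (0.0, 0.01, 0.2)])
```

Policing of this kind keeps any one customer within its contract, but, as noted above, it does not isolate the conforming flows of different customers from one another inside a shared class.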
In contrast to the diff-serv model, int-serv (Braden et al., 1994) provides quantifiable, end-to-end QOS guarantees for particular data flows (e.g., individual applications) in networks that use the IP.8 The guarantees take the form of "this videoconference from organization A to organization B will receive a minimum of 128 kbps throughput and a maximum of 100 milliseconds end-to-end latency." To accommodate such requests, int-serv includes a signaling mechanism called resource reservation protocol (RSVP) that allows applications to request QOS guarantees (Braden et al., 1997).9 Int-serv provides a service model that in some ways resembles that of the telephone network, in that service is requested as needed. If resources are available to provide the requested service, then the service will be provided; if not, then a negative acknowledgment (equivalent to a busy signal) is returned. For this reason, int-serv already is being used in some smaller networks to reserve bandwidth for voice communications.
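The essence of this reservation model is per-hop admission control: a flow is admitted only if every router on the path can commit the requested bandwidth, and otherwise the network returns the equivalent of a busy signal. The sketch below illustrates that all-or-nothing logic; the classes and capacities are invented for the example and are not the actual RSVP machinery:

```python
class Hop:
    """One router/link on the path, with a fixed reservable capacity."""
    def __init__(self, capacity_kbps: int):
        self.capacity, self.reserved = capacity_kbps, 0

def reserve(path: list[Hop], kbps: int) -> bool:
    """Admit the flow only if every hop can commit kbps; all-or-nothing."""
    if all(h.capacity - h.reserved >= kbps for h in path):
        for h in path:
            h.reserved += kbps
        return True
    return False  # the "busy signal": some hop lacks the resources

path = [Hop(1000), Hop(256), Hop(1000)]  # the 256 kbps hop is the bottleneck
print(reserve(path, 128), reserve(path, 128), reserve(path, 128))
```

The third request fails because the bottleneck hop is fully committed, which is exactly the negative acknowledgment described above.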
Several obstacles stand in the way of the deployment of int-serv across the Internet. The major concern is scalability. As currently defined, every application flow (e.g., a single video call) needs its own reservation, and each reservation requires that a moderate amount of information be stored at every router along the path that will carry the application data. As the network itself grows and the number of reservations increases, so does the amount of information that must be stored throughout the network.10 The prospect of having to store such information in backbone routers is not attractive to ISPs, for which scalability is a major concern.11
Additional impediments arise from difficulties in administering int-serv reservations that cross the networks of multiple ISPs. Methods are needed for allocating the costs of calls that are transmitted by multiple
ISPs; new ways of billing users and making settlements among ISPs may be required. These are QOS policy issues, discussed below. It is clear that the management of reservations that traverse multiple ISPs, each with its own administrative and provisioning policies, will be quite complex. Solutions to these problems must address the possibility that one or more ISPs might experience a failure during the life of a reservation, resulting in the need to reroute the traffic significantly (Birman, 1999). Such concerns, if not successfully addressed, could slow or thwart the deployment of int-serv capabilities throughout the Internet.
Alternative Quality of Service Options
Given the difficulties of existing approaches, a promising avenue of research focuses on QOS options that lie somewhere between diff-serv and int-serv. The goal of such approaches is to provide finer granularity and stronger guarantees than are provided by diff-serv while avoiding the scaling and administrative problems associated with int-serv's per-application reservations. One such approach, which is being pursued in the Integrated Services over Specific Link Layers working group of the IETF, combines the end-to-end service definitions and signaling of int-serv with the scalable queuing and classification techniques of diff-serv.12
Another approach, referred to as virtual overlay networks (VONs), uses the Internet to support the creation of isolated networks that would link multiple participants and offer desired levels of QOS, including some security and availability features (Birman, 1999). This approach would require routers to partition packet flows according to tags on the packets called flow identifiers. This process, in effect, allows the router to allocate a predetermined portion of its capabilities to particular tagged flows. Traffic within a tagged flow would compete with other packets on the same VON but not with traffic from other flows. An individual user (e.g., a hospital) could attempt to create multiple VONs to serve different applications so that each network could connect to different end points and offer different levels of service. Substantial research would be required before a VON could be implemented. Among the open questions are how to specify properties of an overlay network, how to dynamically administer resources on routers associated with an overlay network, how to avoid scaling issues as the number of overlays becomes large, and how to rapidly classify large numbers of flows.
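One way to think about per-VON resource allocation at a router is weighted partitioning of link capacity among tagged flows, so that traffic in one overlay cannot starve another. The sketch below is purely illustrative; the overlay names and weights are invented for the example:

```python
def von_allocations(link_mbps: float, weights: dict[str, int]) -> dict[str, float]:
    """Give each tagged overlay an isolated share of the link,
    proportional to its configured weight."""
    total = sum(weights.values())
    return {von: link_mbps * w / total for von, w in weights.items()}

# Hypothetical hospital overlays on a 622 Mbps (OC-12) link:
# patient monitoring is small but must be isolated from bulk image traffic.
shares = von_allocations(622.0, {"monitoring": 1, "imaging": 6, "admin": 3})
print(shares)
```

The hard research questions listed above (dynamic administration, scaling in the number of overlays, fast classification) are precisely what this static sketch leaves out.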
Quality of Service Policy
Because QOS typically involves providing improved service to some sets of packets at the expense of others, the deployment of QOS technologies requires a supporting policy infrastructure. In some applications, it is acceptable for QOS assurance to be lost for some short period of time, provided that such lapses occur infrequently. In other applications, however, the QOS guarantee must be met at all times, unless the network has become completely partitioned by failures. Work is needed to develop means of providing solid guarantees of QOS for critical information. Such mechanisms must scale well enough to be deployable in the Internet and will involve matters of policy (whose traffic deserves higher priority) as well as of technology.
In the int-serv environment, QOS policy is required to answer questions such as whether a particular request for resources should be admitted to the network, or whether the request should preempt an existing guarantee. In the former case, a decision to admit a reservation request might be based on some credentials provided by the requesting organization. For example, if a particular health care organization has paid for a certain level of service from its ISP and the request carries a certificate proving that it originates from that organization, then the request is admitted. In the latter case, a request might contain information identifying it as critically important (e.g., urgent patient monitoring information) so that it could preempt, say, a standard telephone call that previously had reserved resources.
It is difficult to predict all the possible scenarios in which policy information might play a role in the allocation of QOS. The RSVP provides a flexible mechanism by which policy-related data (e.g., a certificate identifying a user, institution, or application) can be carried with a request. The Common Open Policy Service protocol has been defined to enable routers processing RSVP requests to exchange policy data with policy servers, which are devices that store policy information, such as the types of request that are allowed from a certain institution and the preemption priority of certain applications. Policy decisions are likely to be complex, because of the nature of health care and the number of stakeholders involved in decision making. Accordingly, the design of policy servers, which are responsible for storing policy data and making policy decisions, would benefit from the input of the health care community.
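A policy server's admission logic might combine credentials, available capacity, and preemption priority along the following lines. All field names, priority levels, and figures are invented for illustration; they are not part of any defined protocol:

```python
def decide(request: dict, active: list[dict], free_kbps: int) -> str:
    """Illustrative policy-server decision: admit, admit by preempting
    lower-priority reservations, or reject."""
    if not request.get("credential_valid"):
        return "reject"                # no proof of entitlement to resources
    if request["kbps"] <= free_kbps:
        return "admit"
    preemptable = sum(r["kbps"] for r in active
                      if r["priority"] < request["priority"])
    if request["kbps"] <= free_kbps + preemptable:
        return "admit-with-preemption"  # e.g., urgent patient monitoring
    return "reject"

active = [{"kbps": 64, "priority": 1}]  # an ordinary reserved phone call
urgent = {"credential_valid": True, "kbps": 96, "priority": 9}
print(decide(urgent, active, 50))
```

Even this toy version makes clear where health-community input is needed: someone must decide which credentials are honored and which applications outrank which others.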
Policy also has a role in a diff-serv environment. For example, if an institution has an agreement with an ISP that it may transmit packets at a rate of up to 10 Mbps and will receive some sort of premium service, then the question of exactly which packets get treated as premium and which as standard is one of policy. The institution may wish to treat e-mail traffic as standard and use its allocation of premium traffic for more time-critical applications. There may be some cases in which data from the same application will be marked as either premium or standard, depending on other criteria. For example, a mammogram that will not be read by
the remote radiologist until tomorrow might safely be sent by best-effort service, whereas one that is to be read while the patient remains in the examination room could be sent by premium service. Mechanisms for enforcing policy in a diff-serv environment are currently being defined at the IETF.
Multicast

Concurrent with efforts to implement QOS mechanisms for the Internet, attempts are under way to deploy multicast capability, which provides a means to make more efficient use of available bandwidth to simultaneously distribute information from one user to a number of specific recipients. Multicast stands in contrast to today's unicast delivery model, in which users communicate on a one-to-one basis, and to the broadcast model of radio and television, in which a single transmitter sends information out to a large number of unspecified recipients. Multicast makes large-scale multiparty conferencing possible in a way that makes efficient use of network bandwidth. It also provides an efficient mechanism for the distribution of streaming media to many recipients concurrently. At the same time, multicast presents new challenges regarding suitable pricing schemes and ways of protecting ISP networks from potential abuse. A related trend is the development of reliable multicast, which attempts to enable the reliable delivery of data from one source to many destinations, even in the presence of occasional packet loss in the network. A typical application is the timely delivery of financial information to many recipients. This technology is receiving a great deal of attention in the research and development communities and is likely to become mature within the next few years.
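The bandwidth argument for multicast at the sender is simple arithmetic: with unicast the source transmits one copy of the stream per receiver, whereas with multicast it transmits one copy total and routers replicate packets only where paths diverge. The stream rate and audience size below are illustrative:

```python
def sender_load_mbps(stream_mbps: float, receivers: int, multicast: bool) -> float:
    """Outbound load at the source for one stream sent to `receivers` sites."""
    return stream_mbps if multicast else stream_mbps * receivers

# A hypothetical 1.5 Mbps medical lecture streamed to 500 sites:
print(sender_load_mbps(1.5, 500, multicast=False))  # 750.0 Mbps via unicast
print(sender_load_mbps(1.5, 500, multicast=True))   # 1.5 Mbps via multicast
```

The savings inside the network are harder to state in one line, since they depend on where the receivers' paths share links, but the sender-side contrast alone explains the interest in the technology.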
Both multicast and reliable multicast are likely to be useful in a range of applications, including health. Multicast technologies could be used to provide continuing medical education online through the real-time transmission of lectures over the Internet. They could support teleconsultations among geographically dispersed public health officials responding to a perceived public health hazard or bioterrorist attack. Multicast also could facilitate collaborative consultations among physicians, medical specialists, and patients.
Health care applications of multicast may emphasize different design and implementation features than would applications in other domains. For example, users in the health arena may be unlikely to create large multicast groups consisting of a single primary transmitter of information and millions of receivers, an approach more suited to the entertainment industry. Likewise, health organizations might not adopt the model of defense simulations, in which thousands of participants send information
to each other on a regular basis to report on their location, speed, and orientation.13 Instead, a health application might involve large numbers of small multicast groups featuring the formation of dynamic memberships to link collaborating physicians. The technical considerations that go into designing multicast protocols that can support large numbers of small groups may differ from those that can support smaller numbers of large groups. Health care applications need to be considered soon to ensure that suitable multicast capabilities are developed.
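The dynamic, explicit group membership described above is visible in the standard socket API, where a host joins and leaves a multicast group at will. A minimal sketch in Python follows; the group address, port, and interface are hypothetical placeholders, and actually receiving traffic would require a multicast-capable network and a sender.

```python
import socket
import struct

# Sketch of joining an IP multicast group with the standard socket API.
# A small, dynamically formed collaboration group could use any address
# in the administratively scoped 239.0.0.0/8 range.
GROUP, PORT = "239.1.2.3", 5000

# IP_ADD_MEMBERSHIP takes the group address plus the local interface
# (0.0.0.0 lets the kernel choose); IP_DROP_MEMBERSHIP with the same
# 8-byte structure leaves the group again.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP),
                   socket.inet_aton("0.0.0.0"))

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
try:
    sock.bind(("", PORT))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    # sock.recvfrom(4096) would now receive datagrams sent to the group.
except OSError:
    pass  # joining requires a multicast-capable network interface
finally:
    sock.close()

print(len(mreq))  # → 8
```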
Security is a top priority for health applications of the Internet. Whenever personal health information is transmitted across the Internet or stored in a device attached to the network, precautions must be taken to ensure that the information is (1) available to those who need it, (2) protected against those lacking proper credentials, and (3) not modified, either intentionally or unintentionally, in violation of established policies and procedures. These three requirements (referred to as availability, confidentiality, and integrity) are of concern in most health care applications of the Internet, whether they involve the transfer of personal medical records between health care providers or between a provider and a plan administrator, video telemedicine consultations, the reporting of information in a home monitoring situation, or the use of remote equipment in a biomedicine experiment (in which the data may be considered proprietary before formal publication).
Strengthening system security to better protect personal health information entails costs. Computing resources are needed to implement the changes, convenience may be compromised if system users are required to perform added steps (such as typing in additional passwords), and additional employees may be needed to monitor controls and investigate alleged violations of confidentiality. Furthermore, poorly designed security controls may actually impede health care delivery in emergencies by preventing or slowing access to needed information. It is therefore essential that system designers balance the potential costs of security mechanisms against their intended benefits, a process that requires assessing anticipated threats to personal health information, however dynamic and diffuse those threats may be.
To date, malicious attempts to sabotage the availability or integrity of electronic health information have been rare.14 However, the confidentiality of electronic medical records has on some occasions been compromised by individuals such as health care providers or administrators who have legitimate access to some aspect of an electronic record. Indeed, a previous study by the Computer Science and Telecommunications Board
(1997a) found that the most significant threats to patient privacy stem not from violations of established confidentiality policies or security practices but from the routine sharing of patient health information among care providers, administrators, public health officials, pharmacy benefits managers, direct marketers, and the like, most of which occurs without a patient's knowledge or consent. The effects of such compromises on the patient or consumer can range from embarrassment to loss of employment or loss of health insurance. In general, such concerns cannot be addressed by the use of security technologies but are the province instead of organizational policies and government legislation that set forth acceptable information-sharing practices.
Nevertheless, security technologies are increasingly important in an Internet environment. Health organizations have tended to rely on trust among health professionals to maintain the confidentiality of personal health information and have favored broad access to information (with some form of review of accesses) over strict controls. Connection to the larger, public Internet will require a new strategy. Health organizations contemplating the Internet as a source of networking infrastructure for critical transactions will need to be assured that they are protected against security risks while making their systems and data available to those who need them. The increasing interconnection of devices that monitor patients and devices that deliver treatment to them increases the possibility and potential consequences of attacks, particularly if the interconnections traverse the Internet. The very act of connecting health information systems to the Internet introduces a number of new vulnerabilities that malicious attackers can exploit. For example, data transmitted across the network can be intercepted and interpreted if it is not encrypted properly. Executable code downloaded from remote sites (or embedded in seemingly innocuous e-mail messages) can alter or destroy data contained in information systems. Denial-of-service attacks can limit system availability.
Elements of Security
To ensure availability, confidentiality, and data integrity, a system requires a number of supporting security functions, including the following:
• Authentication mechanisms to verify that people or systems are who they purport to be;
• Access controls to ensure that authorized entities can access and/or manipulate protected resources only in accordance with a predetermined set of rules and privileges;
• Encryption to protect data by scrambling it so that it cannot be read or interpreted by those without the proper decryption key;
• Perimeter control to manage interconnections between an organization's internal network and external networks (such as the Internet);
• Attribution/nonrepudiation to ensure that actions taken (such as sending or receiving a message) are reliably traceable and cannot be denied; and
• Resistance to denial-of-service attacks.
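Two of the listed functions, authentication and data integrity, can be illustrated with a keyed hash from Python's standard library. The sketch below assumes a pre-shared secret key (the key and message are hypothetical); it is a toy illustration of the concept, not a complete security design, which would also need key management and replay protection.

```python
import hashlib
import hmac

# Minimal sketch of message authentication and integrity checking with a
# keyed hash (HMAC-SHA256). The shared key is an assumption; in practice
# it would come from a key-management service, not a literal in code.
KEY = b"shared-secret-key"

def protect(message: bytes) -> bytes:
    """Append a 32-byte authentication tag so tampering can be detected."""
    tag = hmac.new(KEY, message, hashlib.sha256).digest()
    return message + tag

def verify(blob: bytes) -> bool:
    """Check that the tag matches the message (constant-time compare)."""
    message, tag = blob[:-32], blob[-32:]
    expected = hmac.new(KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

blob = protect(b"glucose=5.4 mmol/L")
print(verify(blob))                # → True
print(verify(b"X" + blob[1:]))     # tampered message → False
```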
These functions are not necessarily performed by the network itself, but the need for them is heightened by the interconnection of information resources to a network, which can expand the scope of potential threats to information systems. In fact, the most appropriate place to perform these functions varies. For example, encryption by end hosts may be used to protect the confidentiality and check the integrity of application data. Other functions, such as authentication of routing updates, need to be supported by network elements such as routers. Security features operate at different network layers (Box 3.1), a variable that affects functionality.
The implementation of security requires a mix of technological solutions and institutional policies and procedures. Institutional policies define the rules to be enforced; these rules then are allocated to administrative procedures, physical security measures, and technology. For example, organizations cannot simply deploy access control technologies to limit access to online data without first establishing rules to determine which users have the authority to view and alter information and under what conditions. In addition to implementing technologies such as passwords and smart cards to authenticate users, organizations also need to institute procedures for issuing, changing, and revoking passwords, and policies must be in place for disciplining offenders.
Today's Internet provides no security capabilities at the network level. The Internet was originally designed to facilitate information exchanges among mutually trusting entities (such as collaborating researchers), so information is encapsulated into packets that are passed through the network from node to node without encryption. Software programs called "sniffers" can be run on any node through which packets pass and can scan the contents of a message, even if the message contains sensitive health information or a user's password. As devices are attached to the Internet, their vulnerabilities tend to become more accessible. For example, weaknesses in operating systems can be exploited quickly over the Internet by individuals who wish to gain unauthorized access to resources. The large installed base of operating systems with known security flaws means that a large number of end points on the Internet are potentially vulnerable to attack.15 These shortcomings will become more significant as the demand for Internet-mediated health transactions grows and as the number of potential users increases.16
Several efforts are under way to improve the security of the Internet. Work continues on firewalls, which attempt to limit and control Internet-based access to an organization's computing resources and, hence, support the objectives of confidentiality, integrity, and access. In addition, new protocols and standards are being developed that will authenticate end points and provide greater confidentiality of information transmitted across the network. These advances promise to provide greatly increased security across the Internet, but work will be needed to ensure their deployment, and the deployment of the complementary capabilities needed to ensure their use. Primary among these complementary tools are certificate authorities (described below), which will help validate the identity of end users and which pose a number of technical and organizational challenges.
A firewall is a device that isolates one part of a network from another, typically isolating an organization's trusted internal network from an untrusted outside network, such as the Internet. The function of a firewall is to limit access to the organization's network from the outside by allowing only presumably safe traffic to pass through. Firewalls do so using one of four basic designs (Box 3.2) operating at three different networking layers. (This is significant because firewalls cannot prevent attacks launched at layers higher than those at which the firewall operates.) Typically, firewalls block all traffic from outside the organization's network by default and then are configured to allow specific, limited access. For example, a firewall might be configured to allow e-mail messages from outside the corporate network to pass through as long as they are destined to the appropriate mail server. Similarly, a firewall might allow access to an organization's public Web server but not to other Web servers
that are used for proprietary, internal information. Thus, a company can maintain a public presence on the Internet without opening up its entire network to unlimited access by outsiders.
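The default-deny configuration just described can be sketched as a rule table consulted for each connection attempt. The host names, ports, and rules below are hypothetical; a production firewall would match on addresses, protocols, and connection state rather than names.

```python
# Sketch of a default-deny packet filter of the kind described above.
# Hypothetical rule table: allow inbound mail only to the mail server
# and Web traffic only to the public Web server; block everything else.
ALLOW_RULES = [
    ("mail.example.org", 25),    # SMTP to the mail server only
    ("www.example.org", 80),     # HTTP to the public Web server only
]

def permit(dest_host: str, dest_port: int) -> bool:
    """Default deny: pass traffic only if an explicit rule matches."""
    return (dest_host, dest_port) in ALLOW_RULES

print(permit("mail.example.org", 25))      # → True
print(permit("intranet.example.org", 80))  # → False (internal server)
```

The default-deny stance is what lets the organization maintain a public presence while keeping the rest of its network closed.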
Despite their popularity, firewalls have several limitations. One problem is that they must be deployed at all places where an organization's network connects to the outside world. Although such networks often are designed to limit such external connectivity to one or two points, it is difficult to prevent unauthorized connectivity at other points. Individual users may establish dial-up modem connections to the Internet without the knowledge of the network administrators, creating a back door into the corporate network. If the network can be compromised at such a point, then firewalls provide no protection. Malicious intruders can gain access to the organization's network and the computers attached to it and may even acquire control of desktop computers.17 In addition, because each of the specific filtering functions of a firewall must be configured, the issue of correctness of configuration is critical; the mere presence of a firewall is not an assurance that any given set of protections has been established effectively.
Even properly configured firewalls in networks without back doors have limitations. First, many firewalls are constructed as applications sitting atop a standard operating system, so they are vulnerable to attacks against that system. A few firewalls use custom-built operating systems that avoid the vulnerabilities inherent in other systems, but these custom systems can introduce new vulnerabilities that have not been detected and remedied through the usual process of widespread use. Second, as noted earlier, a firewall cannot prevent attacks launched at layers higher than those at which the firewall operates. For example, a packet-filtering firewall cannot protect against attacks conveyed as e-mail attachments because it cannot understand attachment types and their risks. Third, firewall function is limited by the use of end-to-end cryptography. Firewalls cannot examine any fields in packets that have been encrypted or modify packets that must be cryptographically authenticated. Hence, many systems tend to decrypt at the firewall or use a layered approach in which some encryption or decryption takes place at the firewall and some inside it.
Fourth, firewalls tend to limit the external electronic communications of those behind the firewall. By filtering out (i.e., blocking) incoming messages that do not meet the specified access provisions, firewalls prevent internal users from receiving messages from some sources. Such a trade-off is inherent in the design of a firewall, which must use some fairly static rules to filter incoming messages. Fifth, firewalls increasingly are challenged by the advent of higher-speed Internet connections, which require the examination of packets at an ever-increasing rate. Sixth, and
perhaps most importantly, firewalls are effective only against external threats to an organization's data networks; they do not address threats posed by internal users who may intentionally or unintentionally violate security and confidentiality policies. Organizations that focus their security efforts too narrowly on external threats may not adequately protect themselves from insiders. In many industries, including health care, the insider threat historically has been of greater concern.
As the scope and use of Internet applications grow, increased attention has been devoted to the development of protocols for user authentication and protection of messaging traffic. The protocols available today operate at a number of layers in the network. Whereas early protocols used in the military tended to provide link-level security, more recent protocols, such as Internet Protocol Security (IPSec), Transport Layer Security (TLS), and Pretty Good Privacy (PGP), tend to operate at the network, transport, and application levels, respectively. All of these are proposed IETF standards and all can be expected to see more widespread application. All use encryption as the basis for authentication and confidentiality.
Encryption technologies generally are classified as either symmetric key systems (also called private key cryptography) or asymmetric key systems (also called public key cryptography), both of which may be used during the course of a single session between communicating parties. The two technologies differ in a number of respects, including the time it takes to encrypt and decrypt messages and ease of administration. As a result, they tend to be best suited to different types of applications. Symmetric encryption, for example, tends to work better when communicating parties have a preexisting relationship. Asymmetric systems, in contrast, work well between parties that have not communicated before, as in many electronic commerce applications; however, revocation of credentials can be more difficult than with symmetric encryption systems.
Symmetric encryption uses a single key to encrypt and decrypt data. Parties wishing to exchange information securely must ensure that they both have access to the key, meaning that mechanisms must be in place for distributing keys to pairs of users before secure transmissions can begin. This process sometimes involves physically distributing disks containing the key, but this approach is slow and can be used only if the communicating parties can be identified in advance. A number of mechanisms, including online key distribution centers,18 have been established to facilitate the electronic distribution of symmetric keys on an as-needed basis. Many such mechanisms rely on asymmetric key cryptography to authenticate parties and distribute keys before they exchange sensitive information. In either case, care must be taken to ensure that keys are not divulged to others, are changed periodically, and are revoked as needed.
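The single-key property can be illustrated with a deliberately toy stream cipher: applying the same operation with the same key twice restores the plaintext. The SHA-256-derived keystream below merely stands in for a vetted cipher such as AES and must not be used in practice; the key and message are hypothetical.

```python
import hashlib

# Toy illustration of symmetric encryption: the SAME key both encrypts
# and decrypts. The keystream (SHA-256 in counter mode) is a sketch of
# the idea only; real systems use vetted ciphers such as AES.
def keystream(key: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def crypt(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; applying it twice restores the data."""
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

key = b"key-obtained-from-a-distribution-center"
ciphertext = crypt(key, b"patient record 1042")
print(crypt(key, ciphertext))  # → b'patient record 1042'
```

The fact that both parties must first share `key` is exactly the distribution problem the text describes.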
Asymmetric (public key) cryptography is an important component of Internet security because it provides a way in which strangers can establish the necessary set of shared information to support authentication and encryption. This means it could be useful in exchanges of patient health records between two unaffiliated hospitals. In an asymmetric key system, a given user (e.g., an individual or corporation) has a pair of keys, one of which is private (known only to the key owner) and one of which is public and may be shared with anyone (or posted in a directory). Data encrypted using the private key can only be decrypted using the public key; and data encrypted with the public key can only be decrypted with the private key. Private keys also can be used to support authentication in the following way: If a recipient can decrypt a message using the sender's public key, then the original message could have come only from the holder of the private key. Conversely, a user wishing to send a message to be read only by its intended recipient can guarantee confidentiality by using the recipient's public key to encrypt the message; the resulting data will be readable only by the owner of the private key. Asymmetric encryption systems require keys about 10 times longer than those of symmetric encryption systems and run considerably more slowly. For this reason, asymmetric cryptography is used only to authenticate the participants in an information exchange and to distribute symmetric keys, which then are used to protect messages exchanged during the remainder of the session.
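The key-pair relationship just described can be shown with textbook RSA on deliberately tiny numbers. Real keys are thousands of bits long and real systems add padding and hashing; this is arithmetic illustration only.

```python
# Textbook RSA with tiny numbers, purely to illustrate the public/private
# key-pair relationship; never usable as actual cryptography.
p, q = 61, 53
n = p * q                      # public modulus (3233)
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent (published)
d = pow(e, -1, phi)            # private exponent (kept secret): 2753

message = 65
ciphertext = pow(message, e, n)    # anyone can encrypt with the public key
print(pow(ciphertext, d, n))       # only the private key decrypts → 65

signature = pow(message, d, n)     # "signing": encrypt with the PRIVATE key
print(pow(signature, e, n))        # anyone can verify with the public key → 65
```

The second pair of lines is the authentication property from the text: if the public key recovers the message, only the private-key holder could have produced it.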
Distribution of Encryption Keys
A major challenge in using asymmetric cryptography is the distribution of public keys. A person who wishes to use a certain public key for either encryption or authentication needs to know for certain that the key belongs to the appropriate entity; otherwise, authentication is not possible, and encrypted data may be read by an unintended recipient. This problem usually is handled by having some sort of certification authority (CA) issue certificates: digitally signed documents that state "public key X belongs to entity Y." A certificate can be trusted only if the reader knows the public key for the CA. Thus, if a user obtains a single authoritative public key (that of the CA), any other entity that needs to prove
ownership of a certain public key to that user can do so by providing a certificate for the key.
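This signed-statement model can be sketched with the same toy RSA arithmetic: the CA signs a hash of the binding with its private key, and anyone holding the CA's public key can verify it. The CA parameters, subject name, and subject key below are all hypothetical, and the tiny numbers are illustrative only.

```python
import hashlib

# Toy model of a certificate: a statement "public key X belongs to Y"
# signed with the CA's private key.
CA_P, CA_Q = 61, 53
CA_N = CA_P * CA_Q
CA_E = 17                                     # CA's public key
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1)) # CA's private key

def digest(statement: str) -> int:
    # Hash the statement, reduced mod CA_N so the toy RSA can sign it.
    h = hashlib.sha256(statement.encode()).digest()
    return int.from_bytes(h, "big") % CA_N

def issue_certificate(subject: str, subject_key: int) -> dict:
    statement = f"public key {subject_key} belongs to {subject}"
    return {"statement": statement,
            "signature": pow(digest(statement), CA_D, CA_N)}

def verify_certificate(cert: dict) -> bool:
    # Anyone holding only the CA's PUBLIC key (CA_E, CA_N) can check this.
    return pow(cert["signature"], CA_E, CA_N) == digest(cert["statement"])

cert = issue_certificate("General Hospital", subject_key=9973)
print(verify_certificate(cert))            # → True
cert["signature"] = (cert["signature"] + 1) % CA_N
print(verify_certificate(cert))            # tampered signature → False
```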
One difficulty involved in issuing certificates is scale: Large numbers of certificates need to be given to all potential participants in various types of transactions. Health care would benefit greatly from the issuance of certificates to all health care consumers (the entire U.S. population). Such widespread deployment of certificates would enable consumers to gain authenticated access to sensitive health care information (e.g., lab test results) from any provider. Thus, a public key infrastructure (PKI) suitable for health care would need to have the capacity to operate on a scale of hundreds of millions of users. Good initial progress has been made in issuing certificates to Internet vendors of goods and services, but the process has not been extended to individual consumers. One way to make the process more scalable is to arrange CAs in a hierarchy, with the root CA certifying lower-level CAs that issue certificates to even lower-level entities, and so on down to the level at which certificates for individual users are issued. In this case, the process of proving that one is the owner of a certain public key may involve providing a hierarchical chain of certificates that leads from the root down to the owner of the key. The term PKI often is applied to the general problem of distributing keys and certificates to a large population in a scalable way.
The task of building a hierarchy of CAs presents its own challenges.19 Most notable of these is the establishment of consistent policies for issuing certificates. One CA, for example, might give certificates to anyone who requests one by e-mail, whereas another might require recipients to sign an affidavit and provide a birth certificate and passport before providing a certificate. In the system used to support PGP for secure e-mail, any user with a public key can issue a certificate, creating a "web of trust" among particular groups of users rather than a strict hierarchy. PGP also allows users to decide how well they trust a certain certificate based on who it is from and how many corroborating certificates the user holds. In a hierarchical model, if just one CA in the hierarchy uses weak procedures to establish an individual's identity (e.g., issuing certificates to individuals without seeing them in person with positive proof of their identity), then the whole certification system is compromised, because anyone might be able to get a fake certificate from that CA. Such policy differences make it difficult for users and applications to interpret certificates and determine which ones meet their criteria of proof of identity, effectively undermining the goal of enabling broad deployment of asymmetric cryptography. One way to address this issue (proposed in the Privacy Enhanced Mail architecture but not widely deployed) is to have the top-level CA certify the policies used by lower-level CAs, so that weaker policies can be readily identified. However, it is clear that variability of policies among CAs will
add complexity to the system and is likely to weaken the overall level of trust that can be placed in public keys.
Another challenge is certificate revocation, which may be required if the owner of a public/private key pair believes that the private key has been compromised. Revocation typically is handled by a combination of expiration dates on certificates and revocation lists, which are published lists of certificates that are to be considered invalid, signed by the issuing CA. To be sure that a certificate is valid, therefore, it is essential to have the most up-to-date revocation list from the CA that issued the certificate. The use of expiration dates on all certificates ensures that revocation lists do not grow infinitely large, but it requires all users to undergo periodic recertification, thus increasing the workload on the CAs. An effective PKI must have an efficient means for disseminating up-to-date revocation lists.
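The validity check described here combines the two mechanisms: a certificate is trusted only if it has not expired and does not appear on the issuing CA's latest revocation list. A sketch follows; the serial numbers and dates are hypothetical.

```python
import datetime

# Sketch of certificate validity checking against an expiration date and
# a certificate revocation list (CRL). All values are hypothetical.
REVOKED_SERIALS = {1042, 2077}           # from the CA's latest CRL

def is_valid(serial: int, not_after: datetime.date,
             today: datetime.date) -> bool:
    if today > not_after:
        return False                     # expired: owner must recertify
    if serial in REVOKED_SERIALS:
        return False                     # key reported compromised
    return True

today = datetime.date(2000, 1, 15)
print(is_valid(3001, datetime.date(2000, 12, 31), today))  # → True
print(is_valid(1042, datetime.date(2000, 12, 31), today))  # revoked → False
print(is_valid(3001, datetime.date(1999, 6, 30), today))   # expired → False
```

The check is only as good as the freshness of `REVOKED_SERIALS`, which is why efficient CRL dissemination matters.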
The certificate model also raises issues of personal privacy. The CA models developed to date bind a public key to a particular identity, whether an organization or individual. The use of that key, therefore, can be linked to the activities of that organization or individual. Some work has been initiated on key-centric systems, in which names associated with public keys are not bound to a particular individual but have, rather, only local significance for the convenience of users. The Simple Public Key Infrastructure working group of the IETF is attempting to develop an Internet standard that incorporates these ideas, but no related commercial products are available.
Internet Protocol Security
Internet Protocol Security is an architecture and set of standards that provides a variety of services, such as encryption and authentication of IP packets, at the network layer (Kent and Atkinson, 1998a,b,c). IPSec can protect traffic across any LAN or wide-area-network technology and can be terminated at end systems or security gateways (e.g., firewalls). Now being standardized at the IETF, IPSec has been deployed initially in virtual private networks (VPNs) that use the Internet as the underlying medium but establish an encrypted tunnel across it. An encrypted tunnel can be created between a pair of IPSec gateways, which might be located, for example, at two geographically separated offices of a single company. Each gateway encrypts the data and sends them to the other gateway. The receiving gateway then decrypts the data before passing them on to the final recipient at the site. Because the data are encrypted using a key known only to the two gateways, the message cannot be read by anyone else while crossing the Internet. Furthermore, the receiving gateway can authenticate the data as having come from the sending site and not some
other source on the Internet. An attractive feature of this technology is that a configuration of one device at each site can protect traffic between those sites against eavesdropping and corruption. Protection does not extend beyond the ends of the tunnel.
Virtual private networks have several limitations. First, VPNs based on tunnels do not scale well. Increasing the number of participants in a VPN increases the number of points at which the system can be compromised and makes the process of key management much more difficult. Second, IPSec tunnels frequently require a priori knowledge of where connectivity will be required, because the gateways must be configured with appropriate keys (in the absence of an automated key management infrastructure) and routing information may need to be modified to force traffic to use the appropriate tunnel. This characteristic makes IPSec a viable alternative for information exchanges between organizations with well-established relationships but less effective for unexpected or transitory exchanges of information. In health care, VPNs might be useful for secure communications among elements of an integrated delivery system or between health care providers and the Health Care Financing Administration (HCFA), which processes Medicare claims, but they could not readily support exchanges of patient records between unaffiliated hospitals in an emergency situation.
Third, IPSec tunnels do not necessarily protect data during the entire transit between sender and receiver. Many enterprises encrypt data only between VPN gateways, which are devices that sit at the boundary between the public Internet and the corporate network and encrypt data traversing the Internet. This configuration simplifies the key management problem because it requires encryption keys for the gateways only (not for all the computers connected to them) and averts the need to modify end users' machines. At the same time, it leaves data unencrypted as they pass between the users' computers and their respective gateways, meaning that someone with physical access to the data lines could, in theory, intercept and read the messages. Some users note that, in practice, the expectation of security in a VPN can lull them into failing to encrypt data passed across it. Often, unencrypted data are sent over a LAN until reaching a VPN gateway; in other cases, as with frame relay, the data are not encrypted and are subject to misrouting.
Transport Layer Security
An alternative mechanism for providing encryption and authentication across the Internet is transport layer security, which is widely used across the Internet in the form of the Secure Socket Layer (SSL) system (Dierks and Allen, 1999). This technology is widely used for transmitting
sensitive information (such as credit card numbers) between Web browsers and servers. Transport layer security uses asymmetric encryption to authenticate the server (and, optionally, the client) and symmetric encryption to protect communications between the end user and the Web site. An organization that requires encryption for transactions processed through its Web site obtains a certificate from a CA (e.g., Verisign, CyberTrust, CertCo, or DST). When a user connects to the organization's Web site, the site provides its certificate. Modern Web browsers come equipped with the public keys of the major CAs so that the browser can verify the public key of the Web site and thereby authenticate it.
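Python's standard `ssl` module illustrates this browser-style behavior: a default client context ships configured to demand a server certificate and verify it against the trusted CA public keys installed on the system. No network connection is needed to inspect those defaults.

```python
import ssl

# A default client context mirrors the browser behavior described above:
# it loads trusted CA public keys and insists that the server present a
# certificate that checks out against them.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)  # → True (server must prove identity)
print(context.check_hostname)                    # → True (certificate must match the host)

# Wrapping a TCP socket would then authenticate the server and encrypt
# the session (hostname shown is a placeholder), e.g.:
#   with socket.create_connection(("www.example.org", 443)) as raw:
#       with context.wrap_socket(raw, server_hostname="www.example.org") as tls:
#           ...
```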
Secure Socket Layer can support encryption in both directions (to and from the Web site), but as commonly used today it provides authentication of only the host organization's Web site.20 The bidirectional authentication function is generally not invoked; it would require the client (as well as the server) to have a certificate, which is not generally the case. As a result, users can readily verify the identity of the organization with which they are communicating, but the server site typically cannot use asymmetric encryption techniques to verify the identity of the person. Existing transport layer security, therefore, has been used for credit card transactions, in which authentication of the user is performed not cryptographically but by some other means (e.g., verification of card number, expiration date, and billing address, or a name and password), and in which the misidentification of a client has a known cost (e.g., unrecoverable accounts receivable). In the absence of client certificates, SSL is not well suited to applications in which the costs of misrepresentation of identity cannot be quantified and in which secure passwords are not considered sufficient for authentication. In the health domain, such applications might include the delivery of health care via the Internet or real-time patient monitoring.
Although an individual can obtain a certificate in much the same way that a corporation running a Web site does, the issuance of certificates to many individuals requires methods that are more scalable than those available today and that ensure greater compatibility in the criteria used to issue them.21 Organizations that maintain a presence on the Internet generally find it possible to obtain a certified public key from one of the small number of CAs, but there is no infrastructure in place to enable individuals to obtain certificates on a large scale. Such problems are likely to inhibit the use of applications in which authentication of the consumer is as important as authentication of the vendor.
Several initiatives are under way to deploy CAs for health applications. In October 1999, Intel announced that it would work with the American Medical Association to provide digital certificates to physicians in the hope of enabling doctors to transmit information such as test results
to patients and other health care workers. Healtheon/WebMD has agreed to provide the product to physicians, while other health Web sites, including WellMed and Franklin Health, intend to provide access to the product on the consumer side (Reuters, 1999).22 The Robert Wood Johnson Foundation has also awarded a $2.5 million grant for a five-state HealthKey initiative that will explore ways to facilitate electronic exchanges of information among companies in the health sector while protecting the confidentiality of the data.23 PKI is one of the solutions being considered.
The financial industry has found SSL suitable for many of its consumer-oriented activities, such as online banking and stock trading, but the limitations of this system in the health domain are apparent. At present, financial institutions rely primarily on certificates for server authentication and on passwords for client authentication. As users obtain more online accounts, they need more passwords. To help themselves remember all these passwords, users take many steps that reduce security, choosing passwords that are easy to recall (and easy for others to guess), reusing passwords already in place for other accounts (thus making all of their accounts vulnerable to the compromise of a single password), and writing down passwords (making them easier for others to find). Furthermore, financial organizations usually issue passwords by mailing them to the address of record for the account. This process introduces significant delay in password assignment, making the process inappropriate for health applications in which an emergency room physician may need access to a patient record at a remote hospital. The distribution process is also limited in that a mailed password can easily be intercepted by the wrong member of a householda vulnerability that may have more serious consequences with health data than with financial information. Hence, the trade-offs that are acceptable for applications in the financial sector do not seem suitable for health care.
Software to encrypt e-mail has been available for many years, but its use has been limited because, until recently, it had not been integrated well into standard e-mail applications. PGP is a well-developed collection of encryption software that is commercially supported and available free of charge for noncommercial use (Zimmermann, 1994). Although it supports a variety of functions, it is used most often to digitally sign and/or encrypt e-mail. To send an encrypted e-mail message, the sender must gain access to the public keys held by the intended recipient(s), who must have obtained PGP public/private key pairs in advance.24 An e-mail message to be sent to several recipients is encrypted using a symmetric algorithm with a new (secret) session key. This key is then encrypted
using the public key of each of the recipients, and the results of this encryption are included in the header of the transmitted message. Software run by each recipient uses the recipient's private key to retrieve the session key.
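The hybrid scheme described above can be sketched in a few lines. The Python model below is a deliberately simplified stand-in: an XOR keystream plays the role of PGP's real symmetric cipher, XOR against a hash plays the role of public-key wrapping, and a single shared byte string stands in for each recipient's key pair; none of this is PGP's actual cryptography, only its message structure.

```python
import hashlib
import secrets

def keystream_cipher(key: bytes, data: bytes) -> bytes:
    # Toy symmetric cipher: XOR against a SHA-256-derived keystream.
    # Applying it twice with the same key recovers the plaintext.
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))

def wrap_key(recipient_key: bytes, session_key: bytes) -> bytes:
    # Toy key wrapping; real PGP encrypts the session key with the
    # recipient's public key and unwraps it with the private key.
    pad = hashlib.sha256(recipient_key).digest()
    return bytes(a ^ b for a, b in zip(session_key, pad))

def encrypt_message(plaintext: bytes, recipient_keys: list) -> dict:
    session_key = secrets.token_bytes(32)  # fresh secret per message
    # One wrapped copy of the session key per recipient goes in the header.
    header = [wrap_key(k, session_key) for k in recipient_keys]
    return {"header": header, "body": keystream_cipher(session_key, plaintext)}

def decrypt_message(msg: dict, my_key: bytes, my_index: int) -> bytes:
    session_key = wrap_key(my_key, msg["header"][my_index])  # XOR unwraps
    return keystream_cipher(session_key, msg["body"])
```

The essential point the sketch captures is that the bulky message body is encrypted only once, while the small session key is encrypted separately for each recipient.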
The emerging standard for commercial e-mail systems is Secure/Multipurpose Internet Mail Extension (S/MIME). MIME is an extension to standard e-mail formats that supports the transmission of data, such as video and audio, that are not usually represented as ASCII text. S/MIME supports the transmission of signed and/or encrypted data in e-mail messages in much the same way that PGP does. However, S/MIME uses certificates based on an international standard called X.509 version 3 and generally embraces the hierarchical model for CAs, as opposed to PGP's webs of trust.
Access controls are an important element of computer systems security. They consist of a range of techniques for controlling the capability of active entities (such as computers, processes, or users) to use passive entities (such as computers, files, directories, or memory). For example, access controls can prevent certain users from viewing a particular database, modifying information that they can view, or running certain programs and functions. Because they operate at the level of bits, access controls can be used to permit users to access portions of an encrypted file while still protecting the overall confidentiality of the information. Access controls can operate at virtually all layers in a networked application, from the physical layers defining the communications medium itself through the application layer consisting of software programs, and can extend access privileges based on various characteristics, such as the user's identity or role in the organization (Box 3.3).
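A role-based policy of the kind referred to in Box 3.3 reduces, at its core, to a mapping from roles to permitted actions. The sketch below is a minimal illustration; the roles and permissions are invented for the example and not drawn from any real system.

```python
# Hypothetical role-to-permission policy; names are illustrative only.
POLICY = {
    "physician": {"view_record", "modify_record", "order_tests"},
    "dietician": {"view_record"},
    "billing":   {"view_billing_codes"},
}

def can(role: str, action: str) -> bool:
    """Return True if the given role is permitted the given action."""
    return action in POLICY.get(role, set())
```

In a deployed system the policy table would itself be protected data, and a user's roles would be established during authentication rather than passed in directly.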
Health care poses an especially difficult challenge with respect to access controls because of the difficulty of balancing the need to protect confidential information from unnecessary dissemination against the need to ensure adequate access to enable the provision of care. It is often difficult in a clinical setting to determine a priori who should or should not be granted access to a particular patient's medical record. Many clinicians may interact with a patient during the course of treatment, and all are likely to need access to the medical record. During a typical hospital stay, numerous health care workers, including dieticians and pharmacists who must consider potential drug and food interactions, will need access to the record. A study by the Institute of Medicine (1997) identified 33 different types of individual users of patient medical records (including care providers, health plan administrators, researchers, educators, accreditation boards, and policy makers) and 34 representative types of institutions. Each of these users needs different information, and their access privileges could be markedly different, compounding the difficulty of developing effective access controls and confidentiality policies (upon which access controls are based).
Also important are tools that support rapid access for authorized individuals. In some cases, military or commercial secrets may require protection over a span of decades or more (e.g., the formula for Coca-Cola or technical details of nuclear weapons). Patient records can contain data equally sensitive from the standpoint of the individuals concerned, requiring lifetime protection, yet they also must be available in emergency situations to a much larger class of people. Furthermore, many different users may have occasional permission to view or modify portions of records. For example, a major public benefit of increased automation of health care information systems should be the availability of the data for research studies. Unless the subjects of the studies can be assured that their information will be protected, through either the clear enforcement of appropriate release policies or the "anonymization" of the records, this benefit will be limited.
Because they need to ensure adequate access to health information in emergencies, many health care organizations either (1) routinely give physicians access to information on all patients within the organization's care or (2) routinely give them access to information on only those patients directly under the physician's care but provide emergency overrides to allow access to the records of other physicians' patients if needed. The organizations then rely on the use of audit trails to review accesses after the fact. Such audits are intended to deter the abuse of access privileges, but their effectiveness depends on the availability of tools for automatically detecting anomalous accesses to medical records. Some work is under way on such tools, but they are still in their infancy. More work is needed to develop sophisticated audit analysis tools that take into account expected usage patterns and auxiliary information, such as appointment schedules and referral orders, to more accurately identify potential violations of confidentiality policies.
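The second arrangement, an emergency override backed by an audit trail, is sometimes called "break-glass" access and can be sketched as follows. The data structures and the anomaly rule (flagging every override) are hypothetical simplifications; a real system would correlate overrides with appointment schedules and referral orders, as suggested above.

```python
import datetime

AUDIT_LOG = []

def access_record(user, patient, treating_patients, emergency=False):
    """Grant access if the user treats the patient; otherwise permit an
    emergency override, but log every successful access for later review."""
    authorized = patient in treating_patients.get(user, set())
    if not authorized and not emergency:
        return False
    AUDIT_LOG.append({
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user": user,
        "patient": patient,
        "override": not authorized,  # True only for break-glass accesses
    })
    return True

def suspicious_overrides():
    # A crude after-the-fact audit: surface every override for human review.
    return [entry for entry in AUDIT_LOG if entry["override"]]
```

The deterrent value of such a log depends entirely on someone actually reviewing the flagged entries, which is why the automated analysis tools mentioned above matter.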
The introduction of networking (e.g., the Internet) compounds the access control problem by facilitating the exchange of electronic medical records among different users of health information. Standard access controls can limit the use of information within a single organization but cannot control how information is used once transferred to another organization. Some work is under way to develop technologies that could control such secondary uses. Cryptographic envelopes and associated rights-management languages, such as those developed by IBM, Xerox, and InterTrust Technologies, enable content owners to send data in an encrypted form to users outside their organizations and to specify the actions that different users can undertake with protected data, perhaps allowing them to view it, for example, but not to print or redistribute it. Such tools may also allow auditing accesses of health records across institutional boundaries. Cryptographic envelopes are still relatively new and
were developed primarily to protect copyrighted material and ensure proper payment for viewing or copying it. Additional work is needed to extend this model into the health sector and develop rules for sharing information among different types of health care organizations.
Network availability is another essential element of information systems security.25 Availability is the probability that the network (i.e., the Internet) will be operational at a particular point in time and accessible to those who need it. High availability is a key requirement for mission-critical and time-critical applications of the Internet, including many in health care. If the availability of the Internet is uncertain, then health care providers cannot rely on it for the provision of remote patient care or access to electronic medical records in the emergency room, although they may still be able to use it (with some degree of frustration) to submit bills and allow consumers to select physicians.
Network availability can be compromised by a number of factors, including hardware or software failures, operator errors, malicious attacks, or environmental disruptions (e.g., lightning strikes, backhoes cutting fiber-optic cable) that cause particular links or entire sectors of the network to fail. Availability is closely related to both QOS and security in that the failure of a link connecting two routers across the Internet can affect the capability of an ISP to meet its QOS guarantees, and security measures that protect a network from malicious attacks (whether actual network intrusions or denial-of-service attacks) can help ensure its availability. In addition, security measures such as ensuring software correctness and ensuring software integrity (avoiding viruses, worms, Trojan horses, etc.) will address some issues of availability. However, such mechanisms do not necessarily protect a network from accidental operator errors or physical damage or ensure network survival in the face of hostile attacks.
By virtue of its design, the Internet is reasonably resistant to many forms of failure. Its web-like interconnections among routers ensure the existence of multiple routes for channeling messages across the network. If one link fails, then traffic can be routed along an alternative pathway. In most cases, the network can converge on a new path that avoids the failed link within a matter of seconds, providing sufficient reliability for many, if not most, Internet applications. Nevertheless, service outages do occur. ISPs and Web hosting facilities operate their network infrastructures with cutting-edge technology but, despite the adequacy of the equipment and redundant fiber links, outages lasting for hours occur several times a year. The causes vary. Faulty routers can announce incorrect routes and
cause disturbances to propagate throughout a provider's network, for example, or an upgrade can cause unexpected problems when a network is large, despite extensive testing.
To counteract these problems, many end-user organizations maintain redundant links from different ISPs. To do so, an organization needs to run its own Internet routers and announce and manage its own routes across the Internet. The management overhead involved in deciding how to balance traffic between the links and when to switch all traffic to one link is significant and error-prone. Indeed, many of the difficulties that arise today are directly related to the complexity of this problem. The number of routes, the size of the resulting data structures, the inherently distributed nature of routing algorithms, and the constraints applied by administrative/business requirements all contribute to this complexity. A strong research effort is required to improve the reliability and performance of Internet routing protocols.
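Assuming the links fail independently, the benefit of maintaining redundant ISP connections can be quantified directly: the combined system is down only when every link is down. The 99 percent figure below is purely illustrative.

```python
def parallel_availability(availabilities):
    """Probability that at least one of several independent links is up:
    1 minus the probability that all of them are down simultaneously."""
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down

# Two independent links, each up 99% of the time (an assumed figure),
# yield 99.99% combined availability: roughly an hour of simultaneous
# downtime per year instead of several days for a single link.
```

The independence assumption is the weak point in practice: two "different" ISPs whose circuits share a conduit can be cut by the same backhoe, which is one reason the management problem described above is harder than the arithmetic suggests.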
Additional efforts are needed to consider means of responding to disaster scenarios in which large portions of the network fail, resulting in major outages. Such disasters could be confined to the network, in which case mechanisms are needed for ensuring transmission of a variety of network traffic, or they could be more widespread (fires, earthquakes, or storms) and thus might call for ways of mobilizing health care resources despite widespread network outages. In both cases, mechanisms are needed to ensure adequate network availability for mission-critical applications and to handle high-priority traffic. Many policy issues would need to be addressed to help balance the networking needs of health care organizations against those of other critical communications. Additional work is needed in the area of survivability to ensure that the network can maintain or restore an acceptable level of performance during failure conditions by applying various restoration techniques.
Some work is under way in the Department of Defense to develop techniques for prioritizing network traffic in case of degradations that limit network capacity, but such work may need to be extended to consider the requirements of health care. As an example, in disaster situations, telephone service providers can block incoming calls to the affected region at their source (by providing a busy signal) so that limited link capacity can be used for more urgent outgoing calls. Congestion control on the Internet is problematic because no mechanisms exist to manage traffic based on user, connection, source, or destination. Routers do not store information about users and connections, and doing so would require significant memory and management mechanisms and would also raise a host of privacy concerns that would need to be addressed. Some newer algorithms for congestion control have been designed to work at the IP
level, but more research is needed, especially in the area of defining and enforcing flexible and varied policies for congestion control (CSTB, 1999).
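The kind of prioritization such work envisions can be sketched with a simple priority queue that drains urgent traffic before bulk traffic when capacity is scarce. The traffic classes and priority values below are invented for illustration and correspond to no deployed protocol.

```python
import heapq

class PriorityRouter:
    """Toy outbound queue: packets with lower priority numbers leave
    first, so time-critical health traffic could be drained ahead of
    bulk transfers on a degraded link."""

    def __init__(self):
        self._queue = []
        self._seq = 0  # tie-breaker preserves FIFO order within a class

    def enqueue(self, priority: int, packet: str):
        heapq.heappush(self._queue, (priority, self._seq, packet))
        self._seq += 1

    def dequeue(self) -> str:
        return heapq.heappop(self._queue)[2]
```

As the text notes, the hard part is not the queuing discipline itself but deciding, and enforcing, which traffic deserves which priority across administrative boundaries.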
Broadband Technologies for the Local Loop
Before the health community and health care consumers can benefit from future Internet applications, they must gain access to sufficient bandwidth in their local connections to ISPs to handle the anticipated traffic loads. Health care organizations that intend to transmit detailed radiographic images to remote specialists for near-real-time interpretation, for example, will need Internet connections capable of transmitting hundreds of kilobits per second, if not megabits per second. Biomedical research institutes conducting distributed simulations will also need high-bandwidth connections. Organizations will meet many of these needs by leasing communications lines with the needed capacity. Alternatively, some organizations that provide content over the Internet and expect high demand for their services may attempt to offload some of their functions to third parties that can acquire the needed capacity, although there may be limitations to this model in health applications (see Box 3.4).
Many businesses already connect to the Internet over dedicated lines at speeds ranging from 1.5 Mbps (T1 lines) to 155 Mbps (OC-3 access). The various possibilities are listed in Table 3.1. A few organizations lease OC-12 lines capable of transmitting 622 Mbps, but these tend to be limited to a few high-profile Web sites that expect large amounts of traffic. Alternatively, large organizations can subscribe to services that provide high-bandwidth access over a shared medium. Frame relay, for example, is a packet-switched connection typically sold at speeds ranging from 56 kbps to 1.5 Mbps; packets from various subscribers are mixed together across the underlying links. Frame relay is less expensive than a dedicated line but introduces some uncertainty about instantaneous capacity as other organizations contend for bandwidth. Asynchronous transfer mode (ATM) is another service that is starting to supplement frame relay as a switched alternative to leased lines. In addition to offering higher speeds (e.g., 155 Mbps), ATM offers a wider range of QOS options than frame relay does and was designed to handle mixed voice, data, and video.
Although leased lines, frame relay, and ATM are viable alternatives for a variety of institutional users, they are generally too expensive for residential users and small businesses (such as private practitioners). Leased lines can cost from hundreds to thousands of dollars per month, depending on the bandwidth provided, and a 56-kbps frame relay connection can cost $150 per month after installation. The overwhelming majority of residential users today connect to the Internet using a modem connected to a conventional telephone line. Such connectivity is almost universally available (accessible from any location with a telephone line) but is limited in bandwidth. The fastest modem connections today are capable of providing 56 kbps, but many residential users connect at
28.8 kbps or less. These low connection speeds are suitable for many of today's health applications, such as downloading online information and participating in chat groups, but they can be bottlenecks for large files (whether text, video, or audio) and real-time video.
Future health applications of the Internet may demand more bandwidth to residential and small institutional users than conventional modems operating over telephone lines can provide. For example, if online health records become more widely used and begin to contain more medical imagery (e.g., X rays, CT scans, MRIs), then greater bandwidth will be needed to download the images quickly. If rural health clinics begin sending radiographic images to remote specialists for near-real-time interpretation, then they will need significant bandwidth. More significantly, applications such as video-based teleconsultations and teleradiology will require as much bandwidth out of the home or small office (upstream to the ISP) as into it (downstream into the home or office), because images will be transmitted in both directions.
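Rough arithmetic makes these bandwidth demands concrete: transfer time is simply file size divided by link speed. The 10 MB image size below is a hypothetical figure chosen for illustration, and protocol overhead and contention are ignored.

```python
def transfer_seconds(file_megabytes: float, link_kbps: float) -> float:
    """Idealized seconds to move a file at a given link speed.
    Uses decimal units: 1 MB = 8,000,000 bits, 1 kbps = 1,000 bits/s."""
    bits = file_megabytes * 8_000_000
    return bits / (link_kbps * 1_000)

# For a hypothetical 10 MB radiographic image:
#   at 56 kbps (modem):   ~24 minutes
#   at 1,500 kbps (T1):   ~53 seconds
```

The same arithmetic applied in the upstream direction explains why asymmetric services are a poor fit for a clinic that must send images out, not just receive them.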
This requirement for symmetry in upstream and downstream bandwidth allocation represents a significant shift from most current consumer Internet applications, which assume the majority of information will flow from the Internet to the consumer. Two of the more popular technologies currently available for broadband access in the local loop, modems using cable television lines and digital subscriber line (DSL) technologies (Box 3.5), allocate bandwidth asymmetrically, providing more bandwidth downstream than up (Table 3.2).26 Cable modems, for example, enable users to receive data at speeds as high as 10 Mbps but transmit data at only 384 kbps. Bandwidth is shared with neighboring residences (up to several hundred), so the exact amount of bandwidth available to an individual at a given point in time depends on his or her level of activity.27 Most deployments of DSL technology support up to 1.5 Mbps downstream and 768 kbps upstream. Unless high-quality videoconferencing or distributed simulation games become popular with consumers and drive the need for upstream bandwidth, the health sector could become a driver for this capability.
It is possible to reconfigure DSL symmetries with a technology available today, discrete multitone (DMT), but only if an upcoming standards decision for very-high-speed DSL (known as VDSL) favors the DMT approach. Some VDSL technologies provide data speeds up to 26 Mbps downstream and 3 Mbps upstream over copper wire at distances of up to 3,000 ft. The use of DMT makes it possible to achieve more symmetric bandwidth allocations of up to 10 Mbps in each direction at distances of up to 5,000 ft. If this approach is not standardized, then asymmetric technologies will be locked in much more strongly and will be difficult to dislodge in the future. Many companies have strong vested interests in existing asymmetric technologies and might resist the use of DMT.
Another drawback to cable and DSL networks is that they are not currently accessible from all locations within the United States. Cable systems pass through approximately 80 million U.S. homes but tend to be concentrated in densely populated, not rural, areas. They also tend to pass through residential neighborhoods instead of business districts (an artifact of the focus on entertainment applications), a pattern that may impede the use of these technologies by some health facilities. In addition, some older cable networks cannot support cable modems. The deployment of cable modem service is accelerating, but cable companies are expected to focus on upgrading their infrastructure in densely populated areas, where the greatest revenue can be realized from high-speed data services.
The availability of DSL services and the amount of accessible bandwidth are highly sensitive to the distance of a residence from the central office and to the quality of the copper wiring. Asymmetric DSL (ADSL) services typically can support data rates up to 1.5 Mbps downstream and 384 kbps upstream over twisted pair up to 18,000 ft, which would reach almost 80 percent of U.S. households, according to the ADSL Forum. For example, one DSL provider offers services at 384 and 768 kbps in metropolitan areas only and expects to be able to reach 65 percent of the residences in those areas. The remaining 35 percent of residences would not be accessible with the current technology because of either their distance from the central office or the low quality of the telephone lines. Deployment in remote or impoverished areas is not likely to proceed quickly.
Another means of Internet access is wireless technologies, which (at least theoretically) could be used virtually anywhere and also ease the provider's burden of laying down wires, fibers, and cable. Low-speed wireless services (e.g., approximately 30 kbps) are currently available in only a few parts of the country but are likely to become more widespread. High-speed wireless is also likely to become an alternative for connecting to the Internet, initially for businesses but perhaps also for consumers who are not well served by cable or DSL. Local multipoint distribution system (LMDS) technology uses high-frequency microwaves for two-way communications at data rates of up to 155 Mbps. It operates within areas (or cells) 2 to 5 miles in diameter. Performance is limited by rain and by the need to maintain a line of sight between the transmitting and receiving stations.
Satellite-based systems using either geostationary or low Earth orbit (LEO) satellites boast maximum transmission speeds twice as fast as LMDS, 3 to 6 times faster than cable, and up to 12 times faster than DSL. Such systems are likely to cost hundreds of dollars per month for service
plus $500 to $1,000 for the antenna (Skoro, 1999). Geosynchronous systems are limited by significant propagation delays (i.e., latency), which may preclude their use in some interactive applications; furthermore, their data-carrying capacity is distributed among a large number of users. LEO systems overcome some of the problems with delay, but the satellites move fast and have smaller coverage areas, meaning that large numbers of satellites are needed to provide global coverage and techniques are needed to manage the handoff of connections between satellites. High-power transmitters are needed to achieve high data rates, which implies large antennas and/or high-frequency operation. At higher frequencies, signals degrade more quickly in rain and other adverse weather conditions.
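The geostationary latency problem follows directly from geometry: the satellite sits about 35,786 km above the equator, so even at the speed of light a signal spends a noticeable fraction of a second in transit.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786  # geostationary altitude above the equator

def one_way_delay_s(altitude_km: float) -> float:
    """Minimum ground-to-ground propagation delay through one satellite
    hop (straight up and straight down; real paths are slightly longer)."""
    return 2 * altitude_km / SPEED_OF_LIGHT_KM_S
```

The result is roughly 0.24 seconds one way, so a request and its reply approach half a second of pure propagation delay before any processing time, which is why interactive applications suffer.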
Overall, the deployment of broadband Internet services has been slow, albeit increasing, in the United States. Only about 1 percent of all U.S. households with Internet access had broadband connections in 1999 (Clark, 1999). A number of factors are at play, including technology, economics (both the cost of building broadband networks and the costs of service), and policy. Most U.S. households have yet to subscribe to broadband services because these connections are not offered in their geographic area, or because they are too expensive or not viewed as useful. The spotty coverage and high cost of high-bandwidth access technologies mean, unfortunately, that those who could benefit most from the health care applications of the Internet (such as people in rural areas with limited access to medical specialists) are the least likely to have high-speed Internet access.
Work in many areas, both technical and policy-related, will be required to enhance network access for health applications. In some cases, technical work will be pursued by the computing and communications industries without the participation of the health community. Even so, by voicing its needs, the health community will help ensure that they are met. In other cases, the requirements of the health care community may motivate research. Again, the articulation of specific needs will be necessary, and participation in research may be needed as well. The following section identifies several needs that are of particular interest to the health care community.
A popular cartoon depicts a dog sitting in front of a computer monitor and is captioned, "On the Internet, nobody knows you're a dog." At one level, this statement is true: an ordinary Internet user can choose a cryptic pseudonym or screen name so that a typical e-mail recipient or chat room participant cannot easily identify the individual behind the
name. But, unless users encrypt e-mail before sending it, every router that forwards the message will be able to read it. Even if the message is encrypted, each router in its path knows the network address from which it was sent and its destination. The user's ISP knows the name and address of the individual who is paying for the service. If the user sends the message from a workplace, then the employer has the right to read it; even a free, public access system is not entirely safe because others may be looking over the user's shoulder. If the user browses the Web, then the Web server reached will very likely be able to learn a lot about the user's computer system, including the make, operating system, and browser. Accordingly, a user querying a database for information on a sensitive disease or condition might wish to take precautions.
There are powerful incentives for Web servers to monitor their visitors, because the data extracted have commercial value: they allow businesses to know which parts of their Web site are interesting to which visitors, thus supporting targeted advertising. Consumers may benefit from such advertising because they learn of new products in a manner that coincides with their tastes, but the implied lack of privacy can be a deterrent to the use of the Internet in certain health care applications. Patients express considerable concern about health information. To protect their privacy, some patients withhold information from their care providers, pay their own health expenses (rather than submit claims to an insurance company), visit multiple care providers to prevent the development of a complete health record, or sometimes even avoid seeking care (Health Privacy Working Group, 1999). The Internet may ease some such concerns because it enables consumers to find health information without visiting their care providers, and it may eventually allow them to seek consultation from, or be examined by, multiple providers in different parts of the country. But without additional privacy protections, a host of new companies could collect information about personal health interests from consumers who browse the Web, exchange e-mail with providers, or purchase health products online. Profiles of patients' online activities can divulge considerable information about personal health concerns. Patients have little control over how that information might be used, or to whom it may be sold.
Concerns about anonymity extend beyond consumer uses of the Internet. Care providers and pharmaceutical researchers, too, express concerns about the privacy of their Internet use. Some care providers wonder if their use of the Internet to research diagnostic information might be construed as a lack of knowledge in certain areas. If such information were tracked by, or made available to, employers or consumer groups, then it could hurt providers' practices. Pharmaceutical companies are concerned that the use of online databases by their researchers
may divulge secrets about the company's proprietary research. These concerns can be addressed in a number of ways, both technical and policy-oriented, but they need to be put to rest if the Internet is to be used more pervasively for health applications.
Some mechanisms are available that users can exploit to reduce their exposure to prying eyes on the Internet. Most attempt to protect the anonymity of users, so that the sender of a message or a visitor to a Web site cannot be identified by the recipient or the Web server. Encryption is the basic engine that underlies all of these mechanisms. Until now, most research on anonymous communication has been carried out informally and without specific attention to health care applications. Most of the existing mechanisms were designed and built in the context of the Internet, and the future development of Internet infrastructure may be intertwined with their use. The benefits and dangers of supporting anonymous communication mechanisms have been the subjects of recurrent (and appropriate) discussions.
Health care offers one of the most compelling cases for the benefits. Such mechanisms could make people feel safe in seeking out information about their own health problems, thereby leading to earlier diagnosis and better treatment. They could also be used to solicit reports about the spread of, for example, sexually transmitted diseases and other health problems that individuals may prefer to report anonymously. In addition, anonymous communication mechanisms can help users limit the capabilities of others to build databases of their behavior or can reduce the extent to which they are the targets of undesired commercial solicitations. For all of these reasons, it would be appropriate for the funders of health care research to support investigations into anonymous communication technologies for future Internet architectures.
Encryption of e-mail can prevent intermediaries in the network from reading the messages but cannot prevent them from knowing that the sender and receiver are communicating; likewise, it cannot necessarily hide the identity of the sender from the receiver.28 The first mechanisms developed to support anonymous e-mail messages were called re-mailers. These mechanisms would permit a user to register a pseudonym with the server. Mail coming from the user would then be re-mailed by the server, which would strip out identifying material in headers and make it appear that the mail originated at the re-mailer. The re-mailer could forward replies sent to "pseudonym@remailer" to the registered user. For additional protection, the user could encrypt traffic sent to the re-mailer, so that a wiretapper with connections to the re-mailer's inputs and outputs
could not easily defeat the mechanism. The wiretapper still could probably identify which user was sending mail to which destination by looking at the timing and lengths of messages sent and forwarded by the re-mailer.
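The core of a simple re-mailer is a pseudonym table plus header rewriting, which can be sketched as follows. The sketch uses a dictionary in place of real RFC 822 header parsing, and all addresses and field names are invented for illustration.

```python
# Toy re-mailer state: the pseudonym mapping is exactly the point of
# trust (and vulnerability) discussed in the text.
PSEUDONYMS = {}  # real address -> pseudonym
REVERSE = {}     # pseudonym -> real address, for forwarding replies

def register(real_address: str, pseudonym: str):
    PSEUDONYMS[real_address] = pseudonym
    REVERSE[pseudonym] = real_address

def remail(message: dict) -> dict:
    """Rewrite From: to the sender's pseudonym and drop every other
    identifying header; only the destination and body pass through."""
    alias = PSEUDONYMS[message["from"]]
    return {"from": alias + "@remailer.example",
            "to": message["to"],
            "body": message["body"]}

def route_reply(reply: dict) -> dict:
    """Forward a reply addressed to a pseudonym back to the real user."""
    pseudonym = reply["to"].split("@")[0]
    return {"from": reply["from"], "to": REVERSE[pseudonym],
            "body": reply["body"]}
```

Note that the mapping tables make the operator a single point of compromise, which motivates the mix networks described below in the text.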
A single re-mailer remains a point of trust and vulnerability, because it knows the mapping between identities and pseudonyms. This vulnerability has been exploited in legal attacks: for instance, the operator of a widely used Finnish re-mailer discontinued operations when he found that, under Finnish law, he could be forced to reveal the identities of his subscribers. Some U.S. companies have revealed pseudonym-identity mappings when subpoenaed in civil cases.
To provide stronger protection, David Chaum (1981) proposed a network of re-mailers, called mixes. In this scheme, each e-mail message traverses a sequence of mixes and is reencrypted for transit across each link. In addition, each mix collects a set of messages over a period of time and reorders the set before forwarding them, so that even an observer who could trace the sequence of arrivals and departures from all mixes would be unable to trace a message through the network. Ad hoc networks of re-mailers that incorporate some of these approaches are now operating on the Internet. A commercial service, Anonymizer.com, provides an anonymous re-mailing facility that permits a sender, free of charge, to specify a chain of re-mailers.
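The batching-and-reordering step at the heart of a mix can be sketched in a few lines; per-hop re-encryption, equally essential in Chaum's design, is omitted here for brevity.

```python
import secrets

def mix_batch(messages: list) -> list:
    """Collect a batch of messages and emit them in an unpredictable
    order, so an observer cannot match arrival order against departure
    order. Uses a cryptographic random source for the permutation."""
    pending = list(messages)
    shuffled = []
    while pending:
        shuffled.append(pending.pop(secrets.randbelow(len(pending))))
    return shuffled
```

Reordering alone is insufficient if messages remain recognizable by content or length, which is why real mixes also re-encrypt and pad each message at every hop.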
Protected Web Browsing
Because forwarding of e-mail does not require a real-time connection from sender to receiver, it is reasonably easy to protect sender anonymity, at least partially. Web browsing, because it depends on a reasonably prompt interaction between client and server, is more difficult to protect. The timing of message arrival and departure may make it obvious to an observer that two parties are communicating, even if the message contents and addresses are obscured. The problem of how to hide the identity of a user browsing the Web from a server that it accesses can be broken into two parts: first, how to prevent an eavesdropper from being able to trace the path of the traffic and, second, how to prevent the server from sending traffic over the path that causes the client (against the user's wishes) to reveal information that could identify the user.29 Most of the techniques developed for protecting Web browsing have been, or could be, adapted to support anonymous e-mail and other functions (e.g., file transfer, news, VPNs) as well.
A straightforward approach to providing anonymous Web browsing uses a trusted intermediary, analogous to a simple re-mailer. The user forwards the uniform resource locator (URL) of interest to the intermediary, which strips any identifying information from the request, perhaps
even providing an alias, and forwards the request to the intended server. To the server, the request appears to have come from the intermediary.30 The intermediary also forwards any data returned to the appropriate requester. Anonymizer.com, the Rewebber (formerly Janus), and Proxymate (formerly Lucent Personal Web Assistant) all provide services of this sort. The communication between the client and the trusted intermediary can be protected from simple eavesdropping by using SSL over this link. The Rewebber also supports anonymous publishing by providing encrypted URLs. A user wishing to retrieve data from an anonymous server obtains an encrypted URL for that server (this encrypted version may be freely distributed). The Rewebber then decrypts the URL, forwards the request to the hidden server, collects the reply, and returns it to the user.
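The header-stripping performed by such an intermediary can be illustrated with a short sketch. The particular header names and addresses below are illustrative assumptions, not taken from any of the services named above, and real proxies must also scrub identifying information carried in URLs and page content.

```python
# Toy model of an anonymizing intermediary: identifying request headers
# are removed, so the request appears to originate from the proxy itself.
IDENTIFYING_HEADERS = {"Cookie", "Referer", "From", "User-Agent"}

def anonymize_request(headers, proxy_address):
    """Return a copy of the request headers with identifying fields stripped."""
    cleaned = {k: v for k, v in headers.items() if k not in IDENTIFYING_HEADERS}
    cleaned["Via"] = proxy_address  # the server sees only the intermediary
    return cleaned

request = {
    "Host": "example.org",
    "User-Agent": "Mozilla/4.0",
    "Cookie": "session=abc123",
    "Referer": "http://private.example.com/page",
}
cleaned = anonymize_request(request, "proxy.example.net")
```

After cleaning, the server receiving the forwarded request learns the proxy's address but none of the browser-supplied fields that could identify the user.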
Technology to hide the communication path, based on an enhancement of Chaum's Mix networks, has been developed and prototyped by the Naval Research Laboratory in its Onion Routing Project (Reed et al., 1998). This scheme creates a bidirectional, real-time connection from client to server by initiating a sequence of application-layer connections within a set of nodes acting as mixes. The path through the network is defined by an "onion" (a layered, multiply-encrypted data structure) that is created by the user initiating the connection and transmitted to the network. Only the onion's creator knows the complete path; each node in the path can determine only its predecessor and successor, so an attack on the node operators will be difficult to execute. This strategy also limits the damage that a compromised onion routing node can do; as long as either the first or last node in the path is trustworthy, then it is difficult for an attacker to reconstruct the path. All the packets in the network have a fixed length and are mixed and re-encrypted on each hop. In the event that the submitted traffic rate is too low to assure adequate protection, padding (dummy packets) is introduced. These defenses can be expected to make it extremely difficult to use traffic analysis to deduce who is talking to whom, even if an eavesdropper can see all links.
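The layered structure of the "onion" can be sketched as follows. In this toy model the per-node public-key encryption is replaced by a simple "locked_for" tag, so it shows only the structural property described above: each node learns its predecessor and successor, and nothing more.

```python
def build_onion(path, payload):
    """Wrap payload in one layer per onion router, innermost layer last.
    Only the builder knows the full path; each layer names only the next hop.
    (A real onion encrypts each layer with that node's public key.)"""
    onion = payload
    hops = path + ["server"]
    for i in range(len(path) - 1, -1, -1):
        onion = {"locked_for": path[i],      # stand-in for per-node encryption
                 "next_hop": hops[i + 1],
                 "inner": onion}
    return onion

def peel(node, onion):
    """A node removes its own layer, learning only the next hop."""
    assert onion["locked_for"] == node       # only this node can open its layer
    return onion["next_hop"], onion["inner"]

onion = build_onion(["R1", "R2", "R3"], "GET /index.html")
hop, onion = peel("R1", onion)               # R1 learns only its successor, R2
hop, onion = peel("R2", onion)
final_hop, final_payload = peel("R3", onion) # the last router forwards to the server
```

Because each layer reveals only one successor, a compromised node can place itself in the path but cannot reconstruct either endpoint unless it occupies both the first and last positions.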
Onion routing needs a separate screening mechanism to anonymize the data flowing between client and server, so that the server is blocked from sending messages to the client that will cause client software to reveal its identity. Although the Onion Routing Project has implemented an anonymizing proxy to perform this type of blocking, a server can play any of an increasing number of tricks to determine the client's identity. Other projects, such as Proxymate, have specifically concentrated on hiding the identity of the client from the server and have devised more robust techniques for doing so than those developed under the Onion Routing Project (Bleichenbacher et al., 1998). These techniques can be combined with onion routing to provide strong protection against both traffic analysis and servers that might try to identify their clients. A system for
protecting personal identity on the Web that appears to be closely based on onion routing is being offered commercially by Zero Knowledge Systems (1998) of Montreal.
A different approach has been prototyped by AT&T researchers in their Crowds system (Reiter and Rubin, 1998). Instead of creating a separate network of mixes to forward traffic, each participant in a Crowd runs a piece of software (called a jondo) that forwards traffic either to other nodes in the Crowd or to its final destination. In effect, when a member of a Crowd receives a packet, it flips a weighted coin. If the coin comes up heads, then the participant decrypts the packet and sends it directly to its Internet destination address. Otherwise, it forwards the packet to another randomly chosen jondo. The Web server receiving the packet can only identify the jondo that last forwarded the packet; it cannot deduce the packet's true origin. Return traffic follows the same randomly generated path in the reverse direction.
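The weighted-coin routing decision made by each jondo can be modeled in a few lines. This is a simplified sketch: the jondo names and forwarding probability below are illustrative, and a real Crowd also encrypts traffic pairwise along the path.

```python
import random

def route_through_crowd(jondos, p_forward=0.75, rng=random):
    """Build the path a request takes through a Crowd: at each step a
    weighted coin decides whether to forward to another randomly chosen
    jondo or to submit the request directly to the Web server."""
    path = [rng.choice(jondos)]              # the request enters at some jondo
    while rng.random() < p_forward:
        path.append(rng.choice(jondos))      # "tails": forward to a random jondo
    return path                              # "heads": path[-1] contacts the server

crowd = ["jondo%d" % i for i in range(8)]
path = route_through_crowd(crowd, p_forward=0.75, rng=random.Random(42))
# The server sees only path[-1]; even that jondo cannot tell whether its
# predecessor originated the request or was merely forwarding it.
```

With forwarding probability p, path lengths follow a geometric distribution with mean 1/(1 - p), so the expected overhead is modest even though individual paths vary.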
In the physical world, individuals who do not want stores to track their purchases can pay cash. The standard approach for buying items on the Internet, however, is to use a credit card, which is guaranteed to reveal the purchaser's identity. Several schemes based on cryptographic mechanisms can enable anonymous payment over the Internet. Chaum (1989) pioneered research in this field, but the rise of e-commerce has triggered much additional work in recent years. The basic idea is to create the electronic analog of a coin: a special number. The merchant must be able to determine that the coin is valid (not counterfeit) without knowing the identity of the individual presenting it. Because computers can copy numbers so easily, a basic problem is to prevent a coin from being spent twice. Although Chaum and others have invented schemes that solve this problem and yet provide anonymity (at least for users who do not try to commit fraud), it has proven difficult to transfer these solutions into the world of commerce. Law enforcement authorities express concern over such technologies because of the potential for their use (or misuse) in money laundering and tax evasion.
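The blind-signature construction that underlies many of these payment schemes can be sketched with toy RSA parameters. The key sizes and coin serial below are illustrative only and far too small to be secure; the point is that the bank signs (mints) the coin without ever seeing its serial number, yet anyone can later verify the signature.

```python
from math import gcd

# Toy RSA key (textbook parameters; wholly insecure, for illustration only).
p, q = 61, 53
n = p * q                  # 3233
e, d = 17, 2753            # public / private exponents, e*d = 1 (mod lcm(p-1, q-1))

coin = 1234                # serial number of the coin, chosen by the user
r = 7                      # random blinding factor, coprime with n
assert gcd(r, n) == 1

blinded = (coin * pow(r, e, n)) % n              # user blinds the serial number
signed_blinded = pow(blinded, d, n)              # bank signs without seeing `coin`
signature = (signed_blinded * pow(r, -1, n)) % n # user removes the blinding factor

# Anyone can check that the coin was minted by the bank, without
# learning which withdrawal it came from:
assert pow(signature, e, n) == coin
```

The unblinding works because signed_blinded equals coin^d * r (mod n), so multiplying by the inverse of r leaves exactly the bank's signature on the original serial number. Detecting double spending requires additional protocol machinery beyond this sketch.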
Anonymous Data Released from Sensitive Databases
For many years, the U.S. Bureau of the Census has been charged with releasing statistically valid data drawn from census forms without permitting individual identities to be inferred. The problem of constructing a statistical database that can protect individual identities has long been known to researchers (Denning et al., 1979; Schlörer, 1981; Denning and
Schlörer, 1983). To limit the possibility of identification, statisticians have developed several techniques, such as restructuring tables so that no cells contain very small numbers of individuals and perturbing individual data records so that statistical properties are preserved but individual records no longer reflect specific individuals (Cox, 1988). Medical records have often been disclosed to researchers under the constraint that the results of the research not violate patient confidentiality, and, in general, researchers have lived up to this requirement.
Recently, researchers have shown how easily even data stripped of obvious identifying information (name, address, social security number, telephone number) may still disclose individual identity, and they have proposed both technical approaches to reduce the chance of confidentiality compromises and guidelines for future release policies (Sweeney, 1998). The benefits of having full access to relevant data for research purposes and the difficulty of rendering data anonymous without distorting it are likely to require a continuing trust between researcher and subject. An earlier report by the Computer Science and Telecommunications Board (1997a) discussed systemic flows of information in the health care industry and proposed specific criteria for universal patient identifiers. In particular, the report called for technical mechanisms that would help control linkages among health care databases held by different organizations, reveal when unauthorized linkages were made, and support appropriate linking. Because Internet connectivity greatly facilitates such linkages, it is appropriate to renew the call for research into such mechanisms in the present report.
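A simplified version of the kind of check and generalization step performed by systems such as Datafly can be sketched as follows. The records, field names, and the "truncate the ZIP code" rule are illustrative assumptions, not drawn from Sweeney's actual implementation.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values is shared by
    at least k records, so no record stands out on those fields alone."""
    combos = Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)
    return all(count >= k for count in combos.values())

def generalize_zip(rec):
    """One generalization step: coarsen the ZIP code to its first 3 digits."""
    return {**rec, "zip": rec["zip"][:3] + "**"}

records = [
    {"zip": "02138", "birth_year": "1960", "diagnosis": "flu"},
    {"zip": "02139", "birth_year": "1960", "diagnosis": "asthma"},
    {"zip": "02141", "birth_year": "1960", "diagnosis": "flu"},
]
qi = ["zip", "birth_year"]
assert not is_k_anonymous(records, qi, k=2)   # full ZIPs single out each person
generalized = [generalize_zip(r) for r in records]
assert is_k_anonymous(generalized, qi, k=3)   # "021**" now covers all three
```

The example also illustrates the tension noted above: each generalization step protects identity by discarding detail that a researcher might have needed.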
As the discussion in this chapter demonstrates, ongoing efforts to enhance the capabilities of the Internet will produce many benefits for the health community. They will provide mechanisms for offering QOS guarantees, better securing health information, expanding broadband access options for consumers, and protecting consumer privacy. At the same time, the technologies expected to be deployed across the Internet in the near future will not fully meet the needs of critical health care applications. In particular, QOS offerings may not meet the need for dynamically variable service between communicating entities. Security technologies may not provide for the widespread issuance of certificates to health care consumers. And the Internet will not necessarily provide the degree of reliability needed for mission-critical health applications. Although much can be done with the technologies currently planned, additional effort will be needed to make the Internet even more useful to the health community.
One way to ensure that health-related needs are reflected in networking research and development is to increase the interaction between the health and technical communities. As researchers attest, most networking research is conducted with some potential applications in mind. Those applications are shaped by interactions with system users who can envision new applications. To date, interaction between health informatics professionals and networking researchers has been limited. By contrast, the interests of industries such as automobile manufacturing and banking are well represented within the networking community, in part because of their participation in the IETF and other networking groups. The health community may need to better engage these groups to ensure that health interests are considered.
Birman, K.P. 1999. The Next Generation Internet: Unsafe at Any Speed? Department of Computer Science Technical Report, Draft of October 21. Cornell University, Ithaca, N.Y.
Blake, S., et al. 1998. An Architecture for Differentiated Services. IETF Request for Comment (RFC) 2475, December.
Bleichenbacher, D., E. Gabber, P. Gibbons, Y. Matias, and A. Mayer. 1998. "On Secure and Pseudonymous Client-Relationships with Multiple Servers," pp. 99-108 in Proceedings of the Third USENIX Electronic Commerce Workshop, Boston, September.
Braden, R., S. Shenker, and D. Clark. 1994. Integrated Services in the Internet Architecture: An Overview. IETF Request for Comment (RFC) 1633, June.
Braden, R., L. Zhang, S. Berson, S. Herzog, and S. Jamin. 1997. Resource ReSerVation Protocol (RSVP): Version 1 Functional Specification, IETF Request for Comment (RFC) 2205, September.
Chaum, D. 1981. "Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms," Communications of the ACM 24(2):84-88.
Chaum, D. 1989. "Privacy Protected Payments: Unconditional Payer and/or Payee Untraceability," pp. 69-93 in Proceedings of SMARTCARD 2000, D. Chaum and I. Schaumuller-Bichl, eds. North-Holland, Amsterdam.
Clark, D. 1999. "The Internet of Tomorrow," Science 285(July 16):353.
Clark, D., and J. Wroclawski. 1997. An Approach to Service Allocation in the Internet. IETF Draft Report, July. Massachusetts Institute of Technology, Cambridge, Mass. Available online at <http://diffserv.lcs.mit.edu/Drafts/draft-clark-diff-svc-alloc-00.txt>.
Computer Science and Telecommunications Board (CSTB), National Research Council. 1994. Realizing the Information Future: The Internet and Beyond. National Academy Press, Washington, D.C.
Computer Science and Telecommunications Board (CSTB), National Research Council. 1996. The Unpredictable Certainty: Information Infrastructure Through 2000. National Academy Press, Washington, D.C.
Computer Science and Telecommunications Board (CSTB), National Research Council. 1997a. For the Record: Protecting Electronic Health Information. National Academy Press, Washington, D.C.
Computer Science and Telecommunications Board (CSTB), National Research Council. 1997b. Modeling and Simulation: Linking Entertainment and Defense. National Academy Press, Washington, D.C.
Computer Science and Telecommunications Board (CSTB), National Research Council. 1999. Trust in Cyberspace. National Academy Press, Washington, D.C.
Cox, L.H. 1988. "Modeling and Controlling User Inference," pp. 167-171 in Database Security: Status and Prospects, C. Landwehr, ed. North-Holland, Amsterdam.
Denning, D.E., and J. Schlörer. 1983. "Inference Controls for Statistical Databases," IEEE Computer 16(7):69-82.
Denning, D.E., P.J. Denning, and M.D. Schwartz. 1979. "The Tracker: A Threat to Statistical Database Security," ACM Transactions on Database Systems 4(1):76-96.
Dierks, T., and C. Allen. 1999. The TLS Protocol Version 1.0. IETF Request for Comment (RFC) 2246, January.
Ellison, Carl, and Bruce Schneier. 2000. "Ten Risks of PKI: What You're Not Being Told About Public Key Infrastructure," Computer Security Journal 16(1):1-7.
Goldberg, I., and D. Wagner. 1998. "TAZ Servers and the Rewebber Network: Enabling Anonymous Publishing on the World Wide Web," First Monday 3(4). Available online at <http://www.rewebber.com>.
Halabi, B. 1997. Internet Routing Architectures. Cisco Press, Indianapolis, Ind.
Hawley, G.T. 1999. "Broadband by Phone," Scientific American 281(4):102-103.
Health Privacy Working Group. 1999. Best Principles for Health Privacy. Institute for Health Care Research and Policy, Georgetown University, Washington, D.C.
Huitema, C. 1995. Routing in the Internet. Prentice-Hall, Englewood Cliffs, N.J.
Institute of Medicine (IOM). 1997. The Computer-Based Patient Record: An Essential Technology for Health Care, rev. ed. Dick, R.S., E.B. Steen, and D.E. Detmer, eds. National Academy Press, Washington, D.C.
Jacobson, V. 1988. "Congestion Avoidance and Control," Computer Communication Review 18(4):314-329.
Kent, S., and R. Atkinson. 1998a. Security Architecture for the Internet Protocol. IETF Request for Comment (RFC) 2401, November.
Kent, S., and R. Atkinson. 1998b. IP Authentication Header. IETF Request for Comment (RFC) 2402, November.
Kent, S., and R. Atkinson. 1998c. IP Encapsulating Security Payload (ESP). IETF Request for Comment (RFC) 2406, November.
Marbach, W.D. 1983. "Beware: Hackers at Play," Newsweek 102(September 5):42-46.
Paxson, V. 1997. "End-to-End Routing Behavior in the Internet," IEEE/ACM Transactions on Networking 5(October):601-615.
Perlman, R. 1992. Interconnections: Bridges and Routers. Addison-Wesley, Reading, Mass.
Peterson, L., and B. Davie. 2000. Computer Networks: A Systems Approach. Morgan Kaufmann, San Francisco.
Reed, M.G., P.F. Syverson, and D.M. Goldschlag. 1998. "Anonymous Connections and Onion Routing," IEEE Journal of Selected Areas in Communication 16(4):482-494.
Reiter, M.K., and A.D. Rubin. 1998. "Crowds: Anonymity of Web Transactions," ACM Transactions on Information Systems Security 1(1):66-92.
Reuters. 1999. "AMA, Intel to Boost Online Health Security," October 13.
Schlörer, J. 1981. "Security of Statistical Databases: Multidimensional Transformation," ACM Transactions on Database Systems 6(1):95-112.
Shenker, S. 1995. "Fundamental Design Issues for the Future Internet," IEEE Journal of Selected Areas in Communication 13(7):1176-1188. Available online at <www.lcs.mit.edu/anaweb/pdf-papers/shenker.pdf>.
Skoro, J. 1999. "LMDS: Broadband Wireless Access," Scientific American 281(4):108-109.
Sweeney, L. 1998. "Datafly: A System for Providing Anonymity in Medical Data," in Database Security XI: Status and Prospects, T.Y. Lin and S. Qian, eds. Chapman & Hall, New York.
Zero Knowledge Systems, Inc. 1998. The Freedom Network Architecture, Version 1.0. Available from ZKS, 3981 St. Laurent Blvd., Montreal, Quebec, Canada. December.
Zimmerman, Philip. 1994. The Official PGP Users Guide, Technical report. MIT Press, Cambridge, Mass.
1. Evidence of such latencies can be seen in data collected by the National Laboratory for Applied Network Research, which are available at <http://www.nlanr.net>.
2. ISPs typically have POPs in major urban areas; a large provider might have 30 or more POPs in the United States.
3. The 30 Tbps figure was calculated by multiplying the number of strands per fiber (30) by the number of wavelengths that can be transmitted over each strand (100) and the capacity of each wavelength (10 Gbps). A terabit per second (Tbps) is 10^12 (one thousand billion) bits per second.
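The arithmetic behind this figure can be checked directly:

```python
# Aggregate capacity = strands x wavelengths per strand x rate per wavelength.
strands_per_fiber = 30
wavelengths_per_strand = 100
gbps_per_wavelength = 10

total_gbps = strands_per_fiber * wavelengths_per_strand * gbps_per_wavelength
total_tbps = total_gbps / 1000   # 1 Tbps = 1,000 Gbps
assert total_tbps == 30
```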
4. SONET is a standard developed by telephone companies for transmitting digitized voice and data on optical fibers.
6. The 10 Gbps figure results from multiplying 10 Mbps by 1,000 applications (10 Mbps × 1,000 = 10 Gbps).
7. For example, even if available bandwidth were 10 times greater than the average required, the load on certain links over short time periods could be large enough to impose large delays over those links.
8. IP is a connectionless, packet-switching protocol that serves as the internetwork layer for the TCP/IP protocol suite. It provides packet routing, fragmentation of messages, and reassembly.
9. Because of its reliance on RSVP, the int-serv model sometimes is referred to as the RSVP model.
10. With RSVP, the load on the router can be expected to increase at least linearly as the number of end points increases. Growth may even be quadratic, that is, related to the square of the number of end points (Birman, 1999).
11. An example of a scaling issue for today's ISPs is the size of routing tables, which currently hold about 60,000 routes (address prefixes) each. Entries in the routing table consume memory, and the processing power needed to update tables increases with their size. It is important that such tables grow much more slowly than do the numbers of users or individual applications; storing RSVP state that grew in direct proportion to the number of application flows would therefore be infeasible.
12. The charter of the Integrated Services Over Specific Link Layers working group of the IETF is available online at <http://www.ietf.org/html.charters/issll-charter.htm>.
13. The Department of Defense has a long-standing interest in using multicast technology to support distributed simulations. See CSTB (1997b).
14. One of the more notorious cases occurred when the "414" group broke into a machine at the National Cancer Institute in 1982, although no damage from the intrusion was detected. See Marbach (1983).
15. Unix's Network File System (NFS) protocol, commonly used to access file systems across an Internet connection, has weaknesses that enable a "mount point" to be passed to unauthorized systems. Surreptitious programs called Trojan horses can be exploited to perform actions that are neither desired by nor known to the user.
16. Most U.S. health care providers continue to maintain patient records on paper, but current trends in clinical care, consumer health, public health, and health finance all indicate a shift to electronic records. Without such a shift, the health community's ability to take full advantage of improved networking capabilities would be severely limited. With such a shift, the need for convenient, effective, and flexible means of ensuring security will be paramount.
17. Tools such as Back Orifice can enable a hacker using the Internet to remotely control computers using Windows 95, Windows 98, or Windows NT. Using Back Orifice, hackers can open and close programs, reboot computers, and so on. The Back Orifice server has to be willingly accepted and run by its host before it can be used, but it is usually distributed claiming to be something else. Other such clandestine packages also exist, most notably Loki.
18. For a discussion of key distribution centers, see CSTB (1999), pp. 127-128.
19. For a discussion of some of the limitations of PKI systems, see Ellison and Schneier (2000).
20. It should be noted that when using SSL, data are decrypted the moment they reach their destination and are likely to be stored on a server in unencrypted form, making them vulnerable to subsequent compromise. A number of approaches can be taken to protect this information, including reencryption, which presents its own challenges, not the least of which is ensuring that the key to an encrypted database is not lost or compromised.
21. Whereas one organization may issue a certificate to anyone who requests one and fills out an application, another may require stronger proof of identity, such as a birth certificate and passport. These differences affect the degree of trust that communicating parties may place in the certificates when they are presented for online transactions.
22. Additional information on the Intel initiative is available online at <http://www.intel.com/intel/e-health/>.
23. Participating organizations in the HealthKey initiative are the Massachusetts Health Data Consortium, the Minnesota Health Data Institute, the North Carolina Healthcare Information and Communications Alliance, the Utah Health Information Network, and the Community Health Information Technology Alliance, based in the Pacific Northwest. Additional information on the program is available online at <http://www.healthkey.org>.
24. Users can do this, for example, by registering their public keys with a public facility, such as the PGP key server at the Massachusetts Institute of Technology.
25. Computer scientists generally consider system (or network) availability to be an element of security, along with confidentiality and integrity. As such, availability is discussed within the security section of this chapter. Other chapters of this report discuss availability as a separate consideration to highlight the different requirements that health applications have for confidentiality, integrity, and availability.
26. Cable modems and DSL services are typically not attractive to businesses, either because the number of connected hosts (IP addresses) is limited or because the guaranteed minimum delivered bandwidth is low. In the San Francisco Bay Area, asymmetric DSL delivers anywhere from 384 kbps to 1.5 Mbps, depending on many factors. In other areas, DSL with 256 kbps/64 kbps down/up link speeds costs approximately $50 per month, but costs rise quickly to roughly $700 per month for 1.5 Mbps/768 kbps down/up link speeds.
27. Quality of service mechanisms, such as integrated services, might help ameliorate contention for cable bandwidth, but only if the technology is widely deployed.
28. There is at least one way to hide the identity of the sender: All e-mail applications can be spoofed.
29. Intel Corporation introduced an identifying number into its Pentium microprocessors to help servers identify client machines in the hopes of facilitating electronic commerce. Public concern over the privacy implications of this capability caused the company to take the additional step of providing a means to prevent the number from being revealed.
30. This is essentially what a filtering firewall does: hides the identities (IP addresses) of those behind it.