
On Functions

Computer-Mediated Collaboration

Loren Terveen

AT&T Research

Interface...Interaction...Collaboration

A narrow view of the human-computer interface focusing on superficial "look-and-feel" issues is unproductive. It offers neither deep understanding nor practical design guidance. Even simple interface decisions may require significant knowledge about people's interaction with a system. Three interfaces provide practical examples: the popcorn button on microwave ovens, the VCR+ (video cassette recorder) system, and the ATM (automated teller machine) fast cash withdrawal button. Each of these interfaces was added years into the product cycle in response to people's actual use of the products. At a theoretical level, Hutchins et al. (1986) couched their seminal analysis of direct manipulation interfaces in terms of users' cognitive situation and resources, a general model of tasks, and the coupling between user goals and interface features. Their analysis shows why interface design decisions cannot be made on the basis of look and feel alone.

Indeed, we also begin to see that people may require even more from systems, namely help with tasks they don't know enough about to do on their own. Norman (1986) observed that the Pinball Construction Set makes it easy to design computerized pinball games but not good games; this takes knowledge about pinball design. More generally, Schoen (1983) discussed how skilled professionals can interpret the state of their work objects to make good decisions; they act and the situation "talks back." The problem is that less skilled people may not be able to understand what the situation is "saying." Fischer and Reeves (1992) studied interactions between customers and sales agents in a large hardware store. They identified crucial knowledge only the sales agents possessed, which they used to help customers. The knowledge included knowing that a tool existed, how to find a tool, the conditions under which a particular tool should be used, and how to combine tools for a specific situation.

People often work together on tasks. Thus, in addition to collaborating with users, another appropriate role for systems is to support human collaboration. The field of computer-supported cooperative work (CSCW) seeks to understand the nature of joint work and design technologies to support it. Important technologies include shared editors, group discussion support tools, and awareness systems.

Even when people do not work together explicitly, they still can benefit from the prior experience and opinions of others. Computational techniques for mining such information and turning it into a reusable asset raise the potential for a form of "virtual collaboration," with some of the benefits of collaboration without the costs of communication or personal involvement.

To summarize, there are three fundamental motivations for collaborative systems and a research approach built on each one:

• Tasks require specialized skills and knowledge -> intelligent collaborative agents
• Work is inherently social -> computer-supported cooperative work
• People can reuse the experience of others -> virtual collaboration

Next I discuss the prospects for collaboration in common tasks supported by the national information infrastructure (NII).

The NII: What People Use It For, Where Collaboration Is Needed

The change from stand-alone to networked computers is transforming computers from desktop tools into windows on the world, from information containers and processors into communication devices.

The World Wide Web is the primary innovation ushering ordinary citizens into this new world, so much of my discussion focuses on the Web. The World Wide Web was designed expressly to support communication and collaboration among geographically distributed colleagues (Berners-Lee et al., 1994). Specifically, it supports information sharing, with the dual aspects of publishing and finding information. As the Web has expanded to embrace a diverse population of users and a broad range of uses, more activities have become important:

• Person-person communication (e.g., through e-mail or "newsgroups").
• Entertainment, arts, and advertising: from Web sites for the latest movies to high-quality on-line magazines to serious (or not-so-serious) artistic sites.
• Commerce: offering items for sale, finding items that match one's interests, and brokering between buyers and sellers.
• Education: for example, on-line course materials, interactive tutorials, and distributed science experiments.

Let us next consider the role of collaboration in these activities:

• Information sharing. Information seekers need assistance in finding high-quality, relevant information in the vast, ever-changing sea of Web sites. Information publishers need assistance in designing functional and attractive interactive applications.
• Person-person communication. All the major CSCW issues arise here (e.g., shared document access, discussion support, awareness aids).
• Entertainment, arts, and advertising. There is great potential for computational agents in interactive fiction, social role-playing environments, and games (Lifelike Computer Characters Conference, http://www.research.microsoft.com/lcc.htm; Maes, 1995).
• Commerce. Computational match-making agents can bring together buyers and sellers. Support for communication protocols such as auctions also is important.
• Education. Computational agents can engage learners. Teachers and students need support for communicating and working together (e.g., to complete assignments and carry out experiments).

Computer-Mediated Collaboration: A Unifying Perspective

A unified research framework offers two main benefits: (1) it advances communication and understanding among researchers by helping them to share and compare methods and results, and (2) it makes it easier to explore designs that integrate different types of collaboration. I propose a perspective of "computer-mediated collaboration": people collaborating with people, mediated by computation. A given instance of computer-mediated collaboration can be characterized by using the following dimensions:

• roles and responsibilities of the human participants;
• nature of the computational mediation, including:
  - how information is acquired, processed, and distributed;
  - whether the information evolves during (and in response to) system usage;
  - temporal properties of the mediation (e.g., synchronous vs. asynchronous, time delays); and
  - nature of the human-computer interaction.

This framework is adequate for describing CSCW and virtual collaboration; both of these explore computational techniques for mediating human collaboration. As applied to collaborative agents, it highlights the involvement of the people who create the agents, both domain experts whose knowledge is modeled in the agents and knowledge engineers (or artificial intelligence researchers) who work with the experts to articulate the knowledge and develop representations and algorithms for using it. It also reminds us of the time and resource costs of the design process.

More deeply, the framework guides us to consider combinations of various types of collaboration. For example, users of a computational agent may not think about its designers when things work; however, when the user-agent interaction breaks down, an effective remedy may be to provide the user access to a knowledgeable human expert, such as the domain expert involved in designing the agent (Terveen et al., 1995). Or when an agent has inadequate knowledge to perform a task on behalf of its user, it might be able to obtain assistance from other agents (Lashkari et al., 1994).

Research Issues

Dividing Responsibility Among People and Computational Agents

People and computers have fundamentally different abilities. Thus, a basic issue is creating divisions of responsibility that maximize the strengths and minimize the weaknesses of each (Terveen, 1995). "Critics" (Fischer et al., 1993) represent a well-known approach that responds to this issue. Critics are agents who observe users as they work in a computational environment and offer assistance from time to time. Users are responsible for the overall course of the work, while critics use domain expertise to help users solve problems and evolve their conception of the problem. While much interesting work has been done in this area, most of it still consists of proof-of-concept explorations. The next step is to develop robust generalizations that can be embedded in toolkits.

Collecting and Evaluating Data Necessary for Virtual Collaboration

Two major approaches to virtual collaboration have been explored. Systems like the Bellcore Recommender (Hill et al., 1995) and Firefly (http://www.firefly.com) ask users to rate objects of interest, such as movies or music. The systems maintain a database of raters and their ratings, compute similarities among raters, and recommend objects to people that were rated highly by other people with similar tastes.

Data-mining approaches (Hill and Hollan, 1994; Hill and Terveen, 1996) attempt to extract useful information automatically from people's normal activities, such as reading and editing documents or discussing topics on netnews. (One goal is to require little or no extra data entry from users.) Abstracted versions of this information are then made available to other people engaged in the same activity.

One of the major issues for both types of approaches is obtaining the necessary information. For ratings systems the question is: Will enough people rate? For data-mining systems, the questions include: Can useful information be extracted automatically? Can it be extracted efficiently (important since quality often comes from aggregating over large amounts of data)? Can it be extracted and reused without violating the privacy of the people who produced it?

Once data (recommendations or ratings) are available, the problem is to evaluate them. One good way to do this is to consider the source; some people are more credible for any given topic. Therefore, computing a person's credibility from available data is a second major problem. One complication is that most interaction on the World Wide Web is anonymous; if one cannot even attribute particular actions or opinions to a person, it is hard to compute his or her credibility. This again raises a potential conflict in values between the privacy of on-line interaction and the attempt to mine information that could be used to enhance interaction.

The credibility problem can be further refined into that of determining good sources (raters/recommenders) for a specific person. Developing effective algorithms for this is precisely what the ratings approach does. However, the problem is harder for data-mining approaches: they operate only on already-available data, and existing data may not always be an adequate source for computing similarities among people.

Introducing Computational Agents into On-line Communities

When an agent participates in an on-line community, such as a newsgroup or text-based virtual reality (e.g., a MUD or MOO), interesting issues arise beyond those faced in single-user human-computer collaboration. I illustrate these issues using PHOAKS (Hill and Terveen, 1996), which serves as a group memory agent that maintains recommended Web pages for a group.

• Will the community accept the agent's participation? Every community has behavioral norms. An agent ought not violate these norms (the norms for an agent may well be different than those for people). Some concern has been expressed that the PHOAKS ranking of Web resources by frequency of mention might distort community behavior (e.g., inducing people to post many spurious messages recommending their favorite resources). Thus, one must consider not only whether an agent respects community norms but also whether its participation may cause others to violate the norms.

• Does the agent make a useful contribution to the community? In Foner's discussion of the interesting social characteristics of the "Julia" agent (http://foner.www.media.mit.edu/people/foner/Julia/Julia.html), he points out that "she" serves useful functions, including taking and delivering messages, giving navigation advice, and sharing gossip. We also should consider to whom an agent is to be useful (e.g., community insiders, newcomers, or outsiders), especially since their interests may be different. For example, PHOAKS could make it easy to contact community participants by e-mail (or even by telephone). While outsiders might find this an attractive way to get information, presumably the community participants would be displeased.

• Will the community help make the agent smarter? An agent begins its participation in a community with some knowledge. If the agent has the capability to learn and the community will offer the necessary input, the agent can improve over time. For example, PHOAKS maintains a ranked set of Web pages for each newsgroup based on its categorization of URL mentions in messages. However, it will miscategorize some URLs, and some important URLs may not be mentioned or may be mentioned infrequently, perhaps because they are so well known (e.g., they may be in the FAQ). Therefore, PHOAKS contains forms that allow people to give opinions on the Web pages it links to and to add additional links. The general problem is to create techniques that let the system obtain performance-enhancing feedback and that people are willing and able to use. Or machine learning techniques may be used that let agents learn on their own.

• Who stands behind the agent? Sometimes community members want to talk to the people behind the agent. Maybe they want more information, or maybe the agent has done something that makes them angry. We have seen both in PHOAKS. People ask questions about the topic of a newsgroup, like where to find a bagpiper. People complain about the way PHOAKS has categorized Web pages; for example, in rare cases a condemnation of a Web page (e.g., from a hate group) is categorized as a recommendation. Sometimes we have manually changed the PHOAKS databases, and less frequently we have modified the categorization algorithms or interface. The general problem is how to provide needed human backup for agents who may be participating in many (e.g., thousands of) different communities at once.

Conclusion

I would like to conclude with two claims. First, if we take the argument of this paper seriously, we need not one but many every-citizen interfaces to the NII. It is specific appropriate types of computer-mediated collaborations that have the potential to increase the access and power of ordinary citizens, not a standard look-and-feel. Second, research must move into the real world. Many of the PHOAKS issues discussed here are ones we did not anticipate but discovered only by wading into the uncontrolled, unpredictable, messy World Wide Web. We have been able to formulate issues, hone our tools, and evaluate our results in ways that we could not have done if we had stayed in our laboratories. At some stage, all promising new research ideas will have to take the same plunge to prove their benefits to the ordinary citizen engaging in life on the NII.

Acknowledgments

I thank Will Hill for our collaboration on PHOAKS and for our many conversations developing and exploring the issues mentioned here.

References

Berners-Lee, T., Cailliau, R., Luotonen, A., Nielsen, H.F., and Secret, A. (1994) The World-Wide Web. Communications of the ACM, 37(8), 76-82.

Fischer, G., and Reeves, B. (1992) Beyond Intelligent Interfaces: Exploring, Analyzing, and Creating Success Models of Cooperative Problem Solving. Applied Intelligence, 1, 311-332.

Fischer, G., Nakakoji, K., Ostwald, J., Stahl, G., and Sumner, T. (1993) Embedding Critics in Design Environments. The Knowledge Engineering Review, 8(4), 285-307.

Hill, W.C., and Hollan, J.D. (1994) History-Enriched Digital Objects: Prototypes and Policy Issues. The Information Society, 10, 139-145.

Hill, W.C., Stead, L., Rosenstein, M., and Furnas, G. (1995) Recommending and Evaluating Choices in a Virtual Community of Use. Pp. 194-201 in CHI'95. ACM Press, New York.

Hill, W.C., and Terveen, L.G. (1996) Using Frequency-of-Mention in Public Conversations for Social Filtering. In CSCW'96. ACM Press, New York. (See also http://www.phoaks.com/phoaks/)

Hutchins, E.L., Hollan, J.D., and Norman, D.A. (1986) Direct Manipulation Interfaces. Pp. 87-124 in Norman, D.A., and Draper, S.W., Eds., User Centered System Design. Erlbaum, Hillsdale, N.J.

Lashkari, Y., Metral, M., and Maes, P. (1994) Collaborative Interface Agents. In AAAI'94. AAAI Press, Seattle, Wash.

Maes, P. (1995) Artificial Life Meets Entertainment: Interacting with Lifelike Autonomous Agents. Communications of the ACM, 38(11), 108-114.

Norman, D.A. (1986) Cognitive Engineering. Pp. 31-61 in Norman, D.A., and Draper, S.W., Eds., User Centered System Design. Erlbaum, Hillsdale, N.J.

Schoen, D. (1983) The Reflective Practitioner. Basic Books, New York.

Terveen, L.G. (1995) An Overview of Human-Computer Collaboration. Knowledge-Based Systems, 8(2-3), 67-81.

Terveen, L.G., Selfridge, P.G., and Long, M.D. (1995) Living Design Memory: Framework, System, and Lessons Learned. Human-Computer Interaction, 10(1), 1-37.

Creating Interfaces Founded on Principles of Discourse Communication and Collaboration

Candace Sidner

Lotus Development Corporation

Today's user interfaces are just too hard to use: they are too complex even for the narrow range of users for whom they were designed. At the same time, they also are impoverished in the range of modalities that they provide to users. While new modalities are becoming available, they could make interfaces even harder to use. What's the solution to this dilemma? Principles of human discourse communication and of human-to-human collaboration are two critically overlooked sources for simplifying interfaces. They offer a means of integrating various modalities and of extending the range of computer users.

Until recently user interface technology has not made use of what is now understood about the principles of discourse that govern human communication or the principles of collaboration that model joint work. This may seem surprising because interfaces are "communication engines" to the functionality of software applications; interfaces are how we get our work done. While the field of computer-supported cooperative work has directed the majority of its concerns at understanding how computers can be used to help people work together, the computer has not been seen as a full-fledged partner in the human collaboration. Interfaces are designed to make collaboration between people better, and to some extent they succeed, but the computer is not a collaborator with any of the people.

The current model of communication in interfaces is rudimentary at best. It is the "interaction" model, which is to say the user invokes a command and gets some, perhaps expected, performance by the computer, rather like when one's dog does a trick on the basis of a command such as "roll over." To communicate, users must choose one- or two-word commands from a menu with a mouse or incant a line of mumbo-jumbo that is meant to command the computer to run a program. Any clarification with the user results from the user responding to "dialogue boxes." This interaction model of communication is, in the weakest sense, a dialogue: some information flows between two agents who are capable of acting on that information. While an interface to a given application may have hundreds of so-called dialogue boxes, dialogue in the human sense does not take place.

There is no structure to the overall dialogue between user and interface from one dialogue box to the next and no memory of past dialogues or commands. Each command and action pairing is taken as completely independent of the next, so that there is no overall organization around the purposive intent of the overall set of "interactions" between the user and the machine.

Just as a dog doesn't always do what you tell it to, computers don't either. The interface is meant to inform users about what the computer can do, but, as we all know, short phrases are especially ambiguous in human language. Users have little means to resolve this ambiguity. If the meaning of a command is not obvious to them, they can at best try it out and hope that it does what they want, or they can make their way through a help system to determine if they are on the right track. All the while, they are required to be very explicit about every reference to objects, such as files, that they make.

While the user bears the burden of being explicit, the interface often communicates with ambiguity back to the user. For example, what should the user conclude is the meaning of the "ok" button in a dialogue box? "Yes, that's fine with me, I agree," or "I understood the words," or "well, I read the words" are possible, though in human discourse these uses of "ok" convey very different responses to the content of utterances in the dialogue. Because users cannot communicate these distinctions, it becomes clear to them that the computer interface does not really know what it is doing. It's just a dumb machine.

Being able to be a collaborator is three steps up on the ladder of communication and work. The first is minimal interaction. Today's interfaces do not pass the "minimality test" because they do not know enough to do so. Only the user does any modeling or remembering of the interaction and its parts. Whatever role the machine has played in the interaction, it completely forgets when it completes the action requested. It also is completely unaware of any difficulty the user may have had in determining the meaning of a command. Capturing this level of interaction provides a bare minimum of interactive understanding: the interface would have a more complete model of the back-and-forth nature of the communication than it does now, even if it did not know why the user wanted to communicate in the first place.

The second step on the ladder of communication and collaboration is slave-like interaction. To perform this way, the interface must have a model of what the user wants to do. Current interfaces do not have such a model. The user's goals and tasks lie completely outside of the interface, and there is no means to say anything about them. No part of the user's goals and tasks is recorded or even recognized.1 Instead, all of this information must reside only in the head of the user. None of it can be found in the application and its interface.

Some interfaces seem to be useful and quite satisfying to users. The metaphors on which they are based are highly predictive for users in determining what to do next. One such example is the interface for checkbook activities. I believe this is because the metaphor has been used to build in a model of the task the user is doing and to represent aspects of the task. The metaphor has also been used to keep the user narrowly focused on the task at hand: to balance a checkbook, write checks, or produce reports based on the checkbook information. As a result, the interface metaphor helps users work and also helps them predict what the interface is likely to do. While the interface is not aware of the task the user is doing, it is designed to do that task and to keep the user highly focused.

While it would be wise to continue to design interfaces carefully using well-thought-out metaphors, doing so will not solve the larger problems concerning interactivity, communication, and collaboration. No one metaphor is powerful enough for all work. Furthermore, lots of smaller applications, each with an interface for performing one set of tasks, leaves the user with lots of tasks to juggle. We still need an interface that communicates and collaborates, one that's at step three on the ladder. How do we get there from here?

There is a great deal more known about human discourse communication that could be used in interfaces today. Recent work in linguistics, natural language processing, and psychology offers principles of communication that can be embodied in interfaces, even when they do not speak full human language. All discourse, of which dialogue is an example, is purposive behavior, and the structure of the discourse is organized and segmented according to purpose. The focus of attention of the discourse is used, among other things, to provide context, which means creating locality in the segments of the discourse for interpreting recent references and to help discourse participants assure that each of them is paying attention to the same items in the discourse (Grosz and Sidner, 1986). Grounding of utterances in the human-computer dialogue2 makes conversation more efficient by allowing people to leave out what is truly obvious to both participants, as well as to slow the conversation down in order to reestablish focus, correct for unwanted ambiguity, and determine the next participant who has the floor.

It is possible to build interfaces that make use of these principles (and associated algorithms, which I will not discuss here) while at the same time simplifying the interface itself. We are doing that in our current work on collaborative interface agents (Rich and Sidner, 1996). To do so, designers will need to think in terms of user purposes (not just what actions the interface permits), the structure of purposes, and the relationship between what the user must communicate and the purposes of the communication. Maintenance of focus of attention will provide users

Citizen interaction with organizations, both businesses and government agencies, used to require face-to-face meetings, filing of handwritten forms, or telephone calls. Automated touch-tone response systems, tied to databases and enabled with synthesized voice response technology, have greatly increased the range of information that citizens can access and have expanded the times at which such data can be accessed. Credit card balances, frequent flier account information, and tax refund status can all be checked through calls to toll-free numbers. Stock and mutual fund trades and bank fund transfers can be initiated via similar means, all without direct interaction with a human being and on a 24-hour basis.

Today, many of the systems that have provided automated telephone access capabilities are moving to Internet-enabled access. This provides a much more powerful and convenient interface, enabling a wider range of data access with faster response and interaction characteristics. Businesses and government agencies are moving rapidly to make information available over the Internet, via the World Wide Web. Massachusetts now supports payment of traffic tickets over the Web, as a first step toward making government more responsive and accessible to citizens. Businesses of every stripe, from financial institutions to mail-order catalog merchants, are providing client access via the Web, in addition to the telephone.

A Vision of Near- and Intermediate-Term NII Interactions

Within the next 5 to 10 years significant improvements in many NII element interfaces will be implemented and widely deployed. The use of technologies such as cable modems, ADSL, and ISDN will significantly improve Internet access speeds for residential users. Continuing improvements in computer technology will increase local processing speed, enabling more sophisticated user interface software. The advent of very low-cost computers, designed specifically for Web browsing and using a television as a video interface, promises to expand the subscriber base into many more households.

Using this model, one can imagine an NII in which citizens' interactions with government agencies, businesses, and one another make substantial use of this environment. Requests for generic data from a vast array of government databases can be made and instantaneously satisfied via Web browsers interacting with servers coupled to massive databases. Interactions for a variety of personal transactions with government agencies also will be enabled (e.g., filing tax forms, making tax payments, or checking one's Social Security account status).

Many catalogs and periodicals now delivered in hard-copy form can be delivered via the Web, as some already are.

After browsing on-line catalogs, clients will place orders for items to be shipped via the postal system or express delivery services, all via the Web. All forms of financial transactions (e.g., credit and debit card purchases, checks, cash exchanges, stock and mutual fund transactions) will be available via the Internet, and many will make use of these facilities instead of hard-copy instruments.

Security, Interfaces, and the NII

As elements of the NII evolve, as described above, they offer increased functionality and improved performance, usually at lower prices. However, security and privacy concerns often are overlooked in this rush to enhance the NII. Cellular phone calls are not only easy to intercept, but the account information used for billing is even easier to acquire via automated means. It has been estimated that the lack of attention to security, if only for this billing authorization information, has cost the cellular phone industry hundreds of millions of dollars in lost revenue. Digital paging systems are highly vulnerable to interception, raising privacy concerns. Theft of service for cable and satellite TV delivery systems is often decried as depriving those industries of significant amounts of revenue. The Caller ID feature for telephone systems is both a blessing and a curse, from a privacy perspective.

As companies have connected corporate computer and network systems to the Internet to facilitate user access, the overall security of these systems has often been degraded. Most of these Internet connections are secured by firewalls, a technology that usually constrains the Internet so as to reduce its capability (even for authorized users) and which ultimately fails to provide high-quality security for the computers being protected. The ease with which electronic mail can be intercepted or forged is appalling. As the first tentative steps are taken toward consumer-level Internet electronic commerce, the on-line literature is replete with examples of technical opportunities for fraud.

For many of these NII elements the technology for improving security and privacy has been available for some time, but often it has not been implemented. Sometimes the reasons are purely economic (e.g., the cost of adding security technology is perceived to make the resulting product noncompetitive). Sometimes time-to-market concerns prevent incorporation of security features (i.e., the delay imposed by adding security features would allow a competitor to offer a product or service sooner and thereby capture market share). However, in some cases the difficulty of providing a good user interface for security technology has been a major impediment.

As underlying communications and computing systems become more complex, there is a natural tendency for the user interface to become more complex, though that need not always be the case. For example, WIMP (windows, icons, menus, pointers)-based operating system interfaces can mask substantial underlying complexity, as illustrated by the contrast between the Apple Macintosh and DOS interfaces at corresponding points in time. However, within the context of a paradigm such as WIMP, increased functionality often results in increased complexity for users, as both Windows 95 and Mac users can attest.

Computer systems have become more complex, and network interactions have become commonplace in the desktop and laptop systems that users employ in home environments. Providing security for such systems has become increasingly difficult. In the 1970s and 1980s much research was devoted to the development of secure operating systems, primarily for use with multiuser systems (e.g., time-sharing systems and servers). However, all of this research into secure operating systems yielded very little that has been commercially successful or widely deployed. Today, operating systems for the desktop computers most commonly used in home environments (e.g., Windows 95 and Mac) have very few security features. Yet these may be the models for the systems that citizens will most commonly employ in their interactions with the NII.

An alternative model, suggested by the "network computer" paradigm promoted by companies such as Oracle and Sun, is a Java interpreter and a Web browser as the operating system replacement. Given the many security problems that have been discovered in Web browsers such as the Netscape Navigator and the rash of Java-based security problems that have been described in the literature, this is hardly an encouraging alternative paradigm. In either case, networked computers of some sort will provide an every-citizen interface to many NII elements. There are fundamental and difficult problems associated with developing highly functional and secure networked computer systems; these problems are exacerbated when there is a requirement to make the systems easy to use by all citizens.

What Is the Hard Problem?

The fundamentally hard problem, as alluded to above, is one of trying to make an increasingly complex system, operated by untrained users, secure in the face of attacks by sophisticated adversaries. Various aspects of this problem are examined below.

As noted above, firewalls are typically used in corporate environments to provide "secure" connectivity to the Internet. One of the major reasons for adopting this strategy is that those responsible for corporate computer security find themselves unable to effectively manage security for individual desktop computers. Instead, by inserting a firewall at the perimeter of the corporate network, the site security administrator can focus his or her attention on managing a single computer (or a small number of computers) devoted to a well-defined and limited task (i.e., controlling the flow of Internet traffic across the security perimeter). In contrast, security management of individual desktop systems is hard because these systems are often directly under the control of users, executing a wide range of software, and based on operating systems that are insecure out of the box.

The control afforded by firewalls is in direct opposition to the Internet goal of facilitating the flow of information between clients and servers. Security administrators are constantly fighting a battle to protect desktop systems and servers by controlling the flow of data (using fairly crude tools), while users clamor for unbridled access to Internet resources. The best a security administrator can hope for is to implement a packet-filtering policy that satisfies most user demands while minimizing attack opportunities.

This tug of war has become worse with the advent of the Web and Java. The Java model calls for loading software from servers into users' computers for local execution, rather than transmitting data for display by a browser and so-called helper applications. In a home environment, if the typical citizen makes use of the same operating system and many of the same applications and is assumed to be even less technically sophisticated than his or her office counterpart, there is even less likelihood that he or she will be able to manage the system in a secure fashion. Moreover, since the system may connect directly to a wide range of Internet servers without the benefit of an intervening firewall managed by a security administrator, the opportunities for successfully attacking such computers are almost boundless.

The network computer (NC) model transforms the problem but does not solve it. Proponents of low-cost NCs describe simple systems without local disk storage and with a minimal operating system (e.g., similar to a Web browser). Applications are downloaded onto the NC over the net, for local execution, via high-speed connections. Historically, one of the most difficult security problems to address is one in which potentially hostile software is imported into a target machine and executed. The "confinement problem" refers to this situation, where the imported software is supposed to be constrained in its access to user data, being granted only the access necessary to perform its advertised task. A Trojan horse is malicious imported software that performs some apparently useful function but also executes some sort of attack on the target system (e.g., destroying data or acquiring data for the attacker). In conventional systems the first challenge for an attacker using a Trojan horse was the problem of introducing his or her software into the target. If stealing data is the goal of the attack, the second problem faced by the attacker is one of exfiltration (i.e., sending the data back to the attacker). In an Internet environment, especially in the context of Web use, this second challenge essentially disappears. Confinement, if successfully implemented, addresses Trojan horse attacks by limiting the data and system resources available to the imported software.

In the past a security-savvy user would never import software into his or her system from other than well-known sources (e.g., major vendors). The introduction of "shareware" and "freeware" into a desktop computer, distributed over the Internet or downloaded from an on-line service such as AOL or CompuServe, flies in the face of this traditional security convention. Yet software distributed in this fashion has become quite popular and is widely used in corporate as well as home computer environments. Functionality has won out over prudent security practice, even though examples of Trojan horse attacks via these software sources are not unknown.

The Java model takes the imported software notion to its ultimate conclusion; it creates a legitimate path for infiltration of software (Java "applets") into user computers, typically via the Internet and the Web. To make this potentially dangerous situation less so, Java applets are supposed to be constrained in terms of the operations they can perform in the client computer. For example, applets are not supposed to have access to the local file system, to read or write user files. Unfortunately, vulnerabilities in the initial implementations of Java interpreters have not successfully confined applets, as promised.

Even if these Java security problems are fixed, it is not clear that this simple model of highly constrained applet behavior will persist. Historically, useful applications have required access to user files, both for reading and writing. If applets are to become powerful tools performing increasingly sophisticated tasks for users, it seems unlikely that this stringent constraint will remain. Thus, one should assume that applets will, in the future, be granted access to user data, whether the data are locally resident on a full-fledged computer or stored on some network file server. Functionality almost always wins out over security.

An even more serious concern is that the user of a Java-enabled Web browser (or of a network computer in the future) may not even know when applets are being loaded into his or her computer. Today, many corporate security administrators urge users to disable Java support in their Web browsers to minimize the potential for this sort of security problem. While use of Java is still rather minimal on the Internet, in time many Web pages may become Java applets, and disabling Java may prevent access to so many sites that users are forced to permit Java execution. So even if the user interface were to alert the user when an applet is downloaded, how would a user know whether that event posed a danger? In principle, if Java environments evolve to a point where the known vulnerabilities are successfully addressed, the user could control which applets were loaded and what operations they were authorized to perform. But is this a realistic expectation? This represents the critical security user interface issue for the NII.

Previous research on computer security showed that it was quite difficult to establish a constrained execution environment for imported software (i.e., to address the confinement problem). Few operating systems were successfully developed to meet this challenge, and very few are deployed today. Java does help address this problem by providing an interpreted environment, and thus it should be possible to remove many means by which imported software might try to circumvent the constraints imposed by the user. However, so far the Java environment has proven to be vulnerable to circumvention by interpreted code, just as security controls in traditional operating systems have proven vulnerable to circumvention by compiled code.

In those operating systems that have attempted to solve the confinement problem, a user interface capable of administering the fine-grained access control required for confinement has been very complex. Compartmented mode workstations represent the most widely deployed systems that offer some form of operating-system-enforced confinement. These Unix-based systems are exceedingly difficult to administer, and the granularity of confinement offered is relatively crude compared to what a corporate or home user might require for controlled execution of applets. To date, there is no indication of how to structure a user interface to make confinement of imported software generally understandable even to fairly sophisticated computer users.

Related Problems and Research Directions

As noted above, confinement is a hard computer security problem that has been studied for almost 20 years. To protect users against malicious imported software (e.g., applets), it will be necessary to offer fine-grained access control to confine the execution of this software. This problem may be easier to solve in a more restricted environment such as that envisioned for network computers, but it is still a hard problem. Thus, the first research problem is the development of an interface that is intelligible to users and empowers them to manage confinement for imported software.

Many of the forms of interaction alluded to above require authentication in order to provide security. The user must be authenticated to the server, and the server must be authenticated to the user. The current proposals for how to accomplish this requirement center around the use of public key cryptography and certificates (e.g., for validating digital signatures and for exchanging keys).

However, we have no experience with public certification systems of the scale required to support all of the citizens of the United States and all of the commercial and government service providers. In small-scale trials of certification systems for applications such as e-mail, one of the problems that quickly becomes apparent is the difficulty of presenting the right amount of authentication information to the user. If one displays full certification path data, the user will almost certainly be overwhelmed. If the only data displayed are from the final certificate in a path, suitable constraints must be imposed on the certification path validation algorithm to prevent "surprises." However, configuring and managing certification path validation parameters appears to be a fairly complex task in the general case. Noting that the average citizen cannot program a video cassette recorder successfully, it is hard to imagine how this individual could manage a more complex certification validation system.

This problem shows up in many ways. The Java architecture calls for applets to be digitally signed to verify their provenance (e.g., to detect modification of the applet after it was released by the developer or vendor). However, if it is hard to display the right level of detail to a user once the signature is validated, the signatures may not really address the fundamental problem of provenance.

Authenticating the identity of a server to which the user has connected poses a similar problem. Small variations in the spelling of a server's name embedded in a public key certificate could easily lead a user to believe that he or she was connected to one (legitimate) server when another had actually been contacted. This analysis suggests that a critical area requiring additional research is how to provide a user interface to manage certification graphs, so that users are truly aware of the identities of the people and organizations with whom they are dealing in cyberspace.

Many proposals for securing transactions generated by a user rely on the user employing a private digital signature key to sign the transaction. However, the user never directly sees the data being signed; he or she relies on software on his or her computer to indicate what is being signed. Thus, a user's signature may be applied to transactions or messages other than the ones he or she intends if malicious software manipulates the user interface.
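This what-you-see-is-what-you-sign problem can be demonstrated in a few lines, here using the modern Python cryptography package as a stand-in for a personal signing token (an anachronism relative to this paper, but the point is unchanged): the key signs whatever bytes the software supplies, with no way to tell whether they match what the user saw.

    # pip install cryptography  (stand-in for a hardware signing token)
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    private_key = Ed25519PrivateKey.generate()  # would live inside the user's token
    public_key = private_key.public_key()

    displayed = b"pay bookstore $20"    # what the user approved on screen
    submitted = b"pay attacker $2000"   # what compromised UI code actually sends

    signature = private_key.sign(submitted)  # the key signs blindly
    public_key.verify(signature, submitted)  # verifier is satisfied: no exception

    # The signature is mathematically valid, yet the user authorized something
    # else entirely. Only a trustworthy display path between user and token,
    # not more cryptography, can close this gap.
    print("bytes actually signed:", submitted.decode())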
Several proposals call for citizens to make use of kiosks for some transactions, both with government and commercial entities. This seems especially attractive as a means to empower low-income households that might not otherwise have access to services via a home computer. However, not long ago criminals managed to place a fake ATM (automated teller machine) in a shopping mall as a means of acquiring user bank account numbers and PINs (personal identification numbers). A higher-tech version of this scam could be effected with kiosks and might be much harder to detect. The fake kiosks could provide access to legitimate servers on behalf of the user, completing valid transactions on his or her behalf. However, without the user's permission, kiosks could also effect unauthorized transactions at the same time (e.g., applying the user's signature to unauthorized transactions for money transfers).

A final research problem area is how to provide a user interface for personal cryptographic tokens so that users will be protected from malicious software that will attempt to misapply a user's digital signature capability.

Research to Support Widespread Access to Digital Libraries and Government Information and Services

Ben Shneiderman

University of Maryland

The rapid growth of the World Wide Web provides compelling testimony to the impact of improved user interfaces. Although FTP (file transfer protocol), Gopher, WAIS (wide area information service), and other services produced active usage, it was the appearance of easy-to-use embedded menu items and appealing graphics that produced the current intensity of use. Public interest continues to grow dramatically, and national policy is being effected in terms of providing access to government information and services.

Early adopters, who are typically technologically sophisticated, are highly motivated to overcome poor designs and push beyond the difficulties to achieve their objectives. However, the much larger number of middle and late adopters are less likely to tolerate chaotic screens; unnecessarily lengthy paths; slow response times; inconsistent terminology; awkward instructions; inadequate help facilities; and missing, wrong, or out-of-date information. A proactive approach can ensure that the emerging technology will provide accessible, comprehensible, predictable interfaces that serve the needs of the majority. A prompt and moderate level of research effort can shape the evolution of user interfaces to match the skills, needs, and orientation of the broadest range of users. Topics might include the following:

• Cognitive design strategies for information-abundant Web sites, including metaphor choice (library, shopping mall, television channels, etc.), navigation design, and visual overviews;
• Recognition and support for the distinct needs of diverse user communities, such as elderly, young, handicapped, lower-income, minority, and rural users, plus those with poor reading skills;
• Control panels to allow user tailoring to individual abilities, limitations, and technology;
• Strategies to cope with efficient construction and maintenance of text and graphic versions, multiple browser support, varied user display devices, and voice output;

• Empirical studies of high- versus low-fanout strategies (shallow versus deep trees), compact vertical design to reduce scrolling, benefits of reduced/increased graphical treatments, and impact of slow response time;
• Web site construction languages and templates, software tools to verify visual and textual consistency, Web site management and terminology control, and thesaurus construction;
• Sequencing, clustering, and emphasis of information items according to designer goals;
• Web-oriented user interface design to support browsing directories, searching for key phrases in document databases, and performing database searches;
• Design strategies to support evolutionary learning of complex sites and services;
• Easy-to-use facilities to permit user construction of informational Web sites, community services, and entrepreneurial initiatives;
• Low-cost computing devices and low-cost network access (the "Web-top computer");
• Refined feedback and evaluation methods to guide designers, including usability testing, expert reviews, field trials, interviewing users, focus groups, e-mail surveys, and e-mail suggestion boxes;
• Simple privacy protection and secure transmission of financial, medical, or other data;
• Image compression methods to reduce file sizes while best preserving image detail, texture, and color richness; and
• Logging and monitoring software, visualization of usage patterns for individuals and aggregates, and cost-benefit analyses.

Coordination with relevant groups can avoid redundant efforts and support common goals. Current activities include the following:

• Library of Congress National Digital Library Program, in cooperation with the University of Maryland (ben@cs.umd.edu);
• National Research Council project on ordinary-citizen interfaces (Alan Biermann, Chair, Duke University, awb@cs.duke.edu) [the project reported on in this volume];
• Stanford University effort to coordinate database services (contact: Hector Garcia-Molina, hector@cs.stanford.edu);
• U.S. government efforts such as GILS (Government Information Locator Service);
• USACM project, The Interface Between Policy and Technology in Providing Public Access to Government Data (contact: Randy Bush, randy@psg.com);

• Joint effort on digital libraries by the National Science Foundation, National Aeronautics and Space Administration, and Defense Advanced Research Projects Agency; and
• International efforts (e.g., Canada, Singapore, Italy).