Copyright © 2009. National Academy of Sciences. All rights reserved.
4
Design and Evaluation
Designing any sort of computer-mediated device for ordinary
people for effective and pleasant everyday use has proven to be
surprisingly difficult. The evidence for this observation comes
from the myriad problems cited above in this report and at the
workshop organized by the steering committee, from systematic
empirical studies cited in this chapter, and from anecdotes
involving frequent complaints from ordinary people when they are
required to use the currently most common public-oriented
application (telephone-based voice response menu systems), as well as
from more sophisticated users of the World Wide Web concerning the
complexities and frustrations that have led as many to abandon the
on-line life as to join it. (Consideration of the experiences and
needs of people without specific special needs, referred to here as
"ordinary" people, is an important complement to discussion of
those with special needs (see Chapter 2) for developing ideas for
research to support interfaces that work for more, if not most, of
the population.)
It is, of course, possible that the greater power, utility, and
desirability of computer-based functions as compared to traditional
mass-market technologies (e.g., television, telephony) mean that
greater difficulty of use is inevitable, worth a high price in
human effort and inconvenience, and solvable only by increased
education with its concomitant risk of leaving out those with
insufficient time, resources, or ability. However, an alternate
view is that it should be possible to use the power of the new
technologies not only to do more and better things but also to do
most of them at least as, or more, easily. Much of the burden of
introducing new
information technologies to the public can be removed or
relieved by better design of the functions and interfaces with
which most people will deal.
The steering committee assumes that it is often or usually
possible to design more widely useful functions and to make them
easier to use through design activities specifically aimed at these
goals. Proof of the existence of this opportunity is readily
available, beginning with popular knowledge of such consumer
devices as cars and television sets, which were very complex
initially but became, from the user's perspective, less so through
sequences of adjustments over time. The Handbook of
Human-Computer Interaction (Helander, 1988) contains many
examples of prohibitively difficult systems made very much easier
and more effective by redesign, and many more recent examples are
reviewed by Nielsen (1993) and Landauer (1995). Some of these
successes are reviewed in more detail below in this chapter. To set
the stage, one is mentioned here that involves comparatively simple
store-and-forward (as opposed to more complex multimedia,
hypermedia, or collaboration support) technology, a case that has
particular relevance to many of the expected uses in the
every-citizen interface (ECI) environment.
Gould et al. (1987a) designed an electronic message system for
the 1984 Olympics in Los Angeles. The system was to be used by
athletes, coaches, families, and members of the press from all
corners of the globe. The original design was done by a very
experienced team at IBM's T.J. Watson Research Center. When first
tested in mock-up with representatives of its intended user
population, it was virtually impossible to operate effectively. By
the time an extensive program of iterative user testing and
redesign was finished, more than 250 changes in the interface, the
system-user dialogue, and the functionality were found to be
necessary or advantageous. The final system was widely used without
any special training by an extremely diverse population. Another
example comes from the digital libraries context and relates to the
Cypress on-line database of some 13,000 color images and associated
metadata from the Film Library of the California Department of
Water Resources (Van House, 1996). Iterative usability testing led
to improvements for two groups of users, a group from inside the
film library and a more diverse and less expert group of outsiders.
Both direct user suggestions and ideas based on observing users'
difficulties gave rise to design changes that were implemented
incrementally.
A central research challenge lies in better design and
evaluation for ordinary use by ordinary users and, more basically,
in how to accomplish these goals. The future is not out there to be
discovered: it has to be invented and designed. The scientific
challenge is to understand much better than we do now (1) why
computer use is difficult when it is, (2)
how to design and ensure a design for easier and more effective
use, and (3) how to teach effectively both school children and
those past school age to take advantage of what there is to use (a
complex topic outside the scope of this report).
Available research and expert opinion point to at least three
reasons why many computer-mediated tools (including, especially,
communications systems) are currently difficult or ineffective for
use by a large part of the population: (1) complexity and power of
computer-mediated tools, (2) emphasis on users with unusual
abilities, and (3) sophistication of designers and their
discipline.
The Problem
Complexity and Power of Tools
Computer-mediated tools, as compared with traditional
technologies, can be extremely powerful and complex, doing a vast
array of different things with enormous speed. Of course, this is
their advantage and appeal, but it is also their temptation. It
means that a communications facility such as e-mail can be designed
not only to let a user send an asynchronous text message to another
subscriber but also to send multiple messages, create mailing
lists, respond automatically, forward, save, retrieve, edit, cut
and paste, attach attachments, create vacation messages, fax, and
so on. If the design is not handled extremely well, users will have
to learn how to negotiate this vast array of options, to know about
them and how to operate them if they want to use them, and at least
how to ignore them if they do not, and will always be required
somehow to choose whether and what. The situation can become
analogous to providing the cockpit control panel of an airliner for
use by its passengers to turn on their reading lights. The
consequences in computing range from the proliferation of features
in software products to observations that most amateur spreadsheets
contain serious errors, and that employee hand-holding costs as
much as hardware for business personal computer users, to
additional but seldom-used features on standard computer
keyboards.1 The concept of
multimodal interfaces that would accommodate alternative approaches
to input and/or output, discussed in Chapters 2 and 3, will
introduce considerable complexity into the technology development
process, without adding any new functional features.
Great power and complexity also bring the opportunity to make
very costly errors. Pressing the wrong key on an ordinary telephone
touch-tone pad leads at worst to a wrong number. With a
computer-mediated system it can, and often does, lead to hours of
lost work or inadvertently sending, for example, a "take me off
this mailing list" message to 300
people, many of whom also wish they were not on it. Laboratory
studies have found repeatedly that the majority of user time spent
with popular applications such as word processors (which will be
incorporated into many ECI applications) or spreadsheets is
occupied with recovery from errors (see, for example, Card et al.,
1983). This is one of two reasons why computer-mediated activities
(see below for the second) create very much more variability in
task completion times than do traditional technologies (Egan,
1988). Contemporary discussions in the business and personal press
about the "futz factor" (extra time and effort to adjust various
aspects of a computer-based system) attest to continuing problems
resulting from increased complexity and power. The irony is that in
some cases (e.g., early cellular phones, personal computer
software), a significant amount of complexity appears to derive
from software and sometimes hardware added with the intention of
"enhancing" usability.2
Emphasis on Users with Unusual Abilities
Computer-mediated tools emphasize individual differences in
ability more than do traditional technologies. Egan (1988) reviewed
a large number of studies of individual differences in the time
taken and errors made in using common computer applications. In
every case in which comparisons could be made, the variability
among different people was much greater when they used computers
rather than precomputer approaches to doing the same sorts of
operations. An approximate summary of the data from these studies
is that while most traditional tasks, such as operating a
conventional cash register, calculating a sum (manually), or
running around the block, will take about twice as long for the
slowest of 20 randomly chosen people as for the fastest, in
computer-mediated tasks the range is never that small; typically,
it is around 4 or 5 to 1, and may be as high as 20 to 1, even among
well-trained and experienced users such as professional
programmers.
In several instances a good portion of the greater
between-individual differences in computerized tasks has been
traced to measurable differences in cognitive abilities. In the
aggregate, workshop participants commented, such differences
contribute to observations about the concentration of computer use
among teenage males; they also contribute to reports in the
business press about the frustrations of "information overload."3 Egan (1988) and Landauer (1995)
reviewed studies in which measures of spatial memory, logical
reasoning, and verbal fluency, as well as age and interest in
mathematics and things mechanical, show greater than two-to-one
differences between the highest and lowest quarter of the sampled
potential user populations (see Figures 4.1 and 4.2 for examples).
The participants in the studies illustrated were mostly noncareer
middle-class
suburban women with little or no computer experience, fairly
representative of the average citizens one might expect to be
future network users, although not of their range. How significant
a problem is this? One guess comes from studies of the efficiency
gains expected for computer applications to common business tasks.
From the sparse available data, Landauer (1995) estimated that
computer augmentation speeds work by 30 percent on average (with
large variations). Combining this with the individual-difference
estimates and a normal probability distribution suggests that about
a third of the population would usually be better off without
computer help as now provided because they do not possess the basic
abilities prerequisite to its effective use. This is without
consideration of the part of the population ordinarily designated
as disabled or disadvantaged.
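
The arithmetic behind an estimate of this kind can be sketched as follows. This is only an illustration, not a reconstruction of Landauer's calculation: it assumes that the gain from computer use is normally distributed across people around the 30 percent average cited above, and the spread values tried are hypothetical, since the text does not report one. The function names are likewise invented for this sketch.

```python
from statistics import NormalDist

# Average speedup from computer augmentation, as cited in the text above.
MEAN_GAIN = 0.30


def fraction_worse_off(sd_gain: float, mean_gain: float = MEAN_GAIN) -> float:
    """Share of people whose gain falls below zero.

    Assumes (for this sketch only) that gains are normally distributed
    across the population with the given mean and standard deviation.
    """
    return NormalDist(mu=mean_gain, sigma=sd_gain).cdf(0.0)


def implied_sd(target_fraction: float, mean_gain: float = MEAN_GAIN) -> float:
    """Spread of gains that would leave `target_fraction` of people below zero."""
    z = NormalDist().inv_cdf(target_fraction)  # negative for fractions under 0.5
    return -mean_gain / z


if __name__ == "__main__":
    # Hypothetical spreads; none of these values comes from the source.
    for sd in (0.3, 0.5, 0.7, 1.0):
        print(f"sd={sd:.1f}: {fraction_worse_off(sd):.0%} worse off")
    print(f"sd implied by a one-third share: {implied_sd(1 / 3):.2f}")
```

Under these assumptions, a one-third share of people who would be better off without computer help corresponds to a standard deviation of gains of roughly 0.7, that is, a spread more than twice the size of the average gain itself. That is consistent in spirit with the very large individual differences described above, although the exact figures depend entirely on the assumed distribution.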
While education and training can usually reduce individual
differences, there are two reasons why computer-mediated tasks may
be less susceptible to this solution. One is the aforementioned
vastly greater complexity usually offered: the much larger variety
of different functions available and alternate means for achieving
the same effects (e.g., five or more ways to cut and paste in most
recent text editors). This variety often means that it can take
longer to acquire high skill, akin on a smaller scale to the
greater difficulty of learning to fly a jetliner than to drive a
car. It also means, often, that some users will find better ways to
operate their system than will others, not because there are large
differences in which method serves which person best (such
"treatment-aptitude interactions," despite widespread folk belief
in their existence, have virtually never been found in carefully
controlled studies) but merely through chance variations in which
operations users learn first, make habitual, and thus allow to
become dominant over other possibilities that it thereafter takes
excessive time to find and retrain for.
The second reason that computer training is less helpful than
training for earlier technologies is the much more rapid and
challenging changes in the technology itself. The basic automobile,
typewriter, and telephone have not changed significantly from the
user's perspective in almost half a century, and changes from their
very beginning have been few, slow, relatively minor, and learnable
without help (nonuse of extra features, such as the clocks on video
cassette recorders or cruise control on cars, does not tend to be
associated with an inability to use these devices for their
essential functions). By contrast, every new model of a personal
computing software package, even from the same manufacturer, comes
with many new features and functions, new menu arrangements with
new labels, and a large instruction book (and built-in help
system). Such enhancements can affect even basic tasks. And every
few years another new computer-based technology is offered. Thus,
there is simply not the time available to consider yearlong high
school courses for each computer
technology every citizen might want to use (this year e-mail,
next year the World Wide Web) as there was for typewriters and
accounting machines, or 7-year apprenticeships, as there were for
steam shovels and looms; the systems would be obsolete and gone
before expertise was gained. The result is that high-functionality
computational systems are never completely learned nor is their
power fully exploited, and the primary learning strategy is based
on learning on demand (Fischer, 1991). The challenge is to design
so as to exploit the potential power for ease of learning and use
as well as for increased functionality. Discontent with
proliferating features contributed to mid-1990s experiments with
so-called network computers, with fewer features than conventional
personal computers, as well as to periodic articles in the business
press about the persistently high costs of owning and using
personal computers.4
Several members of the steering committee and reviewers of a
draft of this report wondered whether the low efficiency gains and
large individual differences found in studies in the 1980s may have
been overcome by technological advances in the 1990s. Although
market statistics attest to growing use of information
technologies, the sparse empirical evaluations of these issues in
earlier periods appear to have become no more common in the past 5
years. While it was not possible to mount a systematic search for
empirical evidence on trends in usability, the consensus of the
usability engineers on the steering committee and among workshop
participants was that things have not in general improved: for the
most part, technological advances, particularly in software, have
increased complexity, and, while some vendors are doing more
usability testing, increased competition to be first to market with
new features has brought a growing tendency to omit the kinds of
early and iterative design and evaluation activities these experts
think are essential to ensure ease of learning and use. In addition,
what testing is done often generates results that vendors hold
closely in the interests of gaining or preserving proprietary
competitive advantages.
Market forces alone are unlikely to yield interfaces for every
citizen because the rapid pace of the commercial market fosters an
emphasis on sales performance as an indirect measure of value or
effectiveness rather than direct presale evaluation of interface
quality. At the workshop, Dennis Egan suggested several reasons for
the lack of attention to interface evaluation by the industry.
First, industrial research groups have reoriented themselves toward
identifying near-term profit-making products and services, not
performing longer-term research to evaluate new interface concepts
and technologies and usually not publishing helpfully detailed
results when they do. Second, the acceleration of product life
cycles, particularly for software, leaves little time for interface
evaluation studies. Third, information technology products may
succeed despite
having inferior user interfaces by supporting highly desired
functions, reaching the market before their competition and
becoming de facto standards. Workshop participants from industry
and report reviewers emphasized the commercial dependence on
marketplace Darwinism, noting that vendors seem to find fielding
their best guesses in products more cost effective than added
precommercialization testing. Some went further to suggest that the
World Wide Web had provided a mechanism for harnessing market
cycles, noting that some vendors are using Web sites for beta
testing of products and for eliciting feedback from those users
(mostly sophisticated and eager "early adopters" unrepresentative
of the average citizen) who opt to try the products. The constant
release of beta versions of software over the Web represents a
limited kind of software evaluation and user involvement on a
massive scale; some of these releases are now reviewed, sometimes
even on the basis of modest empirical tests, in trade publications,
and usability and other design experts are dedicated by some
vendors to some releases. Several companies are using this
mechanism for iterative design. However, work by Hartson et al.
(1996) suggests that methods for using "the Web as a usability lab"
effectively, while promising, are in their infancy and face a
number of problems that will be resolved only by considerable
research. For example, this approach will require significant
innovation in system instrumentation and user sampling techniques
because, as outlined later, the untutored opinions of programmers
and other power users are usually of little value for detecting the
functionality and usability problems that are important for
ordinary people (Nielsen, 1993). Tracking such efforts in
broad-based user involvement and assessing their effectiveness
might provide a productive starting place for research on
large-scale participatory design and evaluation methods.
Sophistication of Computer Hardware and Software Designers and Their Discipline
Most of the people involved in the design and implementation of
functions and interfaces for computer applications are themselves
sophisticated computer users. Feature requests and inventions come
primarily from experienced users and are supplemented and
implemented by programmers. The situation is unlike that in other
consumer-oriented technologies in two important respects. As noted
earlier, computers offer and usually provide a larger range of
functions and controls and therefore almost always greater
complexity in the choices and actions required of the user. Hence,
expertise with a computer technology can often play a much greater
role in its use. Computer technology started as an aid for a highly
technical portion of the population: scientists, engineers, and