ties are occupied, unavailable, or not usable by the human (e.g., for users with visual or motor disabilities). One motivation for human-computer interaction by voice is that voice interfaces are considered "more natural" than other types of interfaces (e.g., keyboard, mouse, touch screen). That is, speech interfaces can provide a "look and feel" that is more like communication between humans. The underlying assumption is that, by presenting this "more natural" interface to the user, the system can take advantage of skills and expectations that the user has developed through everyday communicative experiences to create a more efficient and effective transfer of information between human and machine (Leiser, 1989).
A successful human-machine interaction, like a successful human-human interaction, is one that accomplishes the task at hand efficiently and easily from the human's perspective. However, current human-computer voice-based interactions do not yet match the richness, complexity, accuracy, or reliability achieved in most human-human interactions, either for speech input [i.e., automatic speech recognition (ASR) or speech understanding] or for speech output (digitized or synthetic speech). This deficit is due only in part to imperfect speech technology. Equally important is the fact that, while current automated systems may contain sufficient domain knowledge about an application, they do not sufficiently incorporate other kinds of knowledge that facilitate collaborative interactions. Typically, an automated system is limited in both linguistic and conceptual knowledge. Furthermore, automated systems using voice interfaces have an impoverished appreciation of conversational dynamics, including the use of prosodic cues to maintain appropriate turn taking and the use of confirmation protocols to establish coherence between the participants.
A well-designed voice interface can alleviate the effects of these deficiencies by structuring the interaction to maximize the probability of successfully accomplishing the task. Where technological limitations prohibit the use of natural conversational speech, the primary role of the interface is to induce the user to modify his/her behavior to fit the requirements of the technology. As voice technologies become capable of dealing with more natural input, the user interface will still be critical for facilitating the smooth flow of information between the user and the system by providing appropriate conversational cues and feedback. Well-designed user interfaces are essential to successful applications; a poor user interface can render a system unusable.
Designing an effective user interface for a voice application involves consideration of (a) the information requirements of the task,