Simple voice communication with machines is now deployed in personal computers, in the automation of long-distance calls, and in voice dialing of mobile telephones. These systems have small vocabularies and strictly circumscribed task domains. In research laboratories there are advanced human-machine dialogue systems with vocabularies of thousands of words and intelligence to carry on a conversation on specific topics. Despite these successes, it is clear that the truly intelligent systems envisioned in science fiction are still far in the future, given the state of the art today.
Human-machine dialogue systems can be represented as a four-step process, as shown in Figure 1. This figure encompasses both the simple systems deployed today and the spoken language understanding we envision for the future. First, a speech recognizer transcribes sentences spoken by a person into written text (Makhoul and Schwartz, in this volume; Rabiner and Juang, 1993). Second, a language understanding module extracts the meaning from the text (Bates, in this volume; Moore, in this volume). Third, a computer (consisting of a processor and a database) performs some action based on the meaning of what was said. Fourth, the person receives feedback from the computer in the form of a voice created by a speech synthesizer (Allen, in this volume; Carlson, in this volume). The boundaries between these stages of a dialogue system may not be distinct in practice. For instance, language-understanding modules may have to cope with errors in the text from the speech recognizer, and the speech recognizer may make use of grammar and semantic constraints from the language module in order to reduce recognition errors.
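The four stages can be sketched in code as a chain of functions, each consuming the output of the one before it. The following Python sketch is illustrative only and is not taken from any system described in this volume: the voice-dialing task, the phone book, and all names (`recognize`, `understand`, `act`, `synthesize`, `Meaning`, `PHONE_BOOK`) are assumptions made for the example, and the recognizer and synthesizer are trivial stand-ins that operate on strings rather than on audio.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy phone book standing in for the database of stage three.
PHONE_BOOK = {"alice": "555-0100", "bob": "555-0199"}


@dataclass
class Meaning:
    """Toy semantic frame produced by the language understanding stage."""
    intent: str
    name: Optional[str] = None


def recognize(audio: str) -> str:
    """Stage 1 (stand-in): a real recognizer maps a waveform to text.
    Here the 'audio' is already a string, so it is only normalized."""
    return audio.lower().strip()


def understand(text: str) -> Meaning:
    """Stage 2: extract a meaning from the text with a trivial keyword rule."""
    words = text.split()
    if len(words) >= 2 and words[0] in ("call", "dial"):
        return Meaning(intent="dial", name=words[1])
    return Meaning(intent="unknown")


def act(meaning: Meaning) -> str:
    """Stage 3: perform an action (a phone book lookup) based on the meaning."""
    if meaning.intent == "dial" and meaning.name in PHONE_BOOK:
        return f"Dialing {meaning.name} at {PHONE_BOOK[meaning.name]}"
    if meaning.intent == "dial":
        return f"No number found for {meaning.name}"
    return "Sorry, I did not understand"


def synthesize(response: str) -> str:
    """Stage 4 (stand-in): a real synthesizer would produce audio.
    Here the response text is only wrapped in a marker."""
    return f"<speech>{response}</speech>"


def dialogue_turn(audio: str) -> str:
    """Run one utterance through all four stages in order."""
    return synthesize(act(understand(recognize(audio))))
```

For example, `dialogue_turn("Call Alice")` passes through all four stages and yields a spoken confirmation, whereas an utterance outside the toy grammar falls through to the "did not understand" response. The strictly one-way chaining is a simplification: as noted above, in practice the recognizer may draw on constraints from the language module, so the stage boundaries are less clean than this sketch suggests.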
In the 1993 "Colloquium on Human-Machine Communication by Voice," sponsored by the National Academy of Sciences (NAS), much of the discussion focused on practical difficulties in building and deploying systems for carrying on voice dialogues between humans and machines. Deployment of systems for human-to-machine communication by voice requires solutions to many types of problems