
The National Academies Press


Voice Communication Between Humans and Machines (1994)
National Academy of Sciences (NAS)


"Speech Technology in 2001: New Research Directions." Voice Communication Between Humans and Machines. Washington, DC: The National Academies Press, 1994.



Page 468

Rapid advances in very-large-scale integrated (VLSI) circuit capabilities are creating a revolution in the world of computers and communications. These advances are creating an increasing demand for sophisticated products and services that are easy to use. Automatic speech recognition and synthesis are considered to be the key technologies that will provide the easy-to-use interface to machines.

The past two decades of research have produced a stream of increasingly sophisticated solutions in speech recognition and synthesis (Rabiner and Juang, 1993). Despite this progress, the perception remains that the current technology is not flexible enough to allow easy voice communication with machines. This chapter reviews the present status of this important technology, including its limitations, and discusses the range of applications that can be supported by our present knowledge. But as we look into the future and ask which speech recognition and synthesis capabilities will be available about 10 years from now, it is also important to discuss the technical challenges we face in realizing our vision of the future and the directions in which new research should proceed to meet these challenges. We will examine these issues in this chapter and take a critical look at the shortcomings of the current speech recognition and synthesis algorithms.

Much of the technical knowledge that supports the current speech-processing technology was created in a period when our ability to implement technical solutions on real-time hardware was limited. These limitations are quickly disappearing, and we look to a future at the end of this decade when a single VLSI chip will have a billion transistors to support much higher processing speeds and more ample storage than is now available.

The speech recognition and synthesis algorithms available at present work in limited scenarios. With the availability of fast processors and a large memory, tremendous opportunity exists to push speech recognition technology to a level where it can support a much wider range of applications. Speech databases with utterances recorded from many speakers in a variety of environments have been important in achieving the progress that has been realized so far. But on the negative side, these databases have encouraged speech researchers to rely on trial-and-error methods, leading to solutions that are narrow and that apply to specific applications but do not generalize to other situations. These methods, although fruitful in the early development of the technology, are now a hindrance as we become much more ambitious in seeking solutions to bigger problems. The time has come to set the next stage for the development of speech technology, and it is important to realize that a solid base of scientific understanding is
