National Academy of Sciences | 150 Year Anniversary

Questions? Call 800-624-6242

| Items in cart [0]

The National Academies Press

HARDBACK
price:$89.95
add to cart

Rights & Permissions

topleft topright

Voice Communication Between Humans and Machines (1994)
National Academy of Sciences (NAS)

Citation Manager

. "Computer Speech Synthesis: Its Status and Prospects." Voice Communication Between Humans and Machines. Washington, DC: The National Academies Press, 1994.

Please select a format:

BibTeX EndNote RefMan


Page
108
bottomleft bottomright

The following HTML text is provided to enhance online readability. Many aspects of typography translate only awkwardly to HTML. Please use the page image as the authoritative form to ensure accuracy.


Page 108

consoles no longer exist. Also, speech synthesis has improved a great deal. The best systems—of which the current Bell Labs system is surely an example—are entirely intelligible, not only to their creators but also to the general population, and sometimes they even sound rather natural. Here I will give a personal view of where this technology stands today and where it seems to be headed. This assessment distills contributions from participants in the colloquium presentations and discussion, who deserve the credit for any useful insights. Omissions, mistakes, and false predictions are of course my own.

The ongoing microelectronics revolution has created a striking opportunity for speech synthesis technology. The computer whose console was mentioned earlier could not do real-time synthesis, even though it filled most of a room and cost hundreds of thousands of dollars. Almost every personal computer now made is big and powerful enough to run high-quality speech synthesis in real time, and an increasing number of such machines now come with built-in audio output. Small battery-powered devices can offer the same facility, and multiple channels can be added cheaply to telecommunications equipment.

Why then is the market still so small? Partly, of course, because the software infrastructure has not yet caught up with the hardware. Just as widespread use of graphical user interfaces in applications software had to wait for the proliferation of machines with appropriate system-level support, so widespread use of speech synthesis by applications will depend on common availability of platforms offering synthesis as a standard feature. However, we have to recognize that there are also remaining problems of quality. Today's synthetic speech is good enough to support a wide range of applications, but it is still not enough like natural human speech for the truly universal usage that it ought to have.

If there are also real prospects for significant improvement in synthesis quality, we should consider a redoubled research effort, especially in the United States, where the current level of research in synthesis is low compared to Europe and Japan. Although there is excellent synthesis research in several United States industrial labs, there has been essentially no government-supported synthesis research in the United Sates for some time. This is in sharp contrast to the situation in speech recognition, where the ARPA's Human Language Technology Program has made great strides, and also in contrast to the situation in Europe and Japan. In Europe there have been several national and European Community-level programs with a focus on synthesis, and in Japan the ATR Interpreting Telephony Laboratory has made significant investments in synthesis as well as recognition

Page
108
Front Matter (R1-R10)
Dedication (1-4)
Voice Communication Between Humans and Machines--An Introduction (5-12)
Scientific Bases of Human-Machine Communication by Voice (13-14)
Scientific Bases of Human-Machine Communication by Voice (15-33)
The Role of Voice in Human-Machine Communication (34-75)
Speech Communication -- An Overview (76-104)
Speech Synthesis Technology (105-106)
Computer Speech Synthesis: Its Status and Prospects (107-115)
Models of Speech Synthesis (116-134)
Linguistic Aspects of Speech Synthesis (135-156)
Speech Recognition Technology (157-158)
Speech Recognition Technology: A Critique (159-164)
State of the Art in Continuous Speech Recognition (165-198)
Training and Search Methods for Speech Recognition (199-214)
Natural Language Understanding Technology (215-216)
The Roles of Language Processing in a Spoken Language Interface (217-237)
Models of Natural Language Understanding (238-253)
Integration of Speech with Natural Language Understanding (254-272)
Applications of Voice-Processing Technology I (273-274)
A Perspective on Early Commercial Applications of Voice-Processing Technology for Telecommunications and Aids for the Handicapped (275-279)
Applications of Voice-Processing Technology in Telecommunications (280-310)
Speech Processing for Physical and Sensory Disabilities (311-344)
Applications of Voice-Processing Technology II (345-346)
Commercial Applications of Speech Interface Technology: An Industry at the Threshold (347-356)
Military and Government Applications of Human-Machine Communication by Voice (357-370)
Technology Deployment (371-372)
Deployment of Human-Machine Dialogue Systems (373-389)
What Does Voice-Processing Technology Support Today? (390-421)
User Interfaces for Voice Applications (422-442)
Technology in 2001 (443-444)
Speech Technology in the Year 2001 (445-449)
Toward the Ultimate Synthesis/Recognition System (450-466)
Speech Technology in 2001: New Research Directions (467-481)
New Trends in Natural Language Processing: Statistical Natural Language Processing (482-504)
The Future of Voice-Processing Technology in the World of Computers and Communications (505-514)
Author Biographies (515-524)
Index (525-548)