Currently Skimming:

Applications of Voice-Processing Technology in Telecommunications
Pages 280-310

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 280...
... Two of these, automatic speech recognition and text-to-speech synthesis, will provide users with more freedom on when, where, and how they access information. While these technologies are currently in their infancy, their capabilities are rapidly increasing and their deployment in today's telephone network is expanding.
From page 281...
... In this paper I present a vision of the voice-processing industry with a focus on the areas with the broadest base of user penetration: speech recognition, text-to-speech synthesis, natural language processing, and speaker recognition technologies. The current and future applications of these technologies in the telecommunications industry will be examined in terms of their strengths, limitations, and the degree to which user needs have been or have yet to be met.
From page 282...
... Current applications using speech recognition and text-to-speech synthesis technologies center around two areas: those that provide cost reduction [e.g., AT&T and Bell Northern Research's (BNR) automation of some operator functions and NYNEX and BNR's attempt to automate portions of directory assistance]
From page 283...
... NYNEX's directory assistance call completion service, BNR's stock quotation service, and Nippon Telegraph & Telephone's (NTT) banking by phone service]
From page 284...
... In this paper, I present a vision of the voice-processing industry, with a focus on the areas with the broadest base of user penetration: speech recognition, text-to-speech synthesis, natural language processing, and speaker recognition technologies. Current and future applications of these technologies in the telecommunications indus
From page 285...
... We are just beginning to understand how to incorporate natural language processing into the speech recognition world so that the meaning of a user's speech can be extracted. This research is in its infancy and may require more than a decade of work before viable solutions can be found, developed, and deployed (Hirschman et al., 1992; Marcus, 1992; Proceedings of the DARPA Speech and Natural Language Workshop, 1993).
From page 286...
... THE ART OF SPEECH RECOGNITION AND SYNTHESIS Current speech recognition and text-to-speech synthesis practices encompass engineering art as well as scientific knowledge. Fundamental knowledge of speech and basic principles of pattern matching have been essential to the success of speech recognition over the past 25 years.
From page 287...
... "Barge in" provides a necessary, easy-to-use capability for customers and, as with wordspotting, is an essential technology for successful mass deployment of ASR into the telephone network (AT&T Conversant Systems, 1991). · Rejection.
From page 288...
... In general, proper names do not follow the same prescribed rules for pronunciation as do other words. However, one of the major applications for TTS technology is to say people's names (e.g., directory assistance applications)
From page 289...
... That said, there are some general questions that must be asked when considering an application using current speech technologies. The answers will help determine whether it is advisable or possible to design a quality application using speech technology.
From page 290...
... Figure 4 graphically shows the main application areas for speech recognition, speaker recognition, natural language processing, and text-to-speech synthesis currently considered industry-wide. The figure shows that most of the broad application areas center around speech recognition, such as for menu-based transactions or for information access.
From page 291...
... automation of operator services, currently being deployed by many telephone companies, including AT&T, Northern Telecom, Ameritech, and Bell Atlantic; (2) automation of directory assistance, currently being trialed by NYNEX and Northern Telecom; and (3)
From page 292...
... Rotary telephone replacement: small-vocabulary, wordspotting, barge-in. Automated access to enhanced telephone features: small-vocabulary, wordspotting, barge-in.
From page 293...
... After extensive field trials in Dallas, Seattle, and Jacksonville during 1991 and 1992, AT&T announced that it would begin deploying VRCP (Voice Recognition Call Processing)
From page 294...
... Please say yes if you accept the charges, no if you refuse the charges, or operator if you need assistance, now." "Thank you for using AT&T." "... collect call please." "Yes, I will." These trials were considered successful not just from a technological point of view but also because customers were willing to use the
From page 295...
... What differentiates the earlier BNR system from the AT&T system is the speech recognition technology. Analysis of the 1985 AT&T trials indicated that about 20 percent of user utterances contained not only the required command word but also extraneous sounds that ranged from background noise to groups of nonvocabulary words (e.g., "I want to make a collect call please").
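The problem described above, a required command word buried in extraneous words, can be illustrated with a toy sketch. This is only a text-level analogy, assuming the utterance has already been transcribed; real wordspotting works on the acoustic signal. The vocabulary and the function name `spot_keyword` are hypothetical and chosen for illustration.

```python
import re

# Hypothetical command vocabulary, for illustration only.
VOCABULARY = ("collect", "operator", "yes", "no")


def spot_keyword(transcript, vocabulary=VOCABULARY):
    """Return the single vocabulary word found in the transcript, or None.

    None stands for "reject": the transcript held either no command word
    or more than one distinct command word.
    """
    # Split the lowercased transcript into simple word tokens.
    words = re.findall(r"[a-z']+", transcript.lower())
    # Keep only the tokens that belong to the command vocabulary.
    hits = {word for word in words if word in vocabulary}
    return hits.pop() if len(hits) == 1 else None
```

With this sketch, `spot_keyword("I want to make a collect call please")` picks out `collect`, while an utterance with no command word, or with two competing ones, is rejected.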
From page 296...
... Voice Access to Information over the Telephone Network It has been over a decade since automatic speech recognition was first widely deployed in the telephone network. In 1981 NTT combined speech recognition and synthesis technologies in a telephone information system called Anser, the Automatic Answer Network System for Electrical Requests (Nakatsu, 1990).
From page 297...
... From a customer's standpoint, the cost of obtaining information about bank accounts is low (about the cost of a local telephone call)
From page 298...
... The targets of this application were conservatively set to accommodate the Spanish telephone network and its unfamiliar users. The speech recognizer deployed supports speaker-independent isolated word recognition with wordspotting and barge in of the Spanish key words uno, dos, and tres.
From page 299...
... People obtain information from computer databases by asking for what they want using their telephone, not by talking with another person or typing commands at a computer keyboard. As this technology develops, the voice response industry will expand to include voice access to information services such as weather, traffic reports, news, and sports scores.
From page 300...
... This is one example of the "people have easier access to one another" part of our vision. Obviously, current ASR technology is not advanced enough to handle such requests as, "Please get me the pizza place on the corner of 1st and Main, I think it's called Mom's or Tom's."
From page 301...
... The services available through VIP and the associated voice commands are as follows:

Service                                    Voice Command
Call Forwarding                            Call Forwarding
Continuous Redial                          Redial
Last Call Return                           Return Call
Call Rejection                             Call Rejection
Caller ID Blocking                         Block ID
Access to Messaging Services               Messages
Temporary Deactivation of Call Waiting     Cancel Call Waiting

Based on a series of customer trials, the following results were obtained: ... method.
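The service and voice-command pairs listed above can be sketched as a simple lookup table. This is a minimal illustration, not how VIP is implemented. It assumes the commands pair with the services in the order they are listed, and the names `VIP_COMMANDS` and `service_for` are hypothetical.

```python
# Voice command -> service, paired in listing order (an assumption).
VIP_COMMANDS = {
    "call forwarding": "Call Forwarding",
    "redial": "Continuous Redial",
    "return call": "Last Call Return",
    "call rejection": "Call Rejection",
    "block id": "Caller ID Blocking",
    "messages": "Access to Messaging Services",
    "cancel call waiting": "Temporary Deactivation of Call Waiting",
}


def service_for(command):
    """Return the service for a recognized command phrase, or None."""
    # Normalize case and collapse repeated whitespace before the lookup.
    normalized = " ".join(command.lower().split())
    return VIP_COMMANDS.get(normalized)
```

A phrase outside the command set returns `None`, which a caller could treat as a rejection and follow with a reprompt.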
From page 302...
... Reverse Directory Assistance Ameritech recently announced a service called Automated Customer Name and Address (ACNA)
From page 303...
... However, as stated earlier, ASR technology currently cannot support recognition of fluent spontaneous speech spoken by anyone on any topic.) The text is then transmitted to the hearing-impaired party's TDD unit.
From page 304...
... I have tried to counteract this tendency by carefully pointing out the limitations of current speech recognition and text-to-speech technologies while focusing on the types of applications that can be successfully deployed for mass user consumption given today's state of the art. Near-Term Technical Challenges While the prospect of having a machine that humans can converse with as fluently as they do with other humans remains the Holy Grail of speech technologists and one that we may not see realized for another generation or two, there are many critical technical problems that I believe we will see overcome in the next 2 to 5 years.
From page 305...
... While current wordspotting techniques do an excellent job of rejecting many of the out-of-vocabulary signals that are seen in today's applications, they are by no means perfect. Since AT&T announced that its wordspotting technology was available for small-vocabulary applications in its products and services beginning in 1991, many other ASR vendors have realized that the ability to distinguish key-word from non-key-word signals is mandatory if mass deployment and acceptance of ASR technology are to occur.
From page 306...
... Thus, speech technologies will become necessary if we are to easily communicate with our personal communicators. Within the next 2 to 3 years I expect to see some rudimentary speech recognition technology incorporated into PCDs.
From page 307...
... In contrast to the past two decades, in which advances were made in feature analysis and pattern comparison, the coming decade will be the period in which computational linguistics makes a definitive contribution to "natural" voice interactions. The first manifestations of these better language models will be in restricted-domain applications for which specific semantic information is available, for example, an airline reservation task (Hirschman et al., 1992; Marcus, 1992; Pallet, 1991; Proceedings of the DARPA Speech and Natural Language Workshop, 1993).
From page 308...
... C Schwab, Automated alternate billing services at Ameritech: Speech recognition performance and the human interface, Speech Tech.
From page 309...
... :35-41, August 1990. Lennig, M., Automated bilingual directory assistance trial in Bell Canada, in Proceedings of the 1st IEEE Workshop on Interactive Voice Technology for Telecommunications Applications, Piscataway, N.J., October 1992.
From page 310...
... R Mikkilineni, Isolated word recognition over the DDD telephone network-Results of two extensive field studies, in Proceedings of the IEEE ICASSP '88, pp.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.