The degree of difficulty of a human-machine voice dialogue system can be used to evaluate its feasibility. The degree of difficulty of a particular application depends on many factors. Some are obvious, but others are easy to overlook. For example, the expertise of the users has a dramatic effect on the performance of these systems. Also, users' willingness to overlook deficiencies in the system varies widely depending on whether alternatives are available. A comprehensive view of all the dimensions of difficulty is needed to assess the overall degree of difficulty.
Deployment of voice transaction services is an iterative process. Because the machine must cope with errors made by the person, and the person must cope with errors made by the machine, the nature of the transaction is difficult, if not impossible, to predict in advance. Although the ultimate goal is to create a machine that can adapt to the transaction as it gains experience, today's human-machine dialogue systems require engineering art as well as scientific principles.