of the HMM approach. The discriminant function approach achieves higher performance by using a criterion that directly minimizes misclassification errors. In speech synthesis, articulatory models and automatic methods for determining their parameters offer the best hope of providing the flexibility and naturalness needed to synthesize a wide range of speech materials.
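To make the idea of a criterion that targets misclassification errors directly more concrete, the following is a minimal sketch of a smoothed error-count loss in the general style of minimum error classification training. It is an illustration under stated assumptions, not the formulation used in this chapter: the function name `mce_loss`, the smoothing parameters `eta` and `gamma`, and the example scores are all hypothetical.

```python
import math

def mce_loss(scores, true_idx, eta=2.0, gamma=1.0):
    """Smoothed classification-error loss for a single sample.

    scores   : discriminant function values, one per class (hypothetical inputs)
    true_idx : index of the correct class
    eta      : assumed smoothing constant; larger values make the rival
               term approach the best competing score
    gamma    : assumed slope of the sigmoid that approximates the 0/1 error
    """
    g_true = scores[true_idx]
    rivals = [g for j, g in enumerate(scores) if j != true_idx]

    # Soft maximum over the competing classes, computed stably by
    # factoring out the largest rival score before exponentiating.
    m = max(rivals)
    mean_exp = sum(math.exp(eta * (g - m)) for g in rivals) / len(rivals)
    rival_score = m + math.log(mean_exp) / eta

    # Misclassification measure: positive when rivals outscore the true class.
    d = rival_score - g_true

    # Sigmoid turns the measure into a differentiable stand-in for the error count.
    return 1.0 / (1.0 + math.exp(-gamma * d))


# Usage: a confidently correct sample gives a loss below 0.5,
# a confidently wrong one gives a loss above 0.5.
correct = mce_loss([5.0, 0.0, 0.0], true_idx=0)
wrong = mce_loss([0.0, 5.0, 0.0], true_idx=0)
```

Because the loss is a smooth function of the discriminant scores, its average over a training set can be minimized by gradient methods, which is what distinguishes this kind of criterion from one that fits each class model separately.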


Acero, A., and R. M. Stern, "Environmental robustness in automatic speech recognition," Proc. ICASSP-90, pp. 849-852, Albuquerque, NM, 1990.

Atal, B. S., "Efficient coding of LPC parameters by temporal decomposition," Proceedings of the International Conference IEEE ASSP, Boston, pp. 81-84, 1983.

Atal, B. S., "From speech to sounds: Coping with acoustic variabilities," Towards Robustness in Speech Recognition, Wayne A. Lea (ed.), pp. 209-220, Speech Science Publications, Apple Valley, Minn., 1989.

Cheng, Y. M., and D. O'Shaughnessy, "Short-term temporal decomposition and its properties for speech compression," IEEE Trans. Signal Process., vol. 39, pp. 1282-1290, 1991.

Cheng, Y. M., and D. O'Shaughnessy, "On 450-600 b/s natural sounding speech coding," IEEE Trans. Speech Audio Process., vol. 1, pp. 207-220, 1993.

Daubechies, I., "The wavelet transform, time-frequency localization and signal analysis," IEEE Trans. Inf. Theory, vol. 36, pp. 961-1005, Sept. 1990.

Dautrich, B. A., L. R. Rabiner, and T. B. Martin, "On the effects of varying filter bank parameters on isolated word recognition," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-31, pp. 793-806, Aug. 1983.

Furui, S., "On the role of spectral transitions for speech perception," J. Acoust. Soc. Am., vol. 80, pp. 1016-1025, Oct. 1986.

Ghitza, O., "Auditory nerve representation as a basis for speech processing," Advances in Speech Signal Processing, S. Furui and M. M. Sondhi (eds.), pp. 453-485, Marcel Dekker, New York, 1992.

Herley, C., et al., "Time-varying orthonormal tilings of the time-frequency plane," IEEE Trans. Signal Process., Dec. 1993.

Hlawatsch, F., and G. F. Boudreaux-Bartels, "Linear and quadratic time-frequency signal representations," IEEE Signal Process. Mag., pp. 21-67, Apr. 1992.

Juang, B. H., "Speech recognition in adverse environments," Comput. Speech Lang., vol. 5, pp. 275-294, 1991.

Juang, B. H., and S. Katagiri, "Discriminative learning for minimum error classification," IEEE Trans. Signal Process., vol. 40, pp. 3043-3054, Dec. 1992.

Juang, B. H., and L. R. Rabiner, "Hidden Markov models for speech recognition," Technometrics, vol. 33, pp. 251-272, Aug. 1991.

Miller, G. A., G. A. Heise, and W. Lichten, "The intelligibility of speech as a function of the context of the test materials," J. Exp. Psychol., vol. 41, pp. 329-335, 1951.

Rabiner, L. R., and B. H. Juang, Fundamentals of Speech Recognition, Prentice-Hall, Englewood Cliffs, N.J., 1993.

Rioul, O., and M. Vetterli, "Wavelets and signal processing," IEEE Signal Process. Mag., pp. 14-38, Oct. 1991.

Schroeter, J., and M. M. Sondhi, "Speech coding based on physiological models of speech production," Advances in Speech Signal Processing, S. Furui and M. M. Sondhi (eds.), pp. 231-267, Marcel Dekker, New York, 1992.
