
fraction of states of the trellis constructed directly for the trigram language model (cf. Figure 8).

Consequently, to find the word sequence W, we conduct two beam searches. The first, on the bigram HMM, yields word presence time intervals. These intervals give rise to a trigram HMM, over which the second beam search is conducted to obtain the final W.
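A single beam-pruned pass of the kind used in each of the two searches can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in this chapter: the names `beam_viterbi` and `prune`, the two-state toy model, and the beam widths are all hypothetical. At each time step the search keeps only those states whose log score lies within `beam` of the best score at that step.

```python
import math

def prune(active, beam):
    """Keep only entries whose log score is within `beam` of the best one."""
    if not active:
        return active
    best = max(score for score, _ in active.values())
    return {s: v for s, v in active.items() if v[0] >= best - beam}

def beam_viterbi(obs, states, log_init, log_trans, log_emit, beam):
    """Beam-pruned Viterbi search over an HMM, with scores in the log domain.

    Returns (best state path, its log score), or (None, -inf) if every
    hypothesis has been pruned or has zero probability.
    """
    # active maps state -> (log score, best path found so far into that state)
    active = {}
    for s in states:
        score = log_init.get(s, -math.inf) + log_emit[s].get(obs[0], -math.inf)
        if score > -math.inf:
            active[s] = (score, [s])
    active = prune(active, beam)

    for o in obs[1:]:
        nxt = {}
        for s, (score, path) in active.items():
            for t, lp in log_trans.get(s, {}).items():
                cand = score + lp + log_emit[t].get(o, -math.inf)
                if cand > -math.inf and (t not in nxt or cand > nxt[t][0]):
                    nxt[t] = (cand, path + [t])
        active = prune(nxt, beam)

    if not active:
        return None, -math.inf
    best_score, best_path = max(active.values(), key=lambda v: v[0])
    return best_path, best_score

# Hypothetical two-state toy model (all probabilities are made up).
STATES = ["A", "B"]
LOG_INIT = {"A": math.log(0.6), "B": math.log(0.4)}
LOG_TRANS = {
    "A": {"A": math.log(0.7), "B": math.log(0.3)},
    "B": {"A": math.log(0.4), "B": math.log(0.6)},
}
LOG_EMIT = {
    "A": {"x": math.log(0.9), "y": math.log(0.1)},
    "B": {"x": math.log(0.2), "y": math.log(0.8)},
}
OBS = ["x", "y", "y"]

# An infinite beam reduces to ordinary Viterbi; a narrow beam prunes states.
full_path, full_score = beam_viterbi(OBS, STATES, LOG_INIT, LOG_TRANS, LOG_EMIT, math.inf)
narrow_path, narrow_score = beam_viterbi(OBS, STATES, LOG_INIT, LOG_TRANS, LOG_EMIT, 1.0)
```

In the two-pass scheme of the text, a routine of this kind would be run first over the bigram HMM and then over the much smaller trigram HMM built from the surviving word intervals. The sketch covers only the pruning step that both passes share, and a beam that is too narrow can discard the path that an unpruned search would have found.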

REFERENCES

Bahl, L. R., F. Jelinek, and R. L. Mercer, "A maximum likelihood approach to continuous speech recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-5, pp. 179-190, March 1983.

Bahl, L., P. Brown, P. de Souza, R. Mercer, and M. Picheny, "Acoustic Markov Models used in the Tangora speech recognition system," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, New York, April 1988.

Bahl, L. R., P. S. Gopalakrishnan, and R. L. Mercer, "Search Issues in Large Vocabulary Speech Recognition," Proceedings of the IEEE Workshop on Automatic Speech Recognition, Snowbird, Utah, 1993.

Baum, L., "An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process," Inequalities, vol. 3, pp. 1-8, 1972.

Jelinek, F., "A fast sequential decoding algorithm using a stack," IBM Journal of Research and Development, vol. 13, pp. 675-685, Nov. 1969.

Lowerre, B. T., "The Harpy Speech Recognition System," Ph.D. Dissertation, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, Pa., 1976.

Murveit, H., J. Butzberger, V. Digalakis, and M. Weintraub, "Large-Vocabulary Dictation Using SRI's Decipher Speech Recognition System: Progressive Search Techniques," Spoken Language Systems Technology Workshop, Massachusetts Institute of Technology, Cambridge, Mass., January 1993.

Paul, D. B., "An Essential A* Stack Decoder Algorithm for Continuous Speech Recognition with a Stochastic Language Model," 1992 International Conference on Acoustics, Speech, and Signal Processing, San Francisco, March 1992.

Rabiner, L. R., and B. H. Juang, Fundamentals of Speech Recognition, Prentice-Hall, Englewood Cliffs, N.J., 1993, pp. 378-384.

Viterbi, A. J., "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," IEEE Transactions on Information Theory, vol. IT-13, pp. 260-267, 1967.



Copyright © National Academy of Sciences. All rights reserved.