
Training and Search Methods for Speech Recognition
Pages 199-214

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 199...
... The Baum algorithm that obtains the values of these parameters from speech data via their successive reestimation will be described in this paper. The recognizer wishes to find the most probable utterance that could have caused the observed acoustic index string.
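The decision rule being described is the standard Bayes formulation of the recognition problem; stated here for reference, with A the observed acoustic string and W a candidate word sequence:

\[
\hat{W} \;=\; \arg\max_{W} P(W \mid A) \;=\; \arg\max_{W} P(A \mid W)\,P(W),
\]

where P(A) is dropped because it does not depend on W; P(A | W) is supplied by the hidden Markov models and P(W) by the language model.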
From page 200...
... The model of speech production of a sequence of words W is a concatenation of models of the individual words w_i making up the sequence W (see Figure 2). We recall that the HMM of Figure 1 starts its operation in the initial state s_0 and ends it when the final state s_F is reached.
From page 201...
... Essentially without loss of generality we will consider only a finite alphabet Φ of equivalence classes, so that Φ(w_1, w_2, ..., w_{i-1}) = φ_i ∈ Φ. A popular classifier example is (the bigram model):
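The classifier formula itself is cut off in this excerpt. The standard bigram classification it names keeps only the last word of the history:

\[
\Phi(w_1, w_2, \ldots, w_{i-1}) = w_{i-1},
\qquad\text{so that}\qquad
P(w_i \mid w_1, \ldots, w_{i-1}) \approx P(w_i \mid w_{i-1}).
\]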
From page 202...
... There are three states and six transitions, t_3 being a null transition. The transition probabilities satisfy constraints
FIGURE 3 Sample three-state hidden Markov model.
From page 203...
... The states of successive stages are connected by the nonnull transitions of the HMM because they take a unit of time to complete. The states within a stage are connected by the null transitions because these are accomplished instantaneously.
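This split is what makes a stage-by-stage evaluation of the trellis possible. The following is a minimal sketch, not the paper's code, of advancing one trellis stage under that convention; the (source, target) transition format and the probability tables p_trans and p_out are assumptions made for illustration:

    def advance_stage(prev_stage, symbol, nonnull, null, p_trans, p_out):
        """prev_stage maps state -> probability at the previous stage.
        nonnull and null are lists of (source, target) transitions;
        p_trans[(s, s2)] is a transition probability and
        p_out[(s, s2)][symbol] an output probability (assumed layout)."""
        stage = {}
        # Nonnull transitions cross from the previous stage and emit `symbol`.
        for (s, s2) in nonnull:
            if s in prev_stage:
                stage[s2] = stage.get(s2, 0.0) + \
                    prev_stage[s] * p_trans[(s, s2)] * p_out[(s, s2)][symbol]
        # Null transitions propagate within the current stage without consuming
        # an output (assumes they are listed in topological order, no null cycles).
        for (s, s2) in null:
            if s in stage:
                stage[s2] = stage.get(s2, 0.0) + stage[s] * p_trans[(s, s2)]
        return stage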
From page 204...
... Then it would seem intuitively reasonable to define the "counts" by (t' is restricted to nonnull transitions)
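The defining formula is cut off in this excerpt. The standard Baum-Welch expected-count expression it presumably parallels has the form below; the notation (forward probabilities α_i, backward probabilities β_i, and L(t'), R(t') for the source and target states of transition t') is assumed here rather than quoted from the paper:

\[
c(t') \;=\; \frac{1}{P(A)} \sum_{i} \alpha_i\big(L(t')\big)\, p(t')\, p\big(a_{i+1} \mid t'\big)\, \beta_{i+1}\big(R(t')\big),
\qquad t' \text{ nonnull}.
\]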
From page 205...
... , the counter of exactly one of the nonnull transitions is increased by 1 for each output, this same contribution is simply distributed by (8') among the various counters belonging to nonnull transitions.
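Once such counts have been accumulated, reestimation reduces to normalization: each transition probability becomes its count divided by the total count of transitions leaving the same state, and each output probability becomes the count of the (symbol, transition) pair divided by the transition's count. A minimal sketch, with an assumed data layout rather than the paper's equations:

    from collections import defaultdict

    def reestimate(trans_count, out_count, source_of):
        """trans_count[t]: expected count of transition t;
        out_count[(a, t)]: expected count of symbol a produced on t;
        source_of[t]: the state that transition t leaves."""
        leaving = defaultdict(float)
        for t, c in trans_count.items():
            leaving[source_of[t]] += c
        # New transition probabilities: normalize over transitions leaving each state.
        p_trans = {t: c / leaving[source_of[t]]
                   for t, c in trans_count.items() if leaving[source_of[t]] > 0}
        # New output probabilities: normalize over outputs of each nonnull transition.
        p_out = {(a, t): c / trans_count[t]
                 for (a, t), c in out_count.items() if trans_count.get(t, 0) > 0}
        return p_trans, p_out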
From page 206...
... and M(s) are the sets of null and nonnull transitions leaving s, respectively:
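The formula that followed is not recoverable from this excerpt. For orientation only, a standard backward recursion that separates the two sets of transitions leaving a state s has the form below, writing the null set as N(s) (the excerpt cuts off its actual name) and R(t) for the destination state of transition t; this is an assumption about the intended formula, not a quotation:

\[
\beta_i(s) \;=\; \sum_{t \in N(s)} p(t)\,\beta_i\big(R(t)\big)
\;+\; \sum_{t \in M(s)} p(t)\, p\big(a_{i+1} \mid t\big)\, \beta_{i+1}\big(R(t)\big).
\]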
From page 207...
... As long as the text HMM contains a sufficient number of each of the elementary HMMs, the resulting estimation of the p(t) and p(a | t) parameters will be successful, and the HMMs
FIGURE 5 Elementary hidden Markov model for the fenonic case.
From page 208...
... Consider the following basic problem: Given a fully specified HMM, find the sequence of transitions that were the most likely "cause" of given observed data A. To solve the problem, we again use the trellis representation of the HMM output process (see Figure 4).
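As a concrete illustration of that search, here is a minimal Viterbi-style stage update under the same assumed data layout as the earlier stage-advance sketch: at each state only the best-scoring incoming path is kept, together with the predecessor state that produced it (null transitions are omitted for brevity; this is not the paper's code):

    def viterbi_stage(prev_best, symbol, nonnull, p_trans, p_out):
        """prev_best maps state -> (score, predecessor) at the previous stage;
        returns the same structure for the current stage."""
        best = {}
        for (s, s2) in nonnull:
            if s in prev_best:
                score = prev_best[s][0] * p_trans[(s, s2)] * p_out[(s, s2)][symbol]
                if s2 not in best or score > best[s2][0]:
                    best[s2] = (score, s)   # remember the winning predecessor
        return best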
From page 209...
... This statement must be interpreted literally: we do not care what happens inside the HMM boxes. The slightly more complicated bigram language model [see Eq.
From page 210...
... FIGURE 6 Schematic structure of the hidden Markov model for a unigram language model.
From page 211...
... FIGURE 7 Schematic structure of the hidden Markov model for a bigram language model.
From page 212...
... trellis (see the third section of this paper) corresponding to Figure 7.
From page 213...
... The trace-back (see fourth section) from any of these final states, say, of word v at stage j, will lead to the initial state of the same word model, say at stage i.
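A minimal sketch of such a trace-back, assuming each trellis stage stores, for every surviving state, its score and the predecessor state chosen by the search (the layout produced by the stage update sketched earlier):

    def trace_back(stages, final_state):
        """stages: list of dicts, one per trellis stage, mapping
        state -> (score, predecessor)."""
        path = [final_state]
        state = final_state
        # Walk from the last stage back to the first, following predecessors.
        for stage in reversed(stages[1:]):
            state = stage[state][1]
            path.append(state)
        path.reverse()                      # earliest stage first
        return path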
From page 214...
... Paul, D. B., "An Efficient A* Stack Decoder Algorithm for Continuous Speech Recognition with a Stochastic Language Model," 1992 International Conference on Acoustics, Speech, and Signal Processing, San Francisco, March 1992.

