Abstract
This paper addresses the problem of generating lexical word representations that properly capture natural pronunciation variation, with the aim of improving speech recognition accuracy. To create a consistent framework for the optimisation of automatic speech recognition systems, we present a maximum likelihood based algorithm for fully automatic, data-driven modelling of pronunciation. We also propose an extension of this formulation to achieve optimal modelling of pronunciation variations. Since different words do not in general exhibit the same amount of pronunciation variation, the procedure allows each word to be represented by a different number of baseforms. The methods improve the sub-word description of the vocabulary words and have been shown to improve recognition performance on the DARPA Resource Management task.