Context-Sensitive Evaluation and Correction of Phone Recognition Output

Proc. of Eurospeech

Published by ISCA - International Speech Communication Association

In speech and language processing, information about the errors made by a learning system is commonly used to assess and improve its performance. Because of high computational complexity, the context of the errors is usually either ignored or exploited only in a simplistic form. The complexity becomes tractable, however, for phone recognition because of the small lexicon: for phone-based systems, an exhaustive modeling of local context is possible. Furthermore, recent studies have shown phone recognition to be useful for several spoken language processing tasks. In this paper, we present a mechanism that learns patterns of context-sensitive errors from ASR output aligned with the “true” phone transcriptions. We also show how this information, encoded as a context-sensitive weighted transducer, can provide a modest improvement in phone recognition accuracy even when no transcriptions are available for the domain of interest.
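As a rough illustration of what learning context-sensitive error patterns from aligned output can look like, the sketch below counts how often each recognized phone, taken together with its left and right recognized neighbors, corresponds to each true phone, and converts the counts to negative log probabilities. It is not the paper's mechanism: the function names (`context_confusions`, `to_weights`), the one-phone context window, the `#` boundary symbol, and the assumption that the two sequences are already aligned position by position are all hypothetical choices made for the example, and a plain dictionary stands in for the weighted transducer.

```python
from collections import Counter, defaultdict
import math


def context_confusions(pairs, pad="#"):
    """Count true phones per (left, observed, right) recognized context.

    `pairs` is a list of (hyp, ref) phone sequences that are assumed to be
    aligned one-to-one already (equal length). `pad` marks utterance edges.
    """
    counts = defaultdict(Counter)
    for hyp, ref in pairs:
        if len(hyp) != len(ref):
            raise ValueError("sequences must be aligned to equal length")
        padded = [pad] + list(hyp) + [pad]
        for i, true_phone in enumerate(ref):
            # Context: left neighbor, observed phone, right neighbor.
            context = (padded[i], padded[i + 1], padded[i + 2])
            counts[context][true_phone] += 1
    return counts


def to_weights(counts):
    """Turn counts into negative log relative frequencies per context."""
    weights = {}
    for context, true_counts in counts.items():
        total = sum(true_counts.values())
        weights[context] = {
            phone: -math.log(n / total) for phone, n in true_counts.items()
        }
    return weights


# Toy data (invented for illustration): recognized vs. true phones.
pairs = [
    (["k", "ae", "t"], ["k", "ae", "t"]),
    (["k", "eh", "t"], ["k", "ae", "t"]),  # "eh" recognized where "ae" was spoken
    (["k", "eh", "t"], ["k", "eh", "t"]),
]
weights = to_weights(context_confusions(pairs))
```

In this toy data, the recognized context `("k", "eh", "t")` is split evenly between true `ae` and true `eh`, so both receive the same weight. A correction step could use such weights to rescore recognizer output, though how the paper's transducer does so is not specified in this abstract.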