Is Word Error Rate a Good Indicator for Spoken Language Understanding Accuracy

  • Ye-Yi Wang
  • Alex Acero
  • Ciprian Chelba

IEEE Workshop on Automatic Speech Recognition and Understanding

Published by Institute of Electrical and Electronics Engineers, Inc.

It is conventional wisdom in the speech community that, given a fixed understanding component, better speech recognition accuracy is a good indicator of better spoken language understanding accuracy. The findings in this work reveal that this is not always the case. More important than reducing word error rate is training the recognition language model to match the optimization objective for understanding. In this work, we applied a spoken language understanding model as the language model in speech recognition. The model was obtained with an example-based learning algorithm that optimized understanding accuracy. Although the resulting speech recognition word error rate is 46% higher than that of the trigram model, the overall slot understanding error can be reduced by as much as 17%.
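
To make the distinction between the two metrics concrete, the following is a minimal sketch of how word error rate and a slot error rate can move in opposite directions. Everything in it is an illustrative assumption rather than material from the paper: the utterances are invented, `extract_slots` is a hypothetical one-pattern stand-in for an understanding component, and `slot_error_rate` uses one simple definition (wrong or missing slots plus spurious slots, divided by the number of reference slots) that may differ from the metric the authors used.

```python
import re


def edit_distance(ref_tokens, hyp_tokens):
    """Levenshtein distance between two token lists (unit costs)."""
    prev = list(range(len(hyp_tokens) + 1))
    for i, r in enumerate(ref_tokens, start=1):
        curr = [i]
        for j, h in enumerate(hyp_tokens, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref_tokens = reference.split()
    return edit_distance(ref_tokens, hypothesis.split()) / len(ref_tokens)


def extract_slots(text):
    """Hypothetical toy understanding component: one 'from X to Y' pattern."""
    match = re.search(r"from (\w+) to (\w+)", text)
    if match is None:
        return {}
    return {"origin": match.group(1), "destination": match.group(2)}


def slot_error_rate(ref_slots, hyp_slots):
    """Assumed definition: (wrong or missing + spurious slots) / reference slots."""
    wrong_or_missing = sum(
        1 for name, value in ref_slots.items() if hyp_slots.get(name) != value
    )
    spurious = sum(1 for name in hyp_slots if name not in ref_slots)
    return (wrong_or_missing + spurious) / len(ref_slots)


# Invented example utterances, not data from the paper.
reference = "i would like to fly from seattle to boston"
hyp_a = "i would like to fly from seattle to austin"  # one content word wrong
hyp_b = "like fly from seattle to boston"             # three function words dropped

ref_slots = extract_slots(reference)
wer_a = word_error_rate(reference, hyp_a)
wer_b = word_error_rate(reference, hyp_b)
slot_err_a = slot_error_rate(ref_slots, extract_slots(hyp_a))
slot_err_b = slot_error_rate(ref_slots, extract_slots(hyp_b))

print(f"hypothesis A: WER={wer_a:.3f}, slot error={slot_err_a:.3f}")
print(f"hypothesis B: WER={wer_b:.3f}, slot error={slot_err_b:.3f}")
```

In this toy case hypothesis B has the higher word error rate because it drops three words, yet it preserves both slot values, whereas hypothesis A makes a single word error that falls on a slot value. The sketch only shows that the two metrics can disagree; it says nothing about how often they do in practice.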