A Comparative Study of Neural Network Models for Lexical Intent Classification

  • Suman Ravuri,
  • Andreas Stolcke

Proc. IEEE Automatic Speech Recognition and Understanding Workshop

Published by IEEE - Institute of Electrical and Electronics Engineers

Domain and intent classification are critical pre-processing steps for many speech understanding and dialog systems, as they allow certain types of utterances to be routed to particular subsystems. In previous work, we explored many types of neural network (NN) architectures—some feedforward and some recurrent—for lexical intent classification and found that they improved upon more traditional statistical baselines. In this paper we carry out a more comprehensive comparison of NN models, including the recently proposed gated recurrent unit network, on two domain/intent classification tasks. Furthermore, whereas the previous work was confined to relatively small and controlled data sets, we now include experiments based on a large data set obtained from the Cortana personal assistant application.

We compare feedforward, recurrent, and gated networks (such as LSTM and GRU) against each other. On both the ATIS intent task and the much larger Cortana domain classification task, gated networks outperform recurrent models, which in turn outperform feedforward networks. We also compare standard word vector models against a representation that encodes words as sets of character n-grams to mitigate the out-of-vocabulary problem, and find that in nearly all cases the standard word vectors outperform character-based word representations. The best results are obtained by linearly combining scores from NN models with log likelihood ratios obtained from N-gram language models.
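Two of the techniques mentioned in the abstract can be sketched concisely: representing a word as a set of character n-grams so that out-of-vocabulary words still share features with known words, and linearly combining an NN classifier score with an N-gram language-model log-likelihood ratio. The sketch below is only illustrative; the function names, boundary markers, n-gram sizes, and interpolation weight are our assumptions, not details taken from the paper.

```python
# Illustrative sketch only: the n-gram sizes (2-4), boundary markers,
# and equal interpolation weight are assumptions for illustration,
# not settings reported in the paper.

def char_ngram_features(word, n_min=2, n_max=4):
    """Encode a word as a set of character n-grams (with boundary markers),
    so unseen words still overlap with in-vocabulary words."""
    padded = f"<{word}>"
    return {
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    }


def combine_scores(nn_score, lm_llr, weight=0.5):
    """Linearly interpolate a neural-network classifier score with an
    N-gram language-model log-likelihood ratio for the same class."""
    return weight * nn_score + (1.0 - weight) * lm_llr


if __name__ == "__main__":
    print(char_ngram_features("flights"))            # character n-gram set
    print(combine_scores(nn_score=1.7, lm_llr=0.9))  # combined class score
```

In practice the interpolation weight would be tuned on held-out data, and the combination would be applied per class before picking the highest-scoring domain or intent.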