Layer Trajectory BLSTM

Interspeech

Organized by ISCA

Recently, we proposed the layer trajectory (LT) LSTM (ltLSTM), which significantly outperforms LSTM by decoupling the functions of senone classification and temporal modeling into separate depth and time LSTMs. We further improved ltLSTM with the contextual layer trajectory LSTM (cltLSTM), which uses future context frames to predict target labels. Given that the bidirectional LSTM (BLSTM) also uses future context frames to improve its modeling power, in this study we first compare the performance of these two models, cltLSTM and BLSTM. We then apply the layer trajectory idea to further improve BLSTM models: the BLSTM is in charge of modeling the temporal information, while a depth-LSTM takes care of senone classification. In addition, we compare model performance across different LT component designs on BLSTM models. Trained with 30 thousand hours of EN-US Microsoft internal data, the proposed layer trajectory BLSTM (ltBLSTM) model improves on the baseline BLSTM by up to 14.5% relative word error rate (WER) reduction across different tasks.
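For readers unfamiliar with the metric, the relative WER reduction quoted above is conventionally computed as the drop in WER divided by the baseline WER. The following is a minimal sketch of that arithmetic; the function name `relative_wer_reduction` and the two WER values are hypothetical, chosen only for illustration, and are not results reported in the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction of new_wer versus baseline_wer, in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs (in percent), picked only to illustrate the formula.
# They are NOT numbers from the paper.
baseline = 10.0   # assumed baseline BLSTM WER
improved = 8.55   # assumed ltBLSTM WER
reduction = relative_wer_reduction(baseline, improved)
print(f"{reduction:.1f}% relative WER reduction")
```

Note that a relative reduction differs from an absolute one: in this illustrative case the absolute WER drop is 1.45 points, while the relative reduction is measured against the baseline.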