EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation

EMNLP 2022

We propose EdgeFormer, a parameter-efficient encoder-decoder Transformer for on-device seq2seq generation, designed under strict computation and memory constraints. EdgeFormer follows two novel principles for cost-effective parameterization and is further enhanced with efficient layer adaptation. We conduct extensive experiments on two practical on-device seq2seq tasks, machine translation and grammatical error correction, and show that EdgeFormer outperforms previous parameter-efficient Transformer baselines and achieves very competitive results with knowledge distillation under both the computation and memory constraints. Moreover, we release the pretrained EdgeFormer, the first publicly available pretrained model that can be easily fine-tuned for English seq2seq tasks with strong results, which largely facilitates on-device seq2seq generation in practice.