OPTIMUS: Organizing Sentences via Pre-trained Modeling of a Latent Space

Chunyuan Li; Xiang Gao; Yuan Li; Xiujun Li; Baolin Peng; Yizhe Zhang; Jianfeng Gao

April 2020

When trained effectively, the Variational Autoencoder (VAE) (Kingma and Welling, 2013; Bowman et al., 2016) can be both a powerful generative model and an effective representation learning framework for natural language. In this paper, we propose OPTIMUS, the first large-scale language VAE model. A universal latent embedding space for sentences is first pre-trained on a large text corpus and then fine-tuned for various language generation and understanding tasks. Compared with GPT-2, OPTIMUS enables guided language generation from an abstract level using the latent vectors. Compared with BERT, OPTIMUS can generalize better on low-resource language understanding tasks due to the smooth latent space structure. Extensive experimental results on a wide range of language tasks demonstrate the effectiveness of OPTIMUS. It achieves a new state of the art on VAE language modeling benchmarks.
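
For readers unfamiliar with the VAE framework the abstract builds on, the following is a minimal sketch of the standard VAE quantities for a diagonal Gaussian posterior: the reparameterized sample, the KL term against a standard normal prior, and the resulting negative ELBO. It also shows linear interpolation between two latent vectors as one simple way to picture a smooth latent space. This is not the OPTIMUS implementation. All function names are hypothetical, the `beta` weight on the KL term is an assumption that the abstract does not mention, and `recon_nll` stands in for whatever reconstruction loss a decoder would produce.

```python
import math
import random


def kl_diag_gaussian(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one eps per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def negative_elbo(recon_nll, mu, logvar, beta=1.0):
    """Reconstruction loss plus a (hypothetically beta-weighted) KL regularizer."""
    return recon_nll + beta * kl_diag_gaussian(mu, logvar)


def interpolate(z_a, z_b, t):
    """Linear interpolation between two latent vectors, with t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


if __name__ == "__main__":
    rng = random.Random(0)
    mu, logvar = [0.5, -0.2], [0.0, -1.0]   # toy posterior parameters
    z = reparameterize(mu, logvar, rng)      # latent sample for a "sentence"
    loss = negative_elbo(recon_nll=3.0, mu=mu, logvar=logvar)
    midpoint = interpolate(z, [0.0, 0.0], 0.5)
    print(len(z), loss > 3.0, len(midpoint))
```

In a full model, the encoder would produce `mu` and `logvar` for each sentence and the decoder would be conditioned on the sampled latent vector; both networks are outside the scope of this sketch.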

Publication Downloads

Optimus

April 7, 2020

Optimus: the first large-scale pre-trained VAE language model
