DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization

ACL 2022

Transformer-based models have achieved state-of-the-art performance on short-text summarization, but they still struggle with long inputs. In this paper, we present a new approach for long-input summarization: Dynamic Latent Extraction for Abstractive Summarization (DYLE). We jointly train an extractor with an abstractor and treat the extracted text snippets as the latent variable. We propose extractive oracles to provide the extractor with a strong learning signal. We also introduce a consistency loss, which encourages the extractor to approximate the averaged dynamic weights predicted by the generator. We conduct extensive experiments on two long-input summarization datasets, GovReport (document) and QMSum (dialogue). Our model significantly outperforms the current state-of-the-art, including a 6.21 ROUGE-2 improvement on GovReport and a 2.13 ROUGE-1 improvement on QMSum. Further analysis shows that the dynamic weights make our generation process highly interpretable. Our code will be publicly available upon publication.
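To make the consistency loss concrete, below is a minimal PyTorch sketch of one plausible formulation, assuming the generator emits a normalized distribution (dynamic weights) over the extracted snippets at each decoding step. The function name, tensor shapes, and the specific choice of a KL divergence are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def consistency_loss(extractor_scores: torch.Tensor,
                     dynamic_weights: torch.Tensor) -> torch.Tensor:
    """Push the extractor's snippet distribution toward the generator's
    dynamic weights, averaged over decoding steps (illustrative sketch).

    extractor_scores: (num_snippets,) unnormalized relevance scores
                      produced by the extractor.
    dynamic_weights:  (num_steps, num_snippets) generator weights over
                      snippets at each decoding step; each row sums to 1.
    """
    # Average the generator's per-step weights into a single target
    # distribution over the extracted snippets.
    target = dynamic_weights.mean(dim=0)                 # (num_snippets,)
    # F.kl_div expects log-probabilities as its first argument.
    log_probs = F.log_softmax(extractor_scores, dim=-1)
    return F.kl_div(log_probs, target, reduction="sum")

# Toy usage: 4 extracted snippets, 3 decoding steps.
scores = torch.randn(4)
weights = torch.softmax(torch.randn(3, 4), dim=-1)
loss = consistency_loss(scores, weights)  # scalar, differentiable w.r.t. scores
```

Averaging the per-step weights before comparing distributions gives the extractor a single, stable target per document, rather than a separate target at every decoding step.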