Transformer-XH: Multi-evidence Reasoning with Extra Hop Attention

  • Chen Zhao,
  • Chenyan Xiong,
  • Xia Song,
  • Paul Bennett,
  • Saurabh Tiwary

The Eighth International Conference on Learning Representations (ICLR 2020)


Transformers have achieved significant success modeling natural language as a sequence of text tokens. However, in many real-world scenarios, textual data inherently exhibits structures beyond a linear sequence, such as trees and graphs, and many tasks require reasoning with evidence scattered across multiple pieces of text. This paper presents Transformer-XH, which uses eXtra Hop attention to enable the intrinsic modeling of structured texts in a fully data-driven way. Its new attention mechanism naturally “hops” across connected text sequences in addition to attending over tokens within each sequence. Transformer-XH thus better conducts multi-evidence reasoning by propagating information between multiple documents, constructing global contextualized representations, and jointly reasoning over multiple pieces of evidence. On multi-hop question answering, Transformer-XH leads to a simpler multi-hop QA system that outperforms the previous state of the art in the HotpotQA FullWiki setting. On FEVER fact verification, applying Transformer-XH provides state-of-the-art accuracy and excels on claims whose verification requires multiple pieces of evidence.
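The abstract describes the eXtra Hop attention only at a high level. The following is a minimal, illustrative PyTorch sketch of that idea, not the authors' released implementation (see the download below). The class name ExtraHopLayer, the choice of the first token of each sequence as the "hub" representation that hops, and the dense adjacency-matrix format are all assumptions made for this example.

```python
# A minimal sketch of eXtra Hop attention: each text sequence is a graph node;
# tokens attend within their own sequence, and a hub token additionally attends
# over the hub tokens of graph-connected sequences. Illustrative only.
import torch
import torch.nn as nn


class ExtraHopLayer(nn.Module):
    """One layer combining in-sequence self-attention with an extra "hop"
    attention over the hub tokens of connected sequences (assumed design)."""

    def __init__(self, hidden: int = 64, heads: int = 4):
        super().__init__()
        # Standard token-level self-attention within each sequence.
        self.in_seq_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        # Projections for the hop attention over hub (first-token) states.
        self.q = nn.Linear(hidden, hidden)
        self.k = nn.Linear(hidden, hidden)
        self.v = nn.Linear(hidden, hidden)
        self.norm = nn.LayerNorm(hidden)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (num_nodes, seq_len, hidden) -- one text sequence per graph node
        # adj: (num_nodes, num_nodes)       -- 1.0 where an edge connects nodes
        seq_out, _ = self.in_seq_attn(x, x, x)   # attend within each sequence
        hub = seq_out[:, 0, :]                   # hub token per node

        # Extra hop attention: each hub attends over the hubs of its neighbors.
        scores = self.q(hub) @ self.k(hub).t() / hub.size(-1) ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))
        hop = torch.softmax(scores, dim=-1) @ self.v(hub)

        # Mix the hopped (cross-document) signal back into the hub position.
        seq_out = seq_out.clone()
        seq_out[:, 0, :] = self.norm(hub + hop)
        return seq_out


if __name__ == "__main__":
    nodes, seq_len, hidden = 3, 8, 64
    x = torch.randn(nodes, seq_len, hidden)
    # Adjacency with self-loops for a small 3-node evidence graph.
    adj = torch.tensor([[1., 1., 0.], [1., 1., 1.], [0., 1., 1.]])
    layer = ExtraHopLayer(hidden)
    print(layer(x, adj).shape)  # torch.Size([3, 8, 64])
```

Stacking such layers lets information propagate across multiple documents (one extra hop per layer), which is how the paper's multi-evidence reasoning is framed; the exact combination of in-sequence and hop attention here is a simplification.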

Publication Downloads

Transformer-XH

March 3, 2020

Multi-Evidence Reasoning with Transformer eXtra Hop Attention