Improving Event Extraction via Multimodal Integration

  • Tongtao Zhang ,
  • Spencer Whitehead ,
  • Hanwang Zhang ,
  • Joseph Ellis ,
  • Lifu Huang ,
  • Wei Liu ,
  • Heng Ji ,
  • Shih-Fu Chang

Proceedings of the 25th ACM International Conference on Multimedia

Published by ACM

In this paper, we focus on improving Event Extraction (EE) by
incorporating visual knowledge with words and phrases from text
documents. We first discover visual patterns from large-scale text-image
pairs in a weakly-supervised manner and then propose a
multimodal event extraction algorithm in which the event extractor is
jointly trained on textual features and visual patterns. Extensive
experimental results on benchmark data sets demonstrate that the
proposed multimodal EE method achieves significantly better
performance on event extraction: an absolute 7.1% F-score gain on
event trigger labeling and an 8.5% F-score gain on event argument
labeling.
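To make the idea of joint training on textual features and visual patterns concrete, the sketch below shows one common way such fusion can be set up: concatenating a word's textual embedding with a visual pattern feature vector before trigger classification. This is a minimal illustration under assumed dimensions and a simple linear classifier; the function names, feature sizes, and random inputs are hypothetical and not taken from the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_features(text_emb, visual_emb):
    """Early fusion: concatenate textual and visual feature vectors."""
    return np.concatenate([text_emb, visual_emb])

def score_triggers(fused, weights, bias):
    """Linear scores over event types, normalized with a softmax."""
    logits = weights @ fused + bias
    exp = np.exp(logits - logits.max())  # subtract max for stability
    return exp / exp.sum()

# Assumed dimensions (hypothetical, for illustration only).
text_dim, vis_dim, n_types = 8, 4, 3

text_emb = rng.normal(size=text_dim)   # e.g. embedding of a candidate trigger word
vis_emb = rng.normal(size=vis_dim)     # feature from a discovered visual pattern
W = rng.normal(size=(n_types, text_dim + vis_dim))
b = np.zeros(n_types)

probs = score_triggers(fuse_features(text_emb, vis_emb), W, b)
print(probs.shape, round(float(probs.sum()), 6))
```

In a trained system, `W` and `b` would be learned jointly with the text-side parameters, so gradients from the event labels shape both modalities' representations.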