MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning

Published at NAACL 2021

The combination of multilingual pre-trained representations and cross-lingual transfer learning is one of the most effective methods for building functional NLP systems for low-resource languages. However, for extremely low-resource languages without large-scale monolingual corpora for pre-training or sufficient annotated data for fine-tuning, transfer learning remains an under-studied and challenging task. Moreover, recent work shows that multilingual representations are surprisingly disjoint across languages (Singh et al., 2019), bringing additional challenges for transfer onto extremely low-resource languages. In this paper, we propose MetaXL, a meta-learning based framework that learns to transform representations judiciously from the source language to a target one and brings their representation spaces closer for effective transfer. Extensive experiments on real-world low-resource languages, without access to large-scale monolingual corpora or large amounts of labeled data, on tasks like cross-lingual sentiment analysis and named entity recognition show the effectiveness of our approach. Code for MetaXL is publicly available at github.com/microsoft/MetaXL.
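At a high level, MetaXL trains a small representation transformation network by bilevel optimization: an inner step fine-tunes the task model on transformed source-language data, and an outer step updates the transformation network so that this inner update reduces the loss on a handful of target-language examples. The sketch below is a minimal, self-contained illustration of that update, assuming PyTorch; the toy linear encoder, tensor sizes, and names such as task_loss and metaxl_step are illustrative assumptions, not the released code, which operates on a pretrained multilingual transformer.

```python
# Minimal sketch of a MetaXL-style bilevel update (assumed setup, not the
# released implementation). A toy linear encoder stands in for the
# pretrained multilingual model.
import torch
import torch.nn as nn
import torch.nn.functional as F

HIDDEN, CLASSES = 32, 2  # illustrative sizes

encoder = nn.Linear(HIDDEN, HIDDEN)      # stand-in for a pretrained encoder layer
classifier = nn.Linear(HIDDEN, CLASSES)  # task head
transform = nn.Sequential(               # representation transformation network
    nn.Linear(HIDDEN, HIDDEN // 2), nn.ReLU(), nn.Linear(HIDDEN // 2, HIDDEN)
)
meta_opt = torch.optim.Adam(transform.parameters(), lr=1e-3)

def task_loss(x, y, enc_w, enc_b, cls_w, cls_b, use_transform):
    h = F.linear(x, enc_w, enc_b)        # encoder representations
    if use_transform:
        h = transform(h)                 # rewrite source-language representations
    return F.cross_entropy(F.linear(h, cls_w, cls_b), y)

def metaxl_step(src_x, src_y, tgt_x, tgt_y, lr_inner=0.1):
    params = [encoder.weight, encoder.bias, classifier.weight, classifier.bias]
    # Inner step: task loss on transformed source-language data. create_graph
    # keeps the simulated SGD step differentiable, so the target loss can
    # later backpropagate into the transformation network's parameters.
    inner = task_loss(src_x, src_y, *params, use_transform=True)
    grads = torch.autograd.grad(inner, params, create_graph=True)
    fast = [p - lr_inner * g for p, g in zip(params, grads)]
    # Outer (meta) step: evaluate the virtually updated model on
    # target-language data, passed through untransformed, and update only
    # the transformation network on that loss.
    meta_opt.zero_grad()
    task_loss(tgt_x, tgt_y, *fast, use_transform=False).backward()
    meta_opt.step()

# Toy usage: random batches standing in for source- and target-language data.
src_x, src_y = torch.randn(8, HIDDEN), torch.randint(0, CLASSES, (8,))
tgt_x, tgt_y = torch.randn(4, HIDDEN), torch.randint(0, CLASSES, (4,))
metaxl_step(src_x, src_y, tgt_x, tgt_y)
```

The key design choice this sketch tries to capture is that only the transformation network receives meta-gradients: the base model is updated on source data in the inner step, while the transform is tuned so that this update transfers well to the target language.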

Publication Downloads

Meta Representation Transformation for Low-resource Cross-Lingual Learning [Code]

May 24, 2021

Source code release for the paper published at NAACL 2021: "MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning" (abstract above).