LITMUS: Linguistically Inspired Training and testing of MUltilingual Systems
Transformer-based Language Models have revolutionized the field of NLP, driving large improvements on standard benchmarks and powering many NLP applications today. Multilingual versions of these models can also potentially serve low-resource languages for which no labeled data is available, by relying on the zero-shot cross-lingual transfer paradigm: a model fine-tuned on a high-resource language can be applied directly to other languages. However, deploying such models comes with challenges, including the question of how to evaluate them across a wide variety of languages that may not appear in standard evaluation benchmarks.
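To make the zero-shot paradigm concrete, here is a minimal sketch using the Hugging Face `transformers` library. It assumes a multilingual model fine-tuned only on English NLI data; the specific model name is an illustrative choice from the public model hub, not an artifact of Project LITMUS.

```python
# A minimal sketch of zero-shot cross-lingual transfer. The model below
# (an example choice, not part of LITMUS) is xlm-roberta-large fine-tuned
# on English NLI data; multilingual pretraining lets it classify text in
# languages it never saw labeled examples for.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="joeddav/xlm-roberta-large-xnli",
)

# A Swahili sentence ("The new phone's battery lasts a very long time."),
# classified against English candidate labels with no Swahili training data.
result = classifier(
    "Betri ya simu mpya inadumu kwa muda mrefu sana.",
    candidate_labels=["technology", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])
```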
The goal of Project LITMUS is to discover strategies for evaluating massive multilingual models, and to suggest data collection and training strategies that improve their performance.
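As a toy illustration of one such evaluation strategy (a sketch of the general idea, not the LITMUS method itself), one can fit a regressor that predicts a model's accuracy on a language with no test set from features of languages where test sets do exist. The feature names and all numbers below are hypothetical, chosen only for illustration.

```python
# Toy sketch: predict per-language task accuracy from simple language
# features, assuming hypothetical features and made-up numbers.
from sklearn.linear_model import LinearRegression

# Hypothetical features per language:
# [log pretraining corpus size (GB), syntactic similarity to English in [0, 1]]
X_train = [[5.2, 0.9], [4.8, 0.7], [3.1, 0.5], [2.0, 0.4]]
y_train = [0.86, 0.81, 0.70, 0.62]  # observed accuracies (illustrative)

predictor = LinearRegression().fit(X_train, y_train)

# Estimate accuracy for a low-resource language with no test set.
print(predictor.predict([[1.2, 0.3]]))
```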
You may also be interested in the other projects our group works on: Project ELLORA (Enabling Low Resource Languages) and Project Mélange (Understanding Mixed Language).