Few-shot Learning

Deep neural networks, including pre-trained language models like BERT, Turing-NLG, and GPT-3, require thousands of labeled training examples to obtain state-of-the-art performance on downstream tasks and applications. Such large numbers of labeled examples are difficult and expensive to acquire in practice: labeling becomes harder as we scale these models to hundreds of languages and thousands of tasks and domains, and compliance requirements limit labeling when dealing with sensitive user data. In this project, we develop techniques for few-shot and zero-shot learning to obtain state-of-the-art performance with Multilingual pre-TRainEd ModEls (XTREME) while using very few to no labels for the target task.
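
To make the few-shot setting concrete, below is a minimal sketch of fine-tuning a multilingual pre-trained model on a handful of labeled examples. The backbone (xlm-roberta-base), the toy sentiment data, and the Hugging Face-style training loop are illustrative assumptions, not the project's actual method or models.

```python
# Minimal few-shot fine-tuning sketch (illustrative, not the project's method).
# Assumes the `torch` and `transformers` packages are installed.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "xlm-roberta-base"  # assumed multilingual backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Few-shot setting: only a handful of labeled examples for the target task,
# here a toy cross-lingual sentiment set.
texts = [
    "The movie was wonderful.",      # positive (English)
    "Der Film war großartig.",       # positive (German)
    "A complete waste of time.",     # negative (English)
    "Une perte de temps totale.",    # negative (French)
]
labels = torch.tensor([1, 1, 0, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(10):  # a few passes over the tiny labeled set
    optimizer.zero_grad()
    out = model(**batch, labels=labels)  # forward pass computes the loss
    out.loss.backward()
    optimizer.step()
```

Because the multilingual encoder is pre-trained, the labeled examples only need to adapt the task head and lightly steer the encoder, which is what makes learning from so few labels feasible; the zero-shot variant drops target-task labels entirely and relies on transfer from related tasks or languages.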