Data Collection and Training of Multilingual LLMs
- Fan, Angela, et al. “Beyond English-Centric Multilingual Machine Translation.” arXiv preprint arXiv:2010.11125 (2020).
- Devlin, Jacob, et al. “BERT: Pre-training of deep bidirectional transformers for language understanding.” arXiv preprint arXiv:1810.04805 (2018).
- Conneau, Alexis, et al. “Unsupervised cross-lingual representation learning at scale.” arXiv preprint arXiv:1911.02116 (2019).
- Xue, Linting, et al. “mT5: A massively multilingual pre-trained text-to-text transformer.” arXiv preprint arXiv:2010.11934 (2020).
- Chi, Zewen, et al. “XLM-E: Cross-lingual language model pre-training via ELECTRA.” arXiv preprint arXiv:2106.16138 (2021).
- Liu, Yinhan, et al. “Multilingual denoising pre-training for neural machine translation.” Transactions of the Association for Computational Linguistics 8 (2020): 726-742.
- Patra, Barun, et al. “Beyond English-centric bitexts for better multilingual language representation learning.” arXiv preprint arXiv:2210.14867 (2022).
- Chung, Hyung Won, et al. “UniMax: Fairer and more effective language sampling for large-scale multilingual pretraining.” arXiv preprint arXiv:2304.09151 (2023).
- He, Pengcheng, Jianfeng Gao, and Weizhu Chen. “DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sharing.” arXiv preprint arXiv:2111.09543 (2021).
- Chen, Ting, et al. “A simple framework for contrastive learning of visual representations.” International conference on machine learning. PMLR, 2020.
- He, Kaiming, et al. “Momentum contrast for unsupervised visual representation learning.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.
- Chi, Zewen, et al. “InfoXLM: An information-theoretic framework for cross-lingual language model pre-training.” arXiv preprint arXiv:2007.07834 (2020).
- Xue, Linting, et al. “ByT5: Towards a token-free future with pre-trained byte-to-byte models.” Transactions of the Association for Computational Linguistics 10 (2022): 291-306.
- Soltan, Saleh, et al. “AlexaTM 20B: Few-shot learning using a large-scale multilingual seq2seq model.” arXiv preprint arXiv:2208.01448 (2022).
- Lin, Xi Victoria, et al. “Few-shot learning with multilingual language models.” arXiv preprint arXiv:2112.10668 (2021).
- Wang, Thomas, et al. “What language model architecture and pretraining objective works best for zero-shot generalization?.” International Conference on Machine Learning. PMLR, 2022.
- Tay, Yi, et al. “Transcending scaling laws with 0.1% extra compute.” arXiv preprint arXiv:2210.11399 (2022).
- Chung, Hyung Won, et al. “Scaling instruction-finetuned language models.” arXiv preprint arXiv:2210.11416 (2022).
- Muennighoff, Niklas, et al. “Crosslingual generalization through multitask finetuning.” arXiv preprint arXiv:2211.01786 (2022).
- OpenAI. 2023. GPT-4 Technical Report.
- Google. 2023. PaLM 2 Technical Report.
- Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, et al. 2022. PaLM: Scaling Language Modeling with Pathways.
- Weijia Shi et al. 2023. REPLUG: Retrieval-Augmented Black-Box Language Models.
- BigScience Workshop. 2022. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
- Hugo Touvron, Thibaut Lavril, Gautier Izacard, et al. 2023. LLaMA: Open and Efficient Foundation Language Models.
- Jack W. Rae et al. 2022. Scaling Language Models: Methods, Analysis & Insights from Training Gopher.
- Gautier Izacard, Patrick Lewis, et al. 2023. Atlas: Few-shot Learning with Retrieval Augmented Language Models.
- Aohan Zeng, Xiao Liu, et al. 2022. GLM-130B: An Open Bilingual Pre-trained Model.
- Zhihong Chen et al. 2023. Phoenix: Democratizing ChatGPT across Languages.
Prompting Strategies for Multilingual LLMs
- Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, et al. Language models are multilingual chain-of-thought reasoners. arXiv preprint arXiv:2210.03057, 2022.
- Lifu Tu, Caiming Xiong, and Yingbo Zhou. Prompt-tuning can be much better than fine-tuning on cross-lingual understanding with multilingual language models, 2022.
- Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed H. Chi, Quoc V. Le, Denny Zhou, et al. Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems, 2022.
- Mengjie Zhao and Hinrich Schütze. Discrete and soft prompting for multilingual models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 8547–8555, 2021.
- Lianzhe Huang, Shuming Ma, Dongdong Zhang, Furu Wei, and Houfeng Wang. Zero-shot cross-lingual transfer of prompt-based tuning with a unified multilingual prompt, 2022.
- Haoyang Huang, Tianyi Tang, Dongdong Zhang, Wayne Xin Zhao, Ting Song, Yan Xia, and Furu Wei. Not all languages are created equal in LLMs: Improving multilingual capability by cross-lingual-thought prompting, 2023.
- Yuxuan Chen, David Harbecke, and Leonhard Hennig. Multilingual relation classification via efficient and effective prompting. arXiv preprint arXiv:2210.13838, 2022.
- Akshay Nambi, Vaibhav Balloli, Mercy Ranjit, Tanuja Ganu, Kabir Ahuja, Sunayana Sitaram, and Kalika Bali. Breaking Language Barriers with a LEAP: Learning Strategies for Polyglot LLMs, 2023.
Evaluation, Interpretability and Analysis of Multilingual LLMs
Datasets
- Alexis Conneau, Ruty Rinott, Guillaume Lample, Adina Williams, Samuel Bowman, Holger Schwenk, and Veselin Stoyanov. 2018. XNLI: Evaluating Cross-lingual Sentence Representations. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2475–2485, Brussels, Belgium. Association for Computational Linguistics.
- Yinfei Yang, Yuan Zhang, Chris Tar, and Jason Baldridge. 2019. PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3687–3692, Hong Kong, China. Association for Computational Linguistics.
- Joakim Nivre, Mitchell Abrams, Željko Agić, Lars Ahrenberg, Lene Antonsen, Maria Jesus Aranzabe, Gashaw Arutie, Masayuki Asahara, Luma Ateyah, Mohammed Attia, et al. 2018. Universal Dependencies 2.2.
- Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, and Heng Ji. 2017. Cross-lingual Name Tagging and Linking for 282 Languages. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1946–1958, Vancouver, Canada. Association for Computational Linguistics.
- Mikel Artetxe, Sebastian Ruder, and Dani Yogatama. 2020. On the Cross-lingual Transferability of Monolingual Representations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4623–4637, Online. Association for Computational Linguistics.
- Patrick Lewis, Barlas Oguz, Ruty Rinott, Sebastian Riedel, and Holger Schwenk. 2020. MLQA: Evaluating Cross-lingual Extractive Question Answering. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7315–7330, Online. Association for Computational Linguistics.
- Jonathan H. Clark, Eunsol Choi, Michael Collins, Dan Garrette, Tom Kwiatkowski, Vitaly Nikolaev, and Jennimaria Palomaki. 2020. TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages. Transactions of the Association for Computational Linguistics, 8:454–470.
- Tahmid Hasan, Abhik Bhattacharjee, Md. Saiful Islam, Kazi Mubasshir, Yuan-Fang Li, Yong-Bin Kang, M. Sohel Rahman, and Rifat Shahriyar. 2021. XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 4693–4703, Online. Association for Computational Linguistics.
- Sumanth Doddapaneni, Rahul Aralikatte, Gowtham Ramesh, Shreya Goyal, Mitesh M. Khapra, Anoop Kunchukuttan, and Pratyush Kumar. 2023. Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages.
- Aman Kumar, Himani Shrotriya, Prachi Sahu, Amogh Mishra, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Mitesh M. Khapra, and Pratyush Kumar. 2022. IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 5363–5394, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- David Adelani, Graham Neubig, Sebastian Ruder, Shruti Rijhwani, Michael Beukman, Chester Palen-Michel, Constantine Lignos, Jesujoba Alabi, Shamsuddeen Muhammad, Peter Nabende, Cheikh M. Bamba Dione, Andiswa Bukula, Rooweither Mabuya, Bonaventure F. P. Dossou, Blessing Sibanda, Happy Buzaaba, Jonathan Mukiibi, Godson Kalipe, Derguene Mbaye, et al. 2022. MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 4488–4508, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cheikh M. Bamba Dione, David Adelani, Peter Nabende, Jesujoba Alabi, Thapelo Sindane, Happy Buzaaba, Shamsuddeen Hassan Muhammad, Chris Chinenye Emezue, et al. 2023. MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages.
- Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, Vishrav Chaudhary, Luis Chiruzzo, Angela Fan, John Ortega, Ricardo Ramos, Annette Rios, Ivan Vladimir Meza Ruiz, Gustavo Giménez-Lugo, Elisabeth Mager, Graham Neubig, Alexis Palmer, Rolando Coto-Solano, Thang Vu, and Katharina Kann. 2022. AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6279–6299, Dublin, Ireland. Association for Computational Linguistics.
- Arnav Mhaske, Harshit Kedia, Sumanth Doddapaneni, Mitesh M. Khapra, Pratyush Kumar, Rudra Murthy V, and Anoop Kunchukuttan. 2023. Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages.
Benchmarking Exercises
- Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson. XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation. Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4411-4421, 2020.
- Sebastian Ruder, Noah Constant, Jan Botha, Aditya Siddhant, Orhan Firat, Jinlan Fu, Pengfei Liu, Junjie Hu, Dan Garrette, Graham Neubig, and Melvin Johnson. 2021. XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 10215–10245, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Kabir Ahuja, Harshita Diddee, Rishav Hada, Millicent Ochieng, Krithika Ramesh, Prachi Jain, Akshay Nambi, Tanuja Ganu, Sameer Segal, Maxamed Axmed, Kalika Bali, Sunayana Sitaram. 2023. MEGA: Multilingual Evaluation of Generative AI.
- Akari Asai, Sneha Kudugunta, Xinyan Velocity Yu, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, Sebastian Ruder, Hannaneh Hajishirzi. 2023. BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer.
- Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, and Yulia Tsvetkov. 2023. Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models.
Evaluation Beyond Task Performance
- Kabir Ahuja, Sunayana Sitaram, Sandipan Dandapat, and Monojit Choudhury. 2022. On the Calibration of Massively Multilingual Language Models. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 4310–4323, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Zhengping Jiang, Anqi Liu, and Benjamin Van Durme. 2022. Calibrating Zero-shot Cross-lingual (Un-)structured Predictions. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2648–2674, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Masahiro Kaneko, Aizhan Imankulova, Danushka Bollegala, and Naoaki Okazaki. 2022. Gender Bias in Masked Language Models for Multiple Languages. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2740–2750, Seattle, United States. Association for Computational Linguistics.
- Aniket Vashishtha, Kabir Ahuja, and Sunayana Sitaram. 2023. On Evaluating and Mitigating Gender Biases in Multilingual Settings.
- Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, and Sameer Singh. 2020. Beyond Accuracy: Behavioral Testing of NLP Models with CheckList. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4902–4912, Online. Association for Computational Linguistics.
- Karthikeyan K, Shaily Bhatt, Pankaj Singh, Somak Aditya, Sandipan Dandapat, Sunayana Sitaram, and Monojit Choudhury. 2022. Multilingual CheckList: Generation and Evaluation. In Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, pages 282–295, Online only. Association for Computational Linguistics.
Challenges in Multilingual Evaluation
- Kabir Ahuja, Sandipan Dandapat, Sunayana Sitaram, and Monojit Choudhury. 2022. Beyond Static models and test sets: Benchmarking the potential of pre-trained models across tasks and languages. In Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP, pages 64–74, Dublin, Ireland. Association for Computational Linguistics.
- Kabir Ahuja, Shanu Kumar, Sandipan Dandapat, and Monojit Choudhury. 2022. Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5454–5467, Dublin, Ireland. Association for Computational Linguistics.
- Mengzhou Xia, Antonios Anastasopoulos, Ruochen Xu, Yiming Yang, and Graham Neubig. 2020. Predicting Performance for Natural Language Processing Tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8625–8646, Online. Association for Computational Linguistics.
- Kabir Ahuja, Antonios Anastasopoulos, Barun Patra, Graham Neubig, Monojit Choudhury, Sandipan Dandapat, Sunayana Sitaram, and Vishrav Chaudhary. 2022. Proceedings of the First Workshop on Scaling Up Multilingual Evaluation. Association for Computational Linguistics, Online.
- Fangyu Liu, Emanuele Bugliarello, Edoardo Maria Ponti, Siva Reddy, Nigel Collier, and Desmond Elliott. 2021. Visually Grounded Reasoning across Languages and Cultures. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 10467–10485, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-Wei Chang, and Kristina Toutanova. 2021. Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer.
Analysis and Interpretability
- Anne Lauscher, Vinit Ravishankar, Ivan Vulić, and Goran Glavaš. 2020. From Zero to Hero: On the Limitations of Zero-Shot Language Transfer with Multilingual Transformers. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4483–4499, Online. Association for Computational Linguistics.
- Karthikeyan K, Zihan Wang, Stephen Mayhew, and Dan Roth. 2020. Cross-Lingual Ability of Multilingual BERT: An Empirical Study. ICLR 2020.
- Ethan A. Chi, John Hewitt, and Christopher D. Manning. 2020. Finding Universal Grammatical Relations in Multilingual BERT. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5564–5577, Online. Association for Computational Linguistics.
- Karolina Stanczak, Edoardo Ponti, Lucas Torroba Hennigen, Ryan Cotterell, and Isabelle Augenstein. 2022. Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1589–1598, Seattle, United States. Association for Computational Linguistics.
- Aaron Mueller, Yu Xia, and Tal Linzen. 2022. Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models. In Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL), pages 95–109, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
What multilingual evaluation tells us about the current state of NLP
- Damian Blasi, Antonios Anastasopoulos, and Graham Neubig. 2022. Systematic Inequalities in Language Technology Performance across the World’s Languages. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5486–5505, Dublin, Ireland. Association for Computational Linguistics.
Responsible AI for Multilingual LLMs
Socio-Cultural Aspects
- Małgorzata Suszczyńska. 1999. Apologizing in English, Polish and Hungarian: Different languages, different strategies. Journal of Pragmatics, 31(8):1053–1065.
- Jimin Sun, Hwijeen Ahn, Chan Young Park, Yulia Tsvetkov, and David R. Mortensen. 2021. Cross-cultural similarity features for cross-lingual transfer learning of pragmatically motivated tasks. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 2403–2414, Online. Association for Computational Linguistics.
- Dong Nguyen, A. Seza Doğruöz, Carolyn P. Rosé, and Franciska de Jong. 2016. Computational Sociolinguistics: A Survey. Computational Linguistics, 42(3):537–593.
- Daniel Hershcovich, et al. 2022. Challenges and strategies in cross-cultural NLP. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6997–7013, Dublin, Ireland. Association for Computational Linguistics.
- Névéol et al. French CrowS-Pairs: Extending a challenge dataset for measuring social bias in masked language models to a language other than English. ACL 2022
- Blodgett et al., Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets. ACL-IJCNLP 2021
Gender Bias and Grammatical Gender
- Pei Zhou, Weijia Shi, Jieyu Zhao, Kuan-Hao Huang, Muhao Chen, Ryan Cotterell, and Kai-Wei Chang. 2019. Examining gender bias in languages with grammatical gender. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5276–5284.
- Jieyu Zhao, Subhabrata Mukherjee, Saghar Hosseini, Kai-Wei Chang, and Ahmed Hassan Awadallah. 2020. Gender bias in multilingual embeddings and cross-lingual transfer. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2896–2907, Online. Association for Computational Linguistics.
- Sheng Liang, Philipp Dufter, and Hinrich Schütze. 2020. Monolingual and multilingual reduction of gender bias in contextualized representations. In Proceedings of the 28th International Conference on Computational Linguistics, pages 5082–5093, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Masahiro Kaneko, Aizhan Imankulova, Danushka Bollegala, and Naoaki Okazaki. 2022. Gender bias in masked language models for multiple languages. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2740–2750, Seattle, United States. Association for Computational Linguistics.
- [MITIGATION] Xiaolei Huang. 2022. Easy adaptation to mitigate gender bias in multilingual text classification. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 717–723, Seattle, United States. Association for Computational Linguistics.
- [Model Compression] Jaimeen Ahn, Hwaran Lee, Jinhwa Kim, and Alice Oh. 2022. Why knowledge distillation amplifies gender bias and how to mitigate from the perspective of DistilBERT. In Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 266–272, Seattle, Washington. Association for Computational Linguistics.
- Zmigrod et al., Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology, ACL 2019
Bias and Fairness
- Jialu Wang, Yang Liu, and Xin Wang. 2022. Assessing multilingual fairness in pre-trained multimodal representations. In Findings of the Association for Computational Linguistics: ACL 2022, pages 2681–2695, Dublin, Ireland. Association for Computational Linguistics.
- Zeerak Talat, et al. 2022. You reap what you sow: On the challenges of bias evaluation under multilingual settings. In Proceedings of BigScience Episode #5 – Workshop on Challenges & Perspectives in Creating Large Language Models, pages 26–41, virtual+Dublin. Association for Computational Linguistics.
- Aristides Milios and Parishad BehnamGhader. 2022. An analysis of social biases present in BERT variants across multiple languages. ArXiv, abs/2211.14402.
- Cristina España-Bonet and Alberto Barrón-Cedeño. 2022. The (undesired) attenuation of human biases by multilinguality. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Online and Abu Dhabi, UAE. Association for Computational Linguistics.
- [Dataset] Ilias Chalkidis, et al., 2022. FairLex: A multilingual benchmark for evaluating fairness in legal text processing. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4389–4406, Dublin, Ireland. Association for Computational Linguistics.
Hate Speech, Toxicity and Sentiment
- Xiaolei Huang, Linzi Xing, Franck Dernoncourt, and Michael J. Paul. 2020. Multilingual Twitter corpus and baselines for evaluating demographic bias in hate speech recognition. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1440–1448, Marseille, France. European Language Resources Association.
- António Câmara, Nina Taneja, Tamjeed Azad, Emily Allaway, and Richard Zemel. 2022. Mapping the multilingual margins: Intersectional biases of sentiment analysis systems in English, Spanish, and Arabic. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, pages 90–106, Dublin, Ireland. Association for Computational Linguistics.
Discursive Aspects (Decolonizing RAI)
- Nithya Sambasivan, Erin Arnesen, Ben Hutchinson, Tulsee Doshi, and Vinodkumar Prabhakaran. 2021. Re-imagining algorithmic fairness in India and beyond. CoRR, abs/2101.09995.
- Shaily Bhatt, Sunipa Dev, Partha Talukdar, Shachi Dave, and Vinodkumar Prabhakaran. 2022. Recontextualizing fairness in NLP: The case of India. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 727–740, Online only.
- Krithika Ramesh, Sunayana Sitaram, Monojit Choudhury. 2023. Fairness in Language Models Beyond English: Gaps and Challenges. Findings of EACL 2023
Distributive Aspects
- Pratik Joshi, Sebastin Santy, Amar Budhiraja, Kalika Bali, and Monojit Choudhury. 2020. The state and fate of linguistic diversity and inclusion in the NLP world. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6282–6293, Online. Association for Computational Linguistics.
- Monojit Choudhury and Amit Deshpande. 2021. How linguistically fair are multilingual pre-trained language models? Proceedings of the AAAI Conference on Artificial Intelligence, 35(14):12710–12718.
General
- Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21, page 610–623, New York, NY, USA. Association for Computing Machinery.
- Su Lin Blodgett, Solon Barocas, Hal Daumé III, and Hanna Wallach. 2020. Language (technology) is power: A critical survey of “bias” in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5454–5476, Online. Association for Computational Linguistics.
Working with Multilingual Language Communities
- Abraham, Basil, et al. “Crowdsourcing Speech Data for Low-Resource Languages from Low-Income Workers.” Proceedings of the 12th Language Resources and Evaluation Conference. 2020.
- Almaliki, Malik, et al. “ABMM: Arabic BERT-Mini Model for Hate-Speech Detection on Social Media.” Electronics 12.4 (2023): 1048.
- Bird, Steven. “Decolonising Speech and Language Technology.” Proceedings of the 28th International Conference on Computational Linguistics. 2020.
- Bird, Steven. “Local Languages, Third Spaces, and Other High-Resource Scenarios.” Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.
- Chopra, Manu, et al. “Exploring Crowdsourced Work in Low-Resource Settings.” Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 2019.
- Currin, Christopher Brian, et al. “A Framework for Grassroots Research Collaboration in Machine Learning and Global Health.” 2023 ICLR First Workshop on “Machine Learning & Global Health”. 2023.
- Diddee, Harshita, Bali, Kalika, Choudhury, Monojit, Ganu, Tanuja, and Dandapat, Sandipan. 2022. Too Brittle to Touch: Comparing the Stability of Quantization and Distillation towards Developing Low-Resource MT Models. In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 870–885, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
- Diddee, Harshita, Bali, Kalika, Choudhury, Monojit, and Mukhija, Namrata. 2022. The Six Conundrums of Building and Deploying Language Technologies for Social Good. In ACM SIGCAS/SIGCHI Conference on Computing and Sustainable Societies (COMPASS) (COMPASS ’22). Association for Computing Machinery, New York, NY, USA, 12–19.
- Goswami, Dipam, et al. “Analysis of Word-level Embeddings for Indic Languages on AI4Bharat-IndicNLP Corpora.” 2021 IEEE 8th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON). IEEE, 2021.
- Gitau, Catherine, et al. “Masakhane Web: A Machine Translation Platform for African Languages.” (2023)
- Harrigan, Atticus, et al. “Proceedings of the Sixth Workshop on the Use of Computational Methods in the Study of Endangered Languages.” Proceedings of the Sixth Workshop on the Use of Computational Methods in the Study of Endangered Languages. 2023.
- Joshi, Pratik, et al. “Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities.” arXiv preprint arXiv:1912.03457 (2019).
- Kann, Katharina, et al. “AmericasNLI: Machine Translation and Natural Language Inference Systems for Indigenous Languages of the Americas.” Frontiers in Artificial Intelligence 5 (2022): 266.
- Koto, Fajri, et al. “IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP.” arXiv preprint arXiv:2011.00677 (2020).
- Kumar, Ritesh, et al. “Annotated Speech Corpus for Low Resource Indian Languages: Awadhi, Bhojpuri, Braj and Magahi.” arXiv preprint arXiv:2206.12931 (2022).
- Kunchukuttan, Anoop, et al. “Ai4bharat-indicnlp Corpus: Monolingual Corpora and Word Embeddings for Indic Languages.” arXiv preprint arXiv:2005.00085 (2020).
- Mehta, Devansh, et al. “Learnings from Technological Interventions in a Low Resource Language: Enhancing Information Access in Gondi.” arXiv preprint arXiv:2211.16172 (2022).
- Moeller, Sarah, et al. “Proceedings of the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages.” Proceedings of the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages. 2022.
- Moradshahi, Mehrad, et al. “X-RiSAWOZ: High-Quality End-to-End Multilingual Dialogue Datasets and Few-shot Agents.” arXiv preprint arXiv:2306.17674 (2023).
- Muhammad, Shamsuddeen Hassan, et al. “Naijasenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis.” arXiv preprint arXiv:2201.08277 (2022).
- Doğruöz, A. Seza, and Sunayana Sitaram. “Language Technologies for Low Resource Languages: Sociolinguistic and Multilingual Insights.” LREC 2022.