HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language

Shantipriya Parida; Idris Abdulmumin; Shamsuddeen Hassan Muhammad; Aneesh Bose; Guneet Kohli; Ibrahim Said Ahmad; Ketan Kotwal; Sayan Deb Sarkar; Ondřej Bojar; Habeebah Kakudi

ACL 2023 | July 2023

This paper presents HaVQA, the first multimodal dataset for visual question answering (VQA) in the Hausa language. The dataset was created by manually translating 6,022 English question-answer pairs associated with 1,555 unique images from the Visual Genome dataset. As a result, it provides 12,044 gold-standard English-Hausa parallel sentences, translated in a way that ensures they remain semantically consistent with the corresponding visual information. We conducted several baseline experiments on the dataset, covering visual question answering, visual question elicitation, and both text-only and multimodal machine translation.
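The counts in the abstract are consistent if each of the 6,022 question-answer pairs contributes two parallel sentences, one for the question and one for the answer. The sketch below illustrates that reading only. The `QAPair` record, its field names, and the `parallel_sentences` helper are hypothetical and do not reflect the dataset's actual file format; the Hausa strings are placeholders, not real translations.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class QAPair:
    """Hypothetical record layout; not the dataset's real schema."""
    image_id: int
    question_en: str
    answer_en: str
    question_ha: str
    answer_ha: str


def parallel_sentences(pairs: List[QAPair]) -> List[Tuple[str, str]]:
    """Flatten QA pairs into (English, Hausa) sentence pairs.

    Each QA pair yields two parallel sentences: the question and the answer.
    """
    out: List[Tuple[str, str]] = []
    for p in pairs:
        out.append((p.question_en, p.question_ha))
        out.append((p.answer_en, p.answer_ha))
    return out


# Figures quoted in the abstract.
NUM_QA_PAIRS = 6022
NUM_PARALLEL_SENTENCES = 12044

# Placeholder example; the Hausa fields are stand-ins, not translations.
sample = QAPair(
    image_id=1,
    question_en="What color is the bus?",
    answer_en="Red",
    question_ha="<Hausa question>",
    answer_ha="<Hausa answer>",
)

flattened = parallel_sentences([sample])
print(len(flattened))  # two parallel sentences per QA pair
print(2 * NUM_QA_PAIRS == NUM_PARALLEL_SENTENCES)
```

Under this assumption, the same flattened list could serve as training data for the text-only translation baseline, while the `image_id` field would link each sentence back to its image for the multimodal setting.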