Title: BnMMLU: Measuring Massive Multitask Language Understanding in Bengali

May 25, 2025

Saman Sarker Joy

Abstract: The Massive Multitask Language Understanding (MMLU) benchmark has been widely used to evaluate language models across various domains. However, existing MMLU datasets primarily focus on high-resource languages such as English, which leaves low-resource languages like Bengali underrepresented. In this paper, we introduce BnMMLU, a benchmark for evaluating the multitask language understanding capabilities of language models in Bengali. The dataset spans 23 domains, including science, humanities, mathematics, and general knowledge, and is structured in a multiple-choice format to assess the factual knowledge, application-based problem-solving, and reasoning abilities of language models. It consists of 138,949 question-option pairs. We benchmark several proprietary and open-source large language models (LLMs) on the BnMMLU test set. Additionally, we annotate the test set with three cognitive categories (factual knowledge, procedural application, and reasoning) to gain deeper insight into model strengths and weaknesses across cognitive tasks. The results reveal significant performance gaps, highlighting the need for improved pre-training and fine-tuning strategies tailored to Bengali data. We release the dataset and benchmark results to facilitate further research in this area.
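As an illustration of how multiple-choice results could be broken down by the three cognitive categories named in the abstract, here is a minimal sketch in Python. The record layout (the keys `category`, `answer`, and `prediction`) and the sample rows are assumptions made for this example; they are not taken from the BnMMLU dataset or its released code, and the paper's own evaluation procedure may differ.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return accuracy per cognitive category.

    `records` is an iterable of dicts with the hypothetical keys
    'category', 'answer' (gold option) and 'prediction' (model's option).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        total[record["category"]] += 1
        if record["prediction"] == record["answer"]:
            correct[record["category"]] += 1
    # Every category in `total` has at least one record, so no division by zero.
    return {category: correct[category] / total[category] for category in total}


# Made-up rows for illustration only; not real benchmark data.
sample = [
    {"category": "factual knowledge", "answer": "A", "prediction": "A"},
    {"category": "factual knowledge", "answer": "C", "prediction": "B"},
    {"category": "procedural application", "answer": "D", "prediction": "D"},
    {"category": "reasoning", "answer": "B", "prediction": "A"},
]

scores = accuracy_by_category(sample)
for category, score in scores.items():
    print(f"{category}: {score:.2f}")
```

Reporting accuracy per category rather than a single overall number is what makes gaps between, say, factual recall and reasoning visible.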

* 18 pages, 9 figures, 5 tables; Code & dataset available at https://github.com/samanjoy2/bnmmlu
