Sheikh Shafayat

Can Large Reasoning Models Self-Train?

May 27, 2025
Via arXiv

BLUCK: A Benchmark Dataset for Bengali Linguistic Understanding and Cultural Knowledge

May 27, 2025
Via arXiv

A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls

Dec 02, 2024
Via arXiv

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Jun 09, 2024
Via arXiv

BEnQA: A Question Answering and Reasoning Benchmark for Bengali and English

Mar 16, 2024
Via arXiv

Multi-FAct: Assessing Multilingual LLMs' Multi-Regional Knowledge using FActScore

Mar 01, 2024
Via arXiv

LangBridge: Multilingual Reasoning Without Multilingual Supervision

Jan 19, 2024
Via arXiv