Youssef Benchekroun

LongTail-Swap: benchmarking language models' abilities on rare words

Oct 05, 2025
Via arXiv

WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models

Nov 27, 2023
Via arXiv