Victor Ojewale

What Benchmarks Don't Measure: The Case for Evaluating Abstention Competence in Autonomous Agents

Jun 01, 2026
Via arXiv

Multi-lingual Functional Evaluation for Large Language Models

Jun 25, 2025
Via arXiv