Chenxin Liu

MADE: Beyond Scoring via a Multilingual Agentic Diagnosing Engine for Fine-Grained Evaluation Insights

Jun 05, 2026
Via arXiv

The GaoYao Benchmark: A Comprehensive Framework for Evaluating Multilingual and Multicultural Abilities of Large Language Models

Apr 22, 2026
Via arXiv