Xiaorong Cheng

Do Large Language Models Judge Error Severity Like Humans?

Jun 05, 2025
Via arXiv