Xiaorong Cheng

Do Large Language Models Judge Error Severity Like Humans?

Jun 05, 2025
Via arXiv