Haiyuan Liang

FRAbench and GenEval: Scaling Fine-Grained Aspect Evaluation across Tasks, Modalities

May 19, 2025
Via arXiv

Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks

Apr 26, 2025
Via arXiv