Jiahao Ying

Disentangling Language and Culture for Evaluating Multilingual Large Language Models

May 30, 2025
Via arXiv

FRAbench and GenEval: Scaling Fine-Grained Aspect Evaluation across Tasks, Modalities

May 19, 2025
Via arXiv

Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks

Apr 26, 2025
Via arXiv

Revisiting LLM Evaluation through Mechanism Interpretability: a New Metric and Model Utility Law

Apr 10, 2025
Via arXiv

SeaExam and SeaBench: Benchmarking LLMs with Local Multilingual Questions in Southeast Asia

Feb 10, 2025
Via arXiv

EvoWiki: Evaluating LLMs on Evolving Knowledge

Dec 18, 2024
Via arXiv

Diagnosing and Remedying Knowledge Deficiencies in LLMs via Label-free Curricular Meaningful Learning

Aug 21, 2024
Via arXiv

LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement

Jun 29, 2024
Via arXiv

QRMeM: Unleash the Length Limitation through Question then Reflection Memory Mechanism

Jun 19, 2024
Via arXiv

A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential

Jun 06, 2024
Via arXiv