Chenghao Xiao

Translation or Recitation? Calibrating Evaluation Scores for Machine Translation of Extremely Low-Resource Languages

Mar 26, 2026

MAEB: Massive Audio Embedding Benchmark

Feb 17, 2026

RIGOURATE: Quantifying Scientific Exaggeration with Evidence-Aligned Claim Evaluation

Jan 07, 2026

Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Sep 04, 2025

VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning

Jul 30, 2025

ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

Jun 11, 2025

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Jun 08, 2025

Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts

Apr 29, 2025

Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations

Apr 18, 2025

MIEB: Massive Image Embedding Benchmark

Apr 14, 2025