Yilun Zhao

SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing

Jun 05, 2025
Via arXiv

Table-R1: Inference-Time Scaling for Table Reasoning

May 29, 2025
Via arXiv

VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos

May 29, 2025
Via arXiv

Judging with Many Minds: Do More Perspectives Mean Less Prejudice?

May 26, 2025
Via arXiv

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

May 21, 2025
Via arXiv

Z1: Efficient Test-time Scaling with Code

Apr 01, 2025
Via arXiv

MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search

Mar 26, 2025
Via arXiv

Survey on Evaluation of LLM-based Agents

Mar 20, 2025
Via arXiv

MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning

Mar 10, 2025
Via arXiv

IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval

Mar 06, 2025
Via arXiv