Yi R. Fung

Scalable Token-Level Hallucination Detection in Large Language Models

May 12, 2026
Via arXiv

Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents

May 11, 2026
Via arXiv

GeoBrowse: A Geolocation Benchmark for Agentic Tool Use with Expert-Annotated Reasoning Traces

Apr 05, 2026
Via arXiv

Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?

Mar 04, 2026
Via arXiv

AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

Feb 26, 2026
Via arXiv

Supervised Fine-Tuning Needs to Unlock the Potential of Token Priority

Feb 01, 2026
Via arXiv

Empowering Reliable Visual-Centric Instruction Following in MLLMs

Jan 06, 2026
Via arXiv

EcomBench: Towards Holistic Evaluation of Foundation Agents in E-commerce

Dec 11, 2025
Via arXiv

Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey

Nov 12, 2025
Via arXiv

Lean4Physics: Comprehensive Reasoning Framework for College-level Physics in Lean4

Oct 30, 2025
Via arXiv