Junchi Yao

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Sep 10, 2025
Via arXiv

Mitigating Behavioral Hallucination in Multimodal Large Language Models for Sequential Images

Jun 08, 2025
Via arXiv

Understanding the Repeat Curse in Large Language Models from a Feature Perspective

Apr 19, 2025
Via arXiv

Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements

Feb 18, 2025
Via arXiv