Picture for Hardy Chen

Hardy Chen

Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows

Add code
Apr 22, 2026
Viaarxiv icon

Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw

Add code
Apr 06, 2026
Viaarxiv icon

Omni-MMSI: Toward Identity-attributed Social Interaction Understanding

Add code
Mar 31, 2026
Viaarxiv icon

Kestrel: Grounding Self-Refinement for LVLM Hallucination Mitigation

Add code
Mar 17, 2026
Viaarxiv icon

Reasoning While Asking: Transforming Reasoning Large Language Models from Passive Solvers to Proactive Inquirers

Add code
Jan 29, 2026
Viaarxiv icon

SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards

Add code
Nov 10, 2025
Viaarxiv icon

SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models

Add code
Apr 10, 2025
Figure 1 for SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Figure 2 for SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Figure 3 for SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Figure 4 for SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Viaarxiv icon

ViLBench: A Suite for Vision-Language Process Reward Modeling

Add code
Mar 26, 2025
Viaarxiv icon