Shiting Huang

Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models

Feb 10, 2026
Via arXiv

ADORA: Training Reasoning Models with Dynamic Advantage Estimation on Reinforcement Learning

Feb 10, 2026
Via arXiv

Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Feb 02, 2026
Via arXiv