Haobo Wang

Momentum for Reasoning: Dense Intrinsic Signals in Policy Optimization

Jun 07, 2026
Via arXiv

SkillComposer: Learning to Evolve Agent Skills for Specification and Generalization

Jun 04, 2026
Via arXiv

OPRD: On-Policy Representation Distillation

Jun 04, 2026
Via arXiv

Smart Picks in the Dark: Towards Efficient RLVR for Reasoning via Tracing Metacognitive Pivots

Jun 03, 2026
Via arXiv

GeoMin: Data-Efficient Semi-Supervised RLVR via Geometric Distribution Modeling

Jun 03, 2026
Via arXiv

FLaG: Fine-Grained Latent Grouping for Hallucination Detection

May 29, 2026
Via arXiv

Adversarial Attacks Against MLLMs via Progressive Resolution Processing and Adaptive Feature Alignment

May 11, 2026
Via arXiv

Can LLMs Learn to Reason Robustly under Noisy Supervision?

Apr 05, 2026
Via arXiv

DeltaMem: Towards Agentic Memory Management via Reinforcement Learning

Apr 02, 2026
Via arXiv

Owl-AuraID 1.0: An Intelligent System for Autonomous Scientific Instrumentation and Scientific Data Analysis

Mar 31, 2026
Via arXiv