Bolin Ding

Connect the Dots: Training LLMs for Long-Lifecycle Agents with Cross-Domain Generalization Via Reinforcement Learning

Jun 18, 2026
Via arXiv

Beyond Domains: Reusing Web Skills via Transferable Interaction Patterns

Jun 16, 2026
Via arXiv

RLCSD: Reinforcement Learning with Contrastive On-Policy Self-Distillation

Jun 10, 2026
Via arXiv

AgentJet: A Flexible Swarm Training Framework for Agentic Reinforcement Learning

Jun 03, 2026
Via arXiv

Clipping Bottleneck: Stabilizing RLVR via Stochastic Recovery of Near-Boundary Signals

May 21, 2026
Via arXiv

E2E-REME: Towards End-to-End Microservices Auto-Remediation via Experience-Simulation Reinforcement Fine-Tuning

Apr 13, 2026
Via arXiv

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

Mar 23, 2026
Via arXiv

Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs

Mar 23, 2026
Via arXiv

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Mar 20, 2026
Via arXiv

SeeUPO: Sequence-Level Agentic-RL with Convergence Guarantees

Feb 06, 2026
Via arXiv