Bolin Ding

E2E-REME: Towards End-to-End Microservices Auto-Remediation via Experience-Simulation Reinforcement Fine-Tuning

Apr 13, 2026
Via arXiv

Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs

Mar 23, 2026
Via arXiv

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

Mar 23, 2026
Via arXiv

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Mar 20, 2026
Via arXiv

SeeUPO: Sequence-Level Agentic-RL with Convergence Guarantees

Feb 06, 2026
Via arXiv

Accurate Table Question Answering with Accessible LLMs

Jan 06, 2026
Via arXiv

Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution

Dec 11, 2025
Via arXiv

d-TreeRPO: Towards More Reliable Policy Optimization for Diffusion Language Models

Dec 10, 2025
Via arXiv

AgentEvolver: Towards Efficient Self-Evolving Agent System

Nov 13, 2025
Via arXiv

BOTS: A Unified Framework for Bayesian Online Task Selection in LLM Reinforcement Finetuning

Oct 30, 2025
Via arXiv