Lei Feng

Southeast University

Towards Safer Large Reasoning Models by Promoting Safety Decision-Making before Chain-of-Thought Generation

Mar 18, 2026
Via arXiv

Variational Rectification Inference for Learning with Noisy Labels

Mar 18, 2026
Via arXiv

Test-Time Attention Purification for Backdoored Large Vision Language Models

Mar 13, 2026
Via arXiv

FastBUS: A Fast Bayesian Framework for Unified Weakly-Supervised Learning

Feb 28, 2026
Via arXiv

Hierarchy-of-Groups Policy Optimization for Long-Horizon Agentic Tasks

Feb 26, 2026
Via arXiv

When More Experts Hurt: Underfitting in Multi-Expert Learning to Defer

Feb 19, 2026
Via arXiv

Phase-Aware Mixture of Experts for Agentic Reinforcement Learning

Feb 19, 2026
Via arXiv

PhGPO: Pheromone-Guided Policy Optimization for Long-Horizon Tool Planning

Feb 14, 2026
Via arXiv

Mitigating Mismatch within Reference-based Preference Optimization

Feb 12, 2026
Via arXiv

Online Causal Kalman Filtering for Stable and Effective Policy Optimization

Feb 11, 2026
Via arXiv