Binbin Zheng

SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting

Apr 12, 2026
Via arXiv

From $\boldsymbol{\log \pi}$ to $\boldsymbol{\pi}$: Taming Divergence in Soft Clipping via Bilateral Decoupled Decay of Probability Gradient Weight

Mar 15, 2026
Via arXiv

MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sample-Efficient LLM Reasoning

Feb 19, 2026
Via arXiv

Maximizing Local Entropy Where It Matters: Prefix-Aware Localized LLM Unlearning

Jan 06, 2026
Via arXiv