Ke Zeng

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Mar 17, 2026
Via arXiv

From $\boldsymbol{\log \pi}$ to $\boldsymbol{\pi}$: Taming Divergence in Soft Clipping via Bilateral Decoupled Decay of Probability Gradient Weight

Mar 15, 2026
Via arXiv

Harmonizing Dense and Sparse Signals in Multi-turn RL: Dual-Horizon Credit Assignment for Industrial Sales Agents

Mar 02, 2026
Via arXiv

Silo-Bench: A Scalable Environment for Evaluating Distributed Coordination in Multi-Agent LLM Systems

Mar 01, 2026
Via arXiv

How to Allocate, How to Learn? Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization

Feb 22, 2026
Via arXiv

MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sample-Efficient LLM Reasoning

Feb 19, 2026
Via arXiv

TRIP-Bench: A Benchmark for Long-Horizon Interactive Agents in Real-World Scenarios

Feb 02, 2026
Via arXiv

DIFFA-2: A Practical Diffusion Large Language Model for General Audio Understanding

Jan 30, 2026
Via arXiv

Reflecting Twice before Speaking with Empathy: Self-Reflective Alternating Inference for Empathy-Aware End-to-End Spoken Dialogue

Jan 26, 2026
Via arXiv

Attention-MoA: Enhancing Mixture-of-Agents via Inter-Agent Semantic Attention and Deep Residual Synthesis

Jan 23, 2026
Via arXiv