Kai Xiong

DeepTool: Scaling Interleaved Deliberation in Tool-Integrated Reasoning via Process-Supervised Reinforcement Learning

May 28, 2026
Via arXiv

X-Imitator: Spatial-Aware Imitation Learning via Bidirectional Action-Pose Interaction

May 12, 2026
Via arXiv

NEX: Neuron Explore-Exploit Scoring for Label-Free Chain-of-Thought Selection and Model Ranking

Feb 05, 2026
Via arXiv

ARM: Role-Conditioned Neuron Transplantation for Training-Free Generalist LLM Agent Merging

Jan 12, 2026
Via arXiv

Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration

Jan 12, 2026
Via arXiv

MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization

Jan 12, 2026
Via arXiv

Do LLMs Signal When They're Right? Evidence from Neuron Agreement

Oct 30, 2025
Via arXiv

PuzzleClone: An SMT-Powered Framework for Synthesizing Verifiable Data

Aug 21, 2025
Via arXiv

Com$^2$: A Causal-Guided Benchmark for Exploring Complex Commonsense Reasoning in Large Language Models

Jun 08, 2025
Via arXiv

Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning

May 27, 2025
Via arXiv