Picture for Xiaoye Qu

Xiaoye Qu

Characterizing, Evaluating, and Optimizing Complex Reasoning

Add code
Feb 09, 2026
Viaarxiv icon

New Skills or Sharper Primitives? A Probabilistic Perspective on the Emergence of Reasoning in RLVR

Add code
Feb 09, 2026
Viaarxiv icon

LatentMem: Customizing Latent Memory for Multi-Agent Systems

Add code
Feb 03, 2026
Viaarxiv icon

Learning to Reason Faithfully through Step-Level Faithfulness Maximization

Add code
Feb 03, 2026
Viaarxiv icon

Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

Add code
Jan 23, 2026
Viaarxiv icon

Toward Efficient Agents: Memory, Tool learning, and Planning

Add code
Jan 20, 2026
Viaarxiv icon

DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Add code
Dec 30, 2025
Viaarxiv icon

VA-$π$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation

Add code
Dec 22, 2025
Viaarxiv icon

VideoSSR: Video Self-Supervised Reinforcement Learning

Add code
Nov 09, 2025
Viaarxiv icon

ExGRPO: Learning to Reason from Experience

Add code
Oct 02, 2025
Figure 1 for ExGRPO: Learning to Reason from Experience
Figure 2 for ExGRPO: Learning to Reason from Experience
Figure 3 for ExGRPO: Learning to Reason from Experience
Figure 4 for ExGRPO: Learning to Reason from Experience
Viaarxiv icon