Xiaoye Qu

Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

Feb 12, 2026
Via arXiv

New Skills or Sharper Primitives? A Probabilistic Perspective on the Emergence of Reasoning in RLVR

Feb 09, 2026
Via arXiv

Characterizing, Evaluating, and Optimizing Complex Reasoning

Feb 09, 2026
Via arXiv

Learning to Reason Faithfully through Step-Level Faithfulness Maximization

Feb 03, 2026
Via arXiv

LatentMem: Customizing Latent Memory for Multi-Agent Systems

Feb 03, 2026
Via arXiv

Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

Jan 23, 2026
Via arXiv

Toward Efficient Agents: Memory, Tool learning, and Planning

Jan 20, 2026
Via arXiv

DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Dec 30, 2025
Via arXiv

VA-$π$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation

Dec 22, 2025
Via arXiv

VideoSSR: Video Self-Supervised Reinforcement Learning

Nov 09, 2025
Via arXiv