Yuandong Tian

SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

Oct 10, 2025

Positional Encoding via Token-Aware Phase Attention

Sep 16, 2025

Language Self-Play For Data-Free Training

Sep 09, 2025

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

May 18, 2025

GaLore 2: Large-Scale LLM Pre-Training by Gradient Low-Rank Projection

Apr 29, 2025

R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference

Apr 28, 2025

ParamΔ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

Apr 23, 2025

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Mar 19, 2025

NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions

Feb 18, 2025

Spectral Journey: How Transformers Predict the Shortest Path

Feb 12, 2025