Yuandong Tian

Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training

Mar 12, 2026

AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications

Feb 26, 2026

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

STEM: Scaling Transformers with Embedding Modules

Jan 15, 2026

The Path Not Taken: RLVR Provably Learns Off the Principals

Nov 11, 2025

SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

Oct 10, 2025

Positional Encoding via Token-Aware Phase Attention

Sep 16, 2025

Language Self-Play For Data-Free Training

Sep 09, 2025

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

May 18, 2025

GaLore 2: Large-Scale LLM Pre-Training by Gradient Low-Rank Projection

Apr 29, 2025