Yeyun Gong

From Patches to Trajectories: Privileged Process Supervision for Software-Engineering Agents

May 21, 2026
Via arXiv

Memory Grafting: Scaling Language Model Pre-training via Offline Conditional Memory

May 20, 2026
Via arXiv

m3BERT: A Modern, Multi-lingual, Matryoshka Bidirectional Encoder

May 19, 2026
Via arXiv

Improving Data and Reward Design for Scientific Reasoning in Large Language Models

Feb 09, 2026
Via arXiv

Pull Requests as a Training Signal for Repo-Level Code Editing

Feb 07, 2026
Via arXiv

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Feb 02, 2026
Via arXiv

Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability

Feb 02, 2026
Via arXiv

Sigma-MoE-Tiny Technical Report

Dec 19, 2025
Via arXiv

SIGMA: An AI-Empowered Training Stack on Early-Life Hardware

Dec 15, 2025
Via arXiv

Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

Oct 09, 2025
Via arXiv