Shaohan Huang

SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks

Apr 10, 2026
Via arXiv

Universal YOCO for Efficient Depth Scaling

Apr 01, 2026
Via arXiv

Online Experiential Learning for Language Models

Mar 17, 2026
Via arXiv

Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models

Mar 08, 2026
Via arXiv

Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems

Mar 08, 2026
Via arXiv

SlideSparse: Fast and Flexible (2N-2):2N Structured Sparsity

Mar 05, 2026
Via arXiv

Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity

Mar 05, 2026
Via arXiv

On-Policy Context Distillation for Language Models

Feb 12, 2026
Via arXiv

VIBEVOICE-ASR Technical Report

Jan 26, 2026
Via arXiv

LLM-in-Sandbox Elicits General Agentic Intelligence

Jan 22, 2026
Via arXiv