Jiawei Xu

From Ellipsoids to Midair Control of Dynamic Hitches

Feb 08, 2026

The Optimal Token Baseline: Variance Reduction for Long-Horizon LLM-RL

Feb 06, 2026

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

Feb 05, 2026

Beyond Precision: Training-Inference Mismatch is an Optimization Problem and Simple LR Scheduling Fixes It

Feb 02, 2026

Rethinking the Value of Multi-Agent Workflow: A Strong Single Agent Baseline

Jan 18, 2026

Trust Region Masking for Long-Horizon LLM Reinforcement Learning

Dec 28, 2025

Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning

Dec 28, 2025

Towards Effective Model Editing for LLM Personalization

Dec 15, 2025

LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence

Sep 03, 2025

Neural Network Training via Stochastic Alternating Minimization with Trainable Step Sizes

Aug 06, 2025