Xuekai Zhu

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Sep 11, 2025

A Survey of Reinforcement Learning for Large Reasoning Models

Sep 10, 2025

Towards a Unified View of Large Language Model Post-Training

Sep 04, 2025

Reasoning with Exploration: An Entropy Perspective

Jun 17, 2025

DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving

May 22, 2025

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

May 19, 2025

TTRL: Test-Time Reinforcement Learning

Apr 22, 2025

Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models

Mar 14, 2025

MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding

Jan 30, 2025

How to Synthesize Text Data without Model Collapse?

Dec 19, 2024