Jiaxuan Gao

How Far Are We from Optimal Reasoning Efficiency?

Jun 08, 2025
Via arXiv

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

May 30, 2025
Via arXiv

Fine-tuning Diffusion Policies with Backpropagation Through Diffusion Timesteps

May 15, 2025
Via arXiv

Industrial-Grade Sensor Simulation via Gaussian Splatting: A Modular Framework for Scalable Editing and Full-Stack Validation

Mar 14, 2025
Via arXiv

Few-shot In-Context Preference Learning Using Large Language Models

Oct 22, 2024
Via arXiv

On Designing Effective RL Reward at Training Time for LLM Reasoning

Oct 19, 2024
Via arXiv

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Apr 16, 2024
Via arXiv

LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination

Jan 09, 2024
Via arXiv

Language-Guided Generation of Physically Realistic Robot Motion and Control

Jun 18, 2023
Via arXiv

Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased

Feb 03, 2023
Via arXiv