Jiaxuan Gao

Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL

Aug 13, 2025

QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation

Jul 17, 2025

How Far Are We from Optimal Reasoning Efficiency?

Jun 08, 2025

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

May 30, 2025

Fine-tuning Diffusion Policies with Backpropagation Through Diffusion Timesteps

May 15, 2025

Industrial-Grade Sensor Simulation via Gaussian Splatting: A Modular Framework for Scalable Editing and Full-Stack Validation

Mar 14, 2025

Few-shot In-Context Preference Learning Using Large Language Models

Oct 22, 2024

On Designing Effective RL Reward at Training Time for LLM Reasoning

Oct 19, 2024

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Apr 16, 2024

LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination

Jan 09, 2024