
Chao Yu

Hefei National Laboratory for Physical Sciences at Microscale and Department of Modern Physics, University of Science and Technology of China, Hefei, China; Shanghai Branch, CAS Center for Excellence in Quantum Information and Quantum Physics, University of Science and Technology of China, Shanghai, China; Shanghai Research Center for Quantum Sciences, Shanghai, China

Automatic Reward Shaping from Multi-Objective Human Heuristics

Dec 17, 2025

Context-Picker: Dynamic context selection using multi-stage reinforcement learning

Dec 16, 2025

$π_\texttt{RL}$: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Oct 29, 2025

RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation

Sep 19, 2025

H$^2$R: Hierarchical Hindsight Reflection for Multi-Task LLM Agents

Sep 16, 2025

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Aug 08, 2025

Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance

Jul 30, 2025

Exploring the Secondary Risks of Large Language Models

Jun 14, 2025

ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

May 29, 2025

What Can RL Bring to VLA Generalization? An Empirical Study

May 26, 2025