Tonghe Zhang

BFM-Zero: A Promptable Behavioral Foundation Model for Humanoid Control Using Unsupervised Reinforcement Learning

Nov 06, 2025
Via arXiv

$π_\texttt{RL}$: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Oct 29, 2025
Via arXiv

ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

May 29, 2025
Via arXiv

Think on your feet: Seamless Transition between Human-like Locomotion in Response to Changing Commands

Feb 26, 2025
Via arXiv

Provably Efficient Partially Observable Risk-Sensitive Reinforcement Learning with Hindsight Observation

Feb 28, 2024
Via arXiv