Huayu Chen

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

May 28, 2025

Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

May 23, 2025

Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Mar 18, 2025

Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

Mar 03, 2025

Exploratory Diffusion Policy for Unsupervised Reinforcement Learning

Feb 11, 2025

Process Reinforcement through Implicit Rewards

Feb 03, 2025

Visual Generation Without Guidance

Jan 26, 2025

Free Process Rewards without Process Labels

Dec 02, 2024

Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment

Oct 12, 2024

RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation

Oct 10, 2024