Lin Yan

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Apr 08, 2025
Via arXiv

A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization

Apr 07, 2025
Via arXiv

Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

Mar 31, 2025
Via arXiv

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Mar 18, 2025
Via arXiv

What's Behind PPO's Collapse in Long-CoT? Value Optimization Holds the Secret

Mar 03, 2025
Via arXiv

Flaming-hot Initiation with Regular Execution Sampling for Large Language Models

Oct 28, 2024
Via arXiv

Process Supervision-Guided Policy Optimization for Code Generation

Oct 23, 2024
Via arXiv

Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization

Oct 11, 2024
Via arXiv

TROPHY: A Topologically Robust Physics-Informed Tracking Framework for Tropical Cyclones

Jul 28, 2023
Via arXiv

Multilevel Robustness for 2D Vector Field Feature Tracking, Selection, and Comparison

Sep 19, 2022
Via arXiv