Tiantian Fan

Virtual Width Networks

Nov 17, 2025

Truncated Proximal Policy Optimization

Jun 18, 2025

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Mar 18, 2025

What's Behind PPO's Collapse in Long-CoT? Value Optimization Holds the Secret

Mar 03, 2025

Knowing Where and What: Unified Word Block Pretraining for Document Understanding

Jul 29, 2022