Fanqi Wan

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

May 23, 2025

QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization

May 23, 2025

SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization

May 16, 2025

FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion

Apr 09, 2025

FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

Mar 06, 2025

Advantage-Guided Distillation for Preference Alignment in Small Language Models

Feb 25, 2025

Weighted-Reward Preference Optimization for Implicit Model Fusion

Dec 04, 2024

ProFuser: Progressive Fusion of Large Language Models

Aug 09, 2024

Self-Evolution Fine-Tuning for Policy Optimization

Jun 16, 2024

BlockPruner: Fine-grained Pruning for Large Language Models

Jun 15, 2024