Fuxiang Zhang

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

Jul 02, 2025

Skywork Open Reasoner 1 Technical Report

May 29, 2025

Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models

Jul 04, 2024

Q-Adapter: Training Your LLM Adapter as a Residual Q-Function

Jul 04, 2024

Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation

Mar 12, 2024

Policy Regularization with Dataset Constraint for Offline Reinforcement Learning

Jun 11, 2023

Multi-agent Continual Coordination via Progressive Task Contextualization

May 07, 2023