Jinhua Zhu

Multi-Level Aware Preference Learning: Enhancing RLHF for Complex Multi-Instruction Tasks

May 19, 2025
Via arXiv

Bias Fitting to Mitigate Length Bias of Reward Model in RLHF

May 19, 2025
Via arXiv

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Mar 18, 2025
Via arXiv

Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms

Feb 05, 2025
Via arXiv

Disentangling Length Bias In Preference Learning Via Response-Conditioned Modeling

Feb 02, 2025
Via arXiv

BoolQuestions: Does Dense Retrieval Understand Boolean Logic in Language?

Nov 19, 2024
Via arXiv

Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning

Oct 22, 2024
Via arXiv

Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors

Jul 21, 2024
Via arXiv

3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization

Jun 09, 2024
Via arXiv

FABind+: Enhancing Molecular Docking through Improved Pocket Prediction and Pose Generation

Apr 07, 2024
Via arXiv