Yulan Hu

Towards Reward Fairness in RLHF: From a Resource Allocation Perspective

May 29, 2025
Via arXiv

SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin

Feb 19, 2025
Via arXiv

Coarse-to-Fine Process Reward Modeling for Enhanced Mathematical Reasoning

Jan 23, 2025
Via arXiv

Video-Text Dataset Construction from Multi-AI Feedback: Promoting Weak-to-Strong Preference Learning for Video Large Language Models

Nov 25, 2024
Via arXiv

GUNDAM: Aligning Large Language Models with Graph Understanding

Sep 30, 2024
Via arXiv

TSO: Self-Training with Scaled Preference Optimization

Aug 31, 2024
Via arXiv

Preserving Node Distinctness in Graph Autoencoders via Similarity Distillation

Jun 25, 2024
Via arXiv

Towards Comprehensive Preference Data Collection for Reward Modeling

Jun 24, 2024
Via arXiv

Exploring Task Unification in Graph Representation Learning via Generative Approach

Mar 21, 2024
Via arXiv

VIGraph: Self-supervised Learning for Class-Imbalanced Node Classification

Nov 02, 2023
Via arXiv