Chunliang Zhang

Probing Preference Representations: A Multi-Dimensional Evaluation and Analysis Method for Reward Models

Nov 16, 2025

MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization

Oct 24, 2025

HEAL: A Hypothesis-Based Preference-Aware Analysis Framework

Aug 27, 2025

LaTeXTrans: Structured LaTeX Translation with Multi-Agent Coordination

Aug 26, 2025

GRAM: A Generative Foundation Reward Model for Reward Generalization

Jun 18, 2025

LRHP: Learning Representations for Human Preferences via Preference Pairs

Oct 06, 2024

RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data

Aug 22, 2024

Revisiting Interpolation Augmentation for Speech-to-Text Generation

Jun 22, 2024

Prior Constraints-based Reward Model Training for Aligning Large Language Models

Apr 01, 2024

Large Language Models are Parallel Multilingual Learners

Mar 14, 2024