Yige Yuan

From Outcomes to Processes: Guiding PRM Learning from ORM for Inference-Time Alignment

Jun 14, 2025

Incentivizing Strong Reasoning from Weak Supervision

May 28, 2025

Inference-time Alignment in Continuous Space

May 26, 2025

Incentivizing Reasoning from Weak Supervision

May 26, 2025

InfoNCE is a Free Lunch for Semantically guided Graph Contrastive Learning

May 07, 2025

On a Connection Between Imitation Learning and RLHF

Mar 07, 2025

MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing

Feb 28, 2025

SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters

Feb 04, 2025

Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment

Dec 19, 2024

Fact-Level Confidence Calibration and Self-Correction

Nov 20, 2024