Picture for Yige Yuan

Yige Yuan

Incentivizing Strong Reasoning from Weak Supervision

Add code
May 28, 2025
Viaarxiv icon

Inference-time Alignment in Continuous Space

Add code
May 26, 2025
Viaarxiv icon

Incentivizing Reasoning from Weak Supervision

Add code
May 26, 2025
Viaarxiv icon

InfoNCE is a Free Lunch for Semantically guided Graph Contrastive Learning

Add code
May 07, 2025
Viaarxiv icon

On a Connection Between Imitation Learning and RLHF

Add code
Mar 07, 2025
Viaarxiv icon

MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing

Add code
Feb 28, 2025
Viaarxiv icon

SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters

Add code
Feb 04, 2025
Figure 1 for SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Figure 2 for SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Figure 3 for SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Figure 4 for SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Viaarxiv icon

Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment

Add code
Dec 19, 2024
Viaarxiv icon

Fact-Level Confidence Calibration and Self-Correction

Add code
Nov 20, 2024
Figure 1 for Fact-Level Confidence Calibration and Self-Correction
Figure 2 for Fact-Level Confidence Calibration and Self-Correction
Figure 3 for Fact-Level Confidence Calibration and Self-Correction
Figure 4 for Fact-Level Confidence Calibration and Self-Correction
Viaarxiv icon

How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective

Add code
Oct 14, 2024
Viaarxiv icon