Zhiqing Sun

LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery

May 16, 2024
Via arXiv

Self-Play Preference Optimization for Language Model Alignment

May 01, 2024
Via arXiv

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

Apr 02, 2024
Via arXiv

Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision

Mar 14, 2024
Via arXiv

HaluEval-Wild: Evaluating Hallucinations of Language Models in the Wild

Mar 07, 2024
Via arXiv

Instruction-tuned Language Models are Better Knowledge Learners

Feb 20, 2024
Via arXiv

Improving Reinforcement Learning from Human Feedback with Efficient Reward Model Ensemble

Jan 30, 2024
Via arXiv

SALMON: Self-Alignment with Principle-Following Reward Models

Oct 09, 2023
Via arXiv

Aligning Large Multimodal Models with Factually Augmented RLHF

Sep 25, 2023
Via arXiv

Accelerating Diffusion-based Combinatorial Optimization Solvers by Progressive Distillation

Aug 22, 2023
Via arXiv