Ruqi Zhang

Reward-Shifted Speculative Sampling Is An Efficient Test-Time Weak-to-Strong Aligner

Aug 20, 2025

ViLaD: A Large Vision Language Diffusion Framework for End-to-End Autonomous Driving

Aug 18, 2025

Stacey: Promoting Stochastic Steepest Descent via Accelerated $\ell_p$-Smooth Nonconvex Optimization

Jun 07, 2025

Inference Acceleration of Autoregressive Normalizing Flows by Selective Jacobi Decoding

May 30, 2025

Sherlock: Self-Correcting Reasoning in Vision-Language Models

May 28, 2025

Entropy-Guided Sampling of Flat Modes in Discrete Spaces

May 05, 2025

Energy-Based Reward Models for Robust Language Model Alignment

Apr 17, 2025

More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment

Apr 03, 2025

Reheated Gradient-based Discrete Sampling for Combinatorial Optimization

Mar 06, 2025

Optimal Stochastic Trace Estimation in Generative Modeling

Feb 26, 2025