Picture for Yuancheng Xu

Yuancheng Xu

GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment

Add code
Oct 10, 2024
Figure 1 for GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment
Figure 2 for GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment
Figure 3 for GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment
Figure 4 for GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment
Viaarxiv icon

SAFLEX: Self-Adaptive Augmentation via Feature Label Extrapolation

Add code
Oct 03, 2024
Viaarxiv icon

Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models

Add code
Sep 01, 2024
Viaarxiv icon

Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?

Add code
Jul 24, 2024
Viaarxiv icon

Calibrated Dataset Condensation for Faster Hyperparameter Search

Add code
May 27, 2024
Viaarxiv icon

Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models

Add code
Feb 05, 2024
Viaarxiv icon

Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences

Add code
Jan 25, 2024
Viaarxiv icon

Benchmarking the Robustness of Image Watermarks

Add code
Jan 22, 2024
Viaarxiv icon

C-Disentanglement: Discovering Causally-Independent Generative Factors under an Inductive Bias of Confounder

Add code
Oct 26, 2023
Viaarxiv icon

Equal Long-term Benefit Rate: Adapting Static Fairness Notions to Sequential Decision Making

Add code
Sep 07, 2023
Viaarxiv icon