Pin-Yu Chen

Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness

Jun 28, 2024
Via arXiv

Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis

Jun 24, 2024
Via arXiv

The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models

Jun 14, 2024
Via arXiv

PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection

Jun 09, 2024
Via arXiv

A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques

Jun 07, 2024
Via arXiv

What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding

Jun 04, 2024
Via arXiv

RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection

May 30, 2024
Via arXiv

AI Risk Management Should Incorporate Both Safety and Security

May 29, 2024
Via arXiv

Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models

May 28, 2024
Via arXiv

A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts

May 28, 2024
Via arXiv