Gauthier Gidel

Learning diverse attacks on large language models for robust red-teaming and safety tuning

May 28, 2024
Via arXiv

Efficient Adversarial Training in LLMs with Continuous Attacks

May 24, 2024
Via arXiv

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space

Feb 14, 2024
Via arXiv

Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

Feb 09, 2024
Via arXiv

In-Context Learning Can Re-learn Forbidden Tasks

Feb 08, 2024
Via arXiv

Adversarial Attacks and Defenses in Large Language Models: Old and New Threats

Oct 30, 2023
Via arXiv

Proving Linear Mode Connectivity of Neural Networks via Optimal Transport

Oct 29, 2023
Via arXiv

Expected flow networks in stochastic environments and two-player zero-sum games

Oct 04, 2023
Via arXiv

On the Stability of Iterative Retraining of Generative Models on their own Data

Oct 03, 2023
Via arXiv

High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise

Oct 03, 2023
Via arXiv