David Dobre

Learning diverse attacks on large language models for robust red-teaming and safety tuning

May 28, 2024
Via arXiv

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space

Feb 14, 2024
Via arXiv

In-Context Learning Can Re-learn Forbidden Tasks

Feb 08, 2024
Via arXiv

Adversarial Attacks and Defenses in Large Language Models: Old and New Threats

Oct 30, 2023
Via arXiv

Raising the Bar for Certified Adversarial Robustness with Diffusion Models

May 17, 2023
Via arXiv

Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features

Apr 23, 2023
Via arXiv

Dissecting adaptive methods in GANs

Oct 09, 2022
Via arXiv

Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise

Jun 02, 2022
Via arXiv