Sophie Xhonneux

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space

Feb 14, 2024
Leo Schwinn, David Dobre, Sophie Xhonneux, Gauthier Gidel, Stephan Günnemann

Via arXiv

In-Context Learning Can Re-learn Forbidden Tasks

Feb 08, 2024
Sophie Xhonneux, David Dobre, Jian Tang, Gauthier Gidel, Dhanya Sridhar

Via arXiv