GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis

Feb 21, 2024
Yueqi Xie, Minghong Fang, Renjie Pi, Neil Gong
