Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

Nov 02, 2023
Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell
