Rusheb Shah

Imperial College London

Structured World Representations in Maze-Solving Transformers

Dec 05, 2023
Michael Igorevich Ivanitskiy, Alex F. Spies, Tilman Räuker, Guillaume Corlouer, Chris Mathwin, Lucia Quirke, Can Rager, Rusheb Shah, Dan Valentine, Cecilia Diniz Behn, Katsumi Inoue, Samy Wu Fung

Via arXiv

Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation

Nov 06, 2023
Rusheb Shah, Quentin Feuillade-Montixi, Soroush Pour, Arush Tagade, Stephen Casper, Javier Rando

Via arXiv

A Configurable Library for Generating and Manipulating Maze Datasets

Sep 19, 2023
Michael Igorevich Ivanitskiy, Rusheb Shah, Alex F. Spies, Tilman Räuker, Dan Valentine, Can Rager, Lucia Quirke, Chris Mathwin, Guillaume Corlouer, Cecilia Diniz Behn, Samy Wu Fung

Via arXiv