Jakob Foerster

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Apr 15, 2024
Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger

Via arXiv

Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection

Apr 10, 2024
Linas Nasvytis, Kai Sandbrink, Jakob Foerster, Tim Franzmeyer, Christian Schroeder de Witt

Via arXiv

Policy-Guided Diffusion

Apr 09, 2024
Matthew Thomas Jackson, Michael Tryfan Matthews, Cong Lu, Benjamin Ellis, Shimon Whiteson, Jakob Foerster

Via arXiv

JaxUED: A simple and useable UED library in Jax

Mar 19, 2024
Samuel Coward, Michael Beukman, Jakob Foerster

Via arXiv

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

Feb 26, 2024
Mikayel Samvelyan, Sharath Chandra Raparthy, Andrei Lupu, Eric Hambro, Aram H. Markosyan, Manish Bhatt, Yuning Mao, Minqi Jiang, Jack Parker-Holder, Jakob Foerster, Tim Rocktäschel, Roberta Raileanu

Via arXiv

Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning

Feb 26, 2024
Michael Matthews, Michael Beukman, Benjamin Ellis, Mikayel Samvelyan, Matthew Jackson, Samuel Coward, Jakob Foerster

Via arXiv

Refining Minimax Regret for Unsupervised Environment Design

Feb 19, 2024
Michael Beukman, Samuel Coward, Michael Matthews, Mattie Fellows, Minqi Jiang, Michael Dennis, Jakob Foerster

Via arXiv

Symmetry-Breaking Augmentations for Ad Hoc Teamwork

Feb 15, 2024
Ravi Hammond, Dustin Craggs, Mingyu Guo, Jakob Foerster, Ian Reid

Via arXiv

Revisiting Recurrent Reinforcement Learning with Memory Monoids

Feb 15, 2024
Steven Morad, Chris Lu, Ryan Kortvelesy, Stephan Liwicki, Jakob Foerster, Amanda Prorok

Via arXiv

Mixtures of Experts Unlock Parameter Scaling for Deep RL

Feb 13, 2024
Johan Obando-Ceron, Ghada Sokar, Timon Willi, Clare Lyle, Jesse Farebrother, Jakob Foerster, Gintare Karolina Dziugaite, Doina Precup, Pablo Samuel Castro

Via arXiv