Sam Toyer

A StrongREJECT for Empty Jailbreaks

Feb 15, 2024
Via arXiv

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

Nov 02, 2023
Via arXiv

imitation: Clean Imitation Learning Implementations

Nov 22, 2022
Via arXiv

An Empirical Investigation of Representation Learning for Imitation

May 16, 2022
Via arXiv

A Primer on Maximum Causal Entropy Inverse Reinforcement Learning

Mar 22, 2022
Via arXiv

DERAIL: Diagnostic Environments for Reward And Imitation Learning

Dec 02, 2020
Via arXiv

The MAGICAL Benchmark for Robust Imitation

Nov 01, 2020
Via arXiv

ASNets: Deep Learning for Generalised Planning

Aug 04, 2019
Via arXiv

Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow

Oct 01, 2018
Via arXiv

Action Schema Networks: Generalised Policies with Deep Learning

Dec 22, 2017
Via arXiv