Olivia Watkins

A StrongREJECT for Empty Jailbreaks

Feb 15, 2024

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

Nov 02, 2023

Learning to Model the World with Language

Jul 31, 2023

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

May 25, 2023

Aligning Text-to-Image Models using Human Feedback

Feb 23, 2023

Guiding Pretraining in Reinforcement Learning with Large Language Models

Feb 13, 2023

Teachable Reinforcement Learning via Advice Distillation

Mar 19, 2022

Explaining Reinforcement Learning Policies through Counterfactual Trajectories

Jan 29, 2022

Auto-Tuned Sim-to-Real Transfer

Apr 15, 2021

Hierarchical Text Generation using an Outline

Oct 20, 2018