Shane Legg

The Incentives that Shape Behaviour
Jan 20, 2020

Learning Human Objectives by Evaluating Hypothetical Behavior
Dec 05, 2019

Modeling AGI Safety Frameworks with Causal Influence Diagrams
Jun 20, 2019

Meta-learning of Sequential Strategies
May 08, 2019

Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings
Mar 12, 2019

Soft-Bayes: Prod for Mixtures of Experts with Log-Loss
Jan 08, 2019

Scaling shared model governance via model splitting
Dec 14, 2018

Scalable agent alignment via reward modeling: a research direction
Nov 19, 2018

Reward learning from human preferences and demonstrations in Atari
Nov 15, 2018

Modeling Friends and Foes
Jun 30, 2018