Tom Everitt

DeepMind

A Mechanism-Based Approach to Mitigating Harms from Persuasive Generative AI

Apr 23, 2024
Via arXiv

Robust agents learn causal world models

Feb 26, 2024
Via arXiv

The Reasons that Agents Act: Intention and Instrumental Goals

Feb 15, 2024
Via arXiv

Honesty Is the Best Policy: Defining and Mitigating AI Deception

Dec 03, 2023
Via arXiv

Characterising Decision Theories with Mechanised Causal Graphs

Jul 20, 2023
Via arXiv

Human Control: Definitions and Algorithms

May 31, 2023
Via arXiv

Reasoning about Causality in Games

Jan 05, 2023
Via arXiv

Discovering Agents

Aug 24, 2022
Via arXiv

Path-Specific Objectives for Safer Agent Incentives

Apr 21, 2022
Via arXiv

A Complete Criterion for Value of Information in Soluble Influence Diagrams

Feb 23, 2022
Via arXiv