Francis Rhys Ward

The Reasons that Agents Act: Intention and Instrumental Goals

Feb 15, 2024
Francis Rhys Ward, Matt MacDermott, Francesco Belardinelli, Francesca Toni, Tom Everitt

Via arXiv

Honesty Is the Best Policy: Defining and Mitigating AI Deception

Dec 03, 2023
Francis Rhys Ward, Francesco Belardinelli, Francesca Toni, Tom Everitt

Via arXiv

Experiments with Detecting and Mitigating AI Deception

Jun 26, 2023
Ismail Sahbane, Francis Rhys Ward, C Henrik Åslund

Via arXiv

Argumentative Reward Learning: Reasoning About Human Preferences

Sep 28, 2022
Francis Rhys Ward, Francesco Belardinelli, Francesca Toni

Via arXiv