Toby Shevlane

A Mechanism-Based Approach to Mitigating Harms from Persuasive Generative AI

Apr 23, 2024
Via arXiv

Evaluating Frontier Models for Dangerous Capabilities

Mar 20, 2024
Via arXiv

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

Model evaluation for extreme risks

May 24, 2023
Via arXiv

Structured access to AI capabilities: an emerging paradigm for safe AI deployment

Jan 13, 2022
Via arXiv

The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse?

Jan 09, 2020
Via arXiv