Geoffrey Irving

Practical challenges of control monitoring in frontier AI deployments

Dec 15, 2025

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

Jul 15, 2025

Avoiding Obfuscation with Prover-Estimator Debate

Jun 16, 2025

An alignment safety case sketch based on debate

May 08, 2025

How to evaluate control measures for LLM agents? A trajectory from today to superintelligence

Apr 07, 2025

A sketch of an AI control safety case

Jan 28, 2025

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

Scalable AI Safety via Doubly-Efficient Debate

Nov 23, 2023

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Jul 24, 2023

Accelerating Large Language Model Decoding with Speculative Sampling

Feb 02, 2023