David Lindner

Evaluating and Understanding Scheming Propensity in LLM Agents

Mar 02, 2026

Frontier Models Can Take Actions at Low Probabilities

Mar 02, 2026

Stress-Testing Alignment Audits With Prompt-Level Strategic Deception

Feb 09, 2026

Practical challenges of control monitoring in frontier AI deployments

Dec 15, 2025

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

Jul 15, 2025

Early Signs of Steganographic Capabilities in Frontier LLMs

Jul 03, 2025

Evaluating Frontier Models for Stealth and Situational Awareness

May 02, 2025

An Approach to Technical AGI Safety and Security

Apr 02, 2025

MISR: Measuring Instrumental Self-Reasoning in Frontier Models

Dec 05, 2024

ViSTa Dataset: Do vision-language models understand sequential tasks?

Nov 21, 2024