Michael Schmatz

Improving Methodologies for Agentic Evaluations Across Domains: Leakage of Sensitive Information, Fraud and Cybersecurity Threats

Jan 22, 2026
Via arXiv

RepliBench: Evaluating the autonomous replication capabilities of language model agents

Apr 21, 2025
Via arXiv