Maksym Andriushchenko

Saarland University

OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents

Jun 17, 2025
Via arXiv

Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors

Jun 12, 2025
Via arXiv

Capability-Based Scaling Laws for LLM Red-Teaming

May 26, 2025
Via arXiv

Exploring Memorization and Copyright Violation in Frontier LLMs: A Study of the New York Times v. OpenAI 2023 Lawsuit

Dec 09, 2024
Via arXiv

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

Oct 11, 2024
Via arXiv

Does Refusal Training in LLMs Generalize to the Past Tense?

Jul 16, 2024
Via arXiv

Improving Alignment and Robustness with Circuit Breakers

Jun 10, 2024
Via arXiv

Improving Alignment and Robustness with Short Circuiting

Jun 06, 2024
Via arXiv

Is In-Context Learning Sufficient for Instruction Following in LLMs?

May 30, 2024
Via arXiv

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs

Apr 22, 2024
Via arXiv