
Ivan Evtimov

Jack

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

Safety Alignment of LMs via Non-cooperative Games

Dec 23, 2025

RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection

Oct 06, 2025

WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks

Apr 30, 2025

AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents

Mar 12, 2025

AdvPrefix: An Objective for Nuanced LLM Jailbreaks

Dec 13, 2024

Persistent Pre-Training Poisoning of LLMs

Oct 17, 2024

Gradient-based Jailbreak Images for Multimodal Fusion Models

Oct 04, 2024

Automated Red Teaming with GOAT: the Generative Offensive Agent Tester

Oct 02, 2024

The Llama 3 Herd of Models

Jul 31, 2024