Haon Park

Michael Pokorny

Jailbreaking on Text-to-Video Models via Scene Splitting Strategy

Sep 26, 2025
Via arXiv

X-Teaming Evolutionary M2S: Automated Discovery of Multi-turn to Single-turn Jailbreak Templates

Sep 10, 2025
Via arXiv

ObjexMT: Objective Extraction and Metacognitive Calibration for LLM-as-a-Judge under Multi-Turn Jailbreaks

Aug 23, 2025
Via arXiv

Eliciting and Analyzing Emergent Misalignment in State-of-the-Art Large Language Models

Aug 06, 2025
Via arXiv

sudo rm -rf agentic_security

Mar 26, 2025
Via arXiv

One-Shot is Enough: Consolidating Multi-Turn Attacks into Efficient Single-Turn Prompts for LLMs

Mar 06, 2025
Via arXiv

ELITE: Enhanced Language-Image Toxicity Evaluation for Safety

Feb 10, 2025
Via arXiv

Humanity's Last Exam

Jan 24, 2025
Via arXiv