Jaylen Jones

When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use Agents

Feb 09, 2026
Via arXiv

When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents

Feb 09, 2026
Via arXiv

RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments

May 28, 2025
Via arXiv

AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts

Oct 29, 2024
Via arXiv

A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models

Feb 18, 2024
Via arXiv