Hangoo Kang

TRAP: Targeted Redirecting of Agentic Preferences

May 29, 2025
Via arXiv

Learning a Pessimistic Reward Model in RLHF

May 26, 2025
Via arXiv

Stochastic Monkeys at Play: Random Augmentations Cheaply Break LLM Safety Alignment

Nov 05, 2024
Via arXiv

Improving LLM Code Generation with Grammar Augmentation

Mar 03, 2024
Via arXiv