Minlie Huang

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

May 28, 2026
Via arXiv

You Live More Than Once: Towards Hierarchical Skill Meta-Evolving

May 27, 2026
Via arXiv

EVA: Editing for Versatile Alignment against Jailbreaks

May 14, 2026
Via arXiv

HoWToBench: Holistic Evaluation for LLM's Capability in Human-level Writing using Tree of Writing

Apr 21, 2026
Via arXiv

LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety

Apr 13, 2026
Via arXiv

SMSP: A Plug-and-Play Strategy of Multi-Scale Perception for MLLMs to Perceive Visual Illusions

Mar 24, 2026
Via arXiv

IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation

Mar 05, 2026
Via arXiv

Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure

Mar 05, 2026
Via arXiv

RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis

Feb 28, 2026
Via arXiv

RLAR: An Agentic Reward System for Multi-task Reinforcement Learning on Large Language Models

Feb 28, 2026
Via arXiv