Yuan Tian

Peter

Emergent Misalignment Can Be Induced by Sycophancy and Reversed via Alignment Gating

Jun 08, 2026
Via arXiv

RogueMerge: Robust and Unified Attacks against LLM Model Merging

Jun 02, 2026
Via arXiv

HiSE: A Lightweight Hierarchical Semantic Explainer for Heterogeneous Graph Neural Networks

Jun 02, 2026
Via arXiv

Enhancing Multi-Agent Communication through Attention Steering with Context Relevance

May 28, 2026
Via arXiv

When Think-with-Image Meets Safety: What Determines Multimodal Jailbreak Robustness?

May 27, 2026
Via arXiv

What-If World: A Causal Benchmark for General World Models in Embodied Scenarios

May 26, 2026
Via arXiv

Cesarean Scar Defect Segmentation in Transvaginal Ultrasound Images: a Dataset and Benchmark

May 26, 2026
Via arXiv

Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction

May 25, 2026
Via arXiv

HIDBench: Benchmarking Large Language Models for Host-Based Intrusion Detection

May 20, 2026
Via arXiv

No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills

May 13, 2026
Via arXiv