Minlie Huang

Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure

Mar 05, 2026

IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation

Mar 05, 2026

RLAR: An Agentic Reward System for Multi-task Reinforcement Learning on Large Language Models

Feb 28, 2026

RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis

Feb 28, 2026

Grounding LLMs in Scientific Discovery via Embodied Actions

Feb 24, 2026

GLM-5: from Vibe Coding to Agentic Engineering

Feb 17, 2026

PatientHub: A Unified Framework for Patient Simulation

Feb 12, 2026

The Missing Half: Unveiling Training-time Implicit Safety Risks Beyond Deployment

Feb 04, 2026

PsychePass: Calibrating LLM Therapeutic Competence via Trajectory-Anchored Tournaments

Jan 28, 2026

The Side Effects of Being Smart: Safety Risks in MLLMs' Multi-Image Reasoning

Jan 20, 2026