An Zhang

Risky-Bench: Probing Agentic Safety Risks under Real-World Deployment

Feb 03, 2026

Self-Guard: Defending Large Reasoning Models via enhanced self-reflection

Jan 31, 2026

MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning

Jan 29, 2026

Understanding Multilingualism in Mixture-of-Experts LLMs: Routing Mechanism, Expert Specialization, and Layerwise Steering

Jan 20, 2026

Quantile Advantage Estimation for Entropy-Safe Reasoning

Sep 26, 2025

The Emergence of Abstract Thought in Large Language Models Beyond Any Language

Jun 11, 2025

On Reasoning Strength Planning in Large Reasoning Models

Jun 10, 2025

RSafe: Incentivizing proactive reasoning to build robust and adaptive LLM safeguards

Jun 09, 2025

AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint

Jun 08, 2025

AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems

May 26, 2025