Chengzhi Liu

Agents' Last Exam

Jun 03, 2026
Via arXiv

WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction

May 28, 2026
Via arXiv

Survive or Collapse: The Asymmetric Roles of Data Gating and Reward Grounding in Self-Play RL

May 21, 2026
Via arXiv

Auditing Agent Harness Safety

May 14, 2026
Via arXiv

SafePro: Evaluating the Safety of Professional-Level AI Agents

Jan 13, 2026
Via arXiv

Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space

Dec 17, 2025
Via arXiv

Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding

May 22, 2025
Via arXiv

Robust Multimodal Learning for Ophthalmic Disease Grading via Disentangled Representation

Mar 07, 2025
Via arXiv

The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1

Feb 18, 2025
Via arXiv

MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation

Feb 17, 2025
Via arXiv