Xiaoyong Zhu

QuadSentinel: Sequent Safety for Machine-Checkable Control in Multi-agent Systems

Dec 18, 2025
Via arXiv

Enabling Agents to Communicate Entirely in Latent Space

Nov 12, 2025
Via arXiv

REVISION: Reflective Intent Mining and Online Reasoning Auxiliary for E-commerce Visual Search System Optimization

Oct 26, 2025
Via arXiv

MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models

Jun 24, 2025
Via arXiv

USB: A Comprehensive and Unified Safety Evaluation Benchmark for Multimodal Large Language Models

May 26, 2025
Via arXiv

Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models

May 26, 2025
Via arXiv

GeoSense: Evaluating Identification and Application of Geometric Principles in Multimodal Reasoning

Apr 17, 2025
Via arXiv

MMKB-RAG: A Multi-Modal Knowledge-Based Retrieval-Augmented Generation Framework

Apr 15, 2025
Via arXiv

HiddenDetect: Detecting Jailbreak Attacks against Large Vision-Language Models via Monitoring Hidden States

Feb 21, 2025
Via arXiv

ChineseSimpleVQA -- "See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models

Feb 19, 2025
Via arXiv