Yu Zhou

National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; Fanyu AI Laboratory, Zhongke Fanyu Technology Co., Ltd, Beijing, China

When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion?

Nov 13, 2025
Via arXiv

SUGAR: Learning Skeleton Representation with Visual-Motion Knowledge for Action Recognition

Nov 13, 2025
Via arXiv

MuSc-V2: Zero-Shot Multimodal Industrial Anomaly Classification and Segmentation with Mutual Scoring of Unlabeled Samples

Nov 13, 2025
Via arXiv

Task-Aware 3D Affordance Segmentation via 2D Guidance and Geometric Refinement

Nov 12, 2025
Via arXiv

An Active Learning Pipeline for Biomedical Image Instance Segmentation with Minimal Human Intervention

Nov 06, 2025
Via arXiv

LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training

Oct 16, 2025
Via arXiv

DialectGen: Benchmarking and Improving Dialect Robustness in Multimodal Generation

Oct 16, 2025
Via arXiv

Customizing Visual Emotion Evaluation for MLLMs: An Open-vocabulary, Multifaceted, and Scalable Approach

Sep 26, 2025
Via arXiv

A Correction for the Paper "Symplectic geometry mode decomposition and its application to rotating machinery compound fault diagnosis"

Aug 29, 2025
Via arXiv

PathMR: Multimodal Visual Reasoning for Interpretable Pathology Diagnosis

Aug 28, 2025
Via arXiv