Xiaofei Zhou

VisualActBench: Can VLMs See and Act like a Human?

Dec 10, 2025
Via arXiv

NP-LoRA: Null Space Projection Unifies Subject and Style in LoRA Fusion

Nov 14, 2025
Via arXiv

SAM-DAQ: Segment Anything Model with Depth-guided Adaptive Queries for RGB-D Video Salient Object Detection

Nov 13, 2025
Via arXiv

Divide-and-Conquer Decoupled Network for Cross-Domain Few-Shot Segmentation

Nov 11, 2025
Via arXiv

WXSOD: A Benchmark for Robust Salient Object Detection in Adverse Weather Conditions

Aug 17, 2025
Via arXiv

Identifying Signatures of Image Phenotypes to Track Treatment Response in Liver Disease

Jul 16, 2025
Via arXiv

Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1)

Apr 04, 2025
Via arXiv

Jailbreak Large Vision-Language Models Through Multi-Modal Linkage

Dec 03, 2024
Via arXiv

MINet: Multi-scale Interactive Network for Real-time Salient Object Detection of Strip Steel Surface Defects

May 25, 2024
Via arXiv

Quality-aware Selective Fusion Network for V-D-T Salient Object Detection

May 13, 2024
Via arXiv