Shouhong Ding

VISA: Group-wise Visual Token Selection and Aggregation via Graph Summarization for Efficient MLLMs Inference

Aug 25, 2025

MS-DETR: Towards Effective Video Moment Retrieval and Highlight Detection by Joint Motion-Semantic Learning

Jul 16, 2025

AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models

Jul 03, 2025

Guard Me If You Know Me: Protecting Specific Face-Identity from Deepfakes

May 26, 2025

Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable

May 20, 2025

Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception

Apr 29, 2025

All Patches Matter, More Patches Better: Enhance AI-Generated Image Detection via Panoptic Patch Learning

Apr 02, 2025

Data Synthesis with Diverse Styles for Face Recognition via 3DMM-Guided Diffusion

Apr 01, 2025

Exploring the Collaborative Advantage of Low-level Information on Generalizable AI-Generated Image Detection

Apr 01, 2025

ToVE: Efficient Vision-Language Learning via Knowledge Transfer from Vision Experts

Apr 01, 2025