
Tingshu Mou

ImageAttributionBench: How Far Are We from Generalizable Attribution?

May 13, 2026
Via arXiv

ViSRA: A Video-based Spatial Reasoning Agent for Multi-modal Large Language Models

May 11, 2026
Via arXiv

ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack

Aug 10, 2024
Via arXiv