Zeliang Zhang

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

Oct 06, 2025
Via arXiv

AdvEvo-MARL: Shaping Internalized Safety through Adversarial Co-Evolution in Multi-Agent Reinforcement Learning

Oct 02, 2025

MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness

May 26, 2025

The Sword of Damocles in ViTs: Computational Redundancy Amplifies Adversarial Transferability

Apr 15, 2025

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

Apr 09, 2025

Forward Learning with Differential Privacy

Apr 01, 2025

Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives

Feb 19, 2025

Generative AI for Cel-Animation: A Survey

Jan 08, 2025

VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

Nov 19, 2024

Will the Inclusion of Generated Data Amplify Bias Across Generations in Future Image Classification Models?

Oct 14, 2024