Chenliang Xu

The Sword of Damocles in ViTs: Computational Redundancy Amplifies Adversarial Transferability

Apr 15, 2025

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

Apr 09, 2025

Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1)

Apr 04, 2025

FreSca: Unveiling the Scaling Space in Diffusion Models

Apr 02, 2025

Forward Learning with Differential Privacy

Apr 01, 2025

VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity

Mar 14, 2025

Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives

Feb 19, 2025

GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling

Jan 31, 2025

Generative AI for Cel-Animation: A Survey

Jan 08, 2025

Unveiling Visual Perception in Language Models: An Attention Head Analysis Approach

Dec 24, 2024