Armin Mustafa

SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes

Jun 13, 2025
Via arXiv

PAL: Probing Audio Encoders via LLMs -- A Study of Information Transfer from Audio Encoders to LLMs

Jun 12, 2025
Via arXiv

Joint Reconstruction of Spatially-Coherent and Realistic Clothed Humans and Objects from a Single Image

Feb 25, 2025
Via arXiv

Deconstruct Complexity (DeComplex): A Novel Perspective on Tackling Dense Action Detection

Jan 30, 2025
Via arXiv

Efficient Audio-Visual Fusion for Video Classification

Nov 08, 2024
Via arXiv

Boosting Camera Motion Control for Video Diffusion Transformers

Oct 14, 2024
Via arXiv

RenDetNet: Weakly-supervised Shadow Detection with Shadow Caster Verification

Aug 30, 2024
Via arXiv

Attend-Fusion: Efficient Audio-Visual Fusion for Video Classification

Aug 26, 2024
Via arXiv

Single-image coherent reconstruction of objects and humans

Aug 15, 2024
Via arXiv

NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative

Jun 10, 2024
Via arXiv