Armin Mustafa

NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative

Jun 10, 2024
Via arXiv

CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing

May 17, 2024
Via arXiv

S3R-Net: A Single-Stage Approach to Self-Supervised Shadow Removal

Apr 18, 2024
Via arXiv

ViscoNet: Bridging and Harmonizing Visual and Textual Conditioning for ControlNet

Dec 05, 2023
Via arXiv

CAD -- Contextual Multi-modal Alignment for Dynamic AVQA

Oct 27, 2023
Via arXiv

PAT: Position-Aware Transformer for Dense Multi-Label Action Detection

Aug 09, 2023
Via arXiv

UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer

Apr 18, 2023
Via arXiv

SEM-POS: Grammatically and Semantically Correct Video Captioning

Apr 04, 2023
Via arXiv

Pose Guided Multi-person Image Generation From Text

Mar 09, 2022
Via arXiv

SILT: Self-supervised Lighting Transfer Using Implicit Image Decomposition

Oct 25, 2021
Via arXiv