Amanmeet Garg

Augment the Pairs: Semantics-Preserving Image-Caption Pair Augmentation for Grounding-Based Vision and Language Models

Nov 05, 2023
Via arXiv

Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment

Jul 24, 2023
Via arXiv

PodSumm -- Podcast Audio Summarization

Sep 22, 2020
Via arXiv