Swapnil Bhosale

NEAR$^2$: A Nested Embedding Approach to Efficient Product Retrieval and Ranking

Jun 24, 2025

3D Audio-Visual Segmentation

Nov 04, 2024

Centrality-aware Product Retrieval and Ranking

Oct 21, 2024

AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis

Jun 14, 2024

Unsupervised Audio-Visual Segmentation with Modality Alignment

Mar 21, 2024

Sarcasm in Sight and Sound: Benchmarking and Expansion to Improve Multimodal Sarcasm Detection

Sep 29, 2023

Leveraging Foundation models for Unsupervised Audio-Visual Segmentation

Sep 13, 2023

DiffSED: Sound Event Detection with Denoising Diffusion

Aug 16, 2023

Text-to-Audio Grounding Based Novel Metric for Evaluating Audio Caption Similarity

Oct 03, 2022

Automatic Audio Captioning using Attention weighted Event based Embeddings

Jan 28, 2022