Si-Woo Kim

SAIL: Similarity-Aware Guidance and Inter-Caption Augmentation-based Learning for Weakly-Supervised Dense Video Captioning

Mar 05, 2026
Via arXiv

Sali4Vid: Saliency-Aware Video Reweighting and Adaptive Caption Retrieval for Dense Video Captioning

Sep 04, 2025
Via arXiv

SynC: Synthetic Image Caption Dataset Refinement with One-to-many Mapping for Zero-shot Image Captioning

Jul 24, 2025
Via arXiv

SIDA: Synthetic Image Driven Zero-shot Domain Adaptation

Jul 24, 2025
Via arXiv

ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning

Dec 26, 2024
Via arXiv

IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning

Sep 26, 2024
Via arXiv