Suha Kwak

Improving Sound Source Localization with Joint Slot Attention on Image and Audio

Apr 21, 2025

DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation

Apr 07, 2025

Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval

Apr 03, 2025

GENIUS: A Generative Framework for Universal Multimodal Search

Mar 25, 2025

Enhancing Cost Efficiency in Active Learning with Candidate Set Query

Feb 10, 2025

Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens

Jan 13, 2025

Improving Text-based Person Search via Part-level Cross-modal Correspondence

Dec 31, 2024

ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation

Dec 05, 2024

Bootstrapping Top-down Information for Self-modulating Slot Attention

Nov 04, 2024

Improving Robustness to Multiple Spurious Correlations by Multi-Objective Optimization

Sep 05, 2024