Gunhee Kim

Think, Verbalize, then Speak: Bridging Complex Thoughts and Comprehensible Speech

Sep 19, 2025
Via arXiv

WoW-Bench: Evaluating Fine-Grained Acoustic Perception in Audio-Language Models via Marine Mammal Vocalizations

Aug 28, 2025
Via arXiv

Hybrid Deep Searcher: Integrating Parallel and Sequential Search Reasoning

Aug 26, 2025
Via arXiv

FedMeNF: Privacy-Preserving Federated Meta-Learning for Neural Fields

Aug 08, 2025
Via arXiv

Cognitive Chain-of-Thought: Structured Multimodal Reasoning about Social Situations

Jul 27, 2025
Via arXiv

ViSAGe: Video-to-Spatial Audio Generation

Jun 13, 2025
Via arXiv

HalLoc: Token-level Localization of Hallucinations for Vision Language Models

Jun 12, 2025
Via arXiv

Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates

May 28, 2025
Via arXiv

LPOI: Listwise Preference Optimization for Vision Language Models

May 27, 2025
Via arXiv

Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge

May 12, 2025
Via arXiv