Joon Son Chung

Plug-and-Steer: Decoupling Separation and Selection in Audio-Visual Target Speaker Extraction

Mar 20, 2026

On the Nature of Attention Sink that Shapes Decoding Strategy in MLLMs

Mar 15, 2026

MamTra: A Hybrid Mamba-Transformer Backbone for Speech Synthesis

Mar 12, 2026

UNMIXX: Untangling Highly Correlated Singing Voices Mixtures

Jan 19, 2026

FastAV: Efficient Token Pruning for Audio-Visual Large Language Model Inference

Jan 19, 2026

LAMB: LLM-based Audio Captioning with Modality Gap Bridging via Cauchy-Schwarz Divergence

Jan 08, 2026

LP-CFM: Perceptual Invariance-Aware Conditional Flow Matching for Speech Modeling

Dec 23, 2025

TAVID: Text-Driven Audio-Visual Interactive Dialogue Generation

Dec 23, 2025

Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment

Dec 08, 2025

Dub-S2ST: Textless Speech-to-Speech Translation for Seamless Dubbing

May 27, 2025