Picture for Yu-An Chung

Yu-An Chung

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

Add code
Mar 17, 2026
Viaarxiv icon

SpidR: Learning Fast and Stable Linguistic Units for Spoken Language Models Without Supervision

Add code
Dec 23, 2025
Viaarxiv icon

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Add code
Nov 12, 2025
Figure 1 for Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
Figure 2 for Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
Figure 3 for Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
Figure 4 for Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
Viaarxiv icon

DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models

Add code
Oct 31, 2024
Viaarxiv icon

Seamless: Multilingual Expressive and Streaming Speech Translation

Add code
Dec 08, 2023
Figure 1 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 2 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 3 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 4 for Seamless: Multilingual Expressive and Streaming Speech Translation
Viaarxiv icon

CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders

Add code
Sep 14, 2023
Figure 1 for CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders
Figure 2 for CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders
Figure 3 for CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders
Figure 4 for CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders
Viaarxiv icon

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

Add code
Aug 23, 2023
Figure 1 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 2 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 3 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 4 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Viaarxiv icon

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

Add code
Dec 15, 2022
Viaarxiv icon

Speech-to-Speech Translation For A Real-world Unwritten Language

Add code
Nov 11, 2022
Figure 1 for Speech-to-Speech Translation For A Real-world Unwritten Language
Figure 2 for Speech-to-Speech Translation For A Real-world Unwritten Language
Figure 3 for Speech-to-Speech Translation For A Real-world Unwritten Language
Figure 4 for Speech-to-Speech Translation For A Real-world Unwritten Language
Viaarxiv icon

SSAST: Self-Supervised Audio Spectrogram Transformer

Add code
Oct 19, 2021
Figure 1 for SSAST: Self-Supervised Audio Spectrogram Transformer
Figure 2 for SSAST: Self-Supervised Audio Spectrogram Transformer
Figure 3 for SSAST: Self-Supervised Audio Spectrogram Transformer
Figure 4 for SSAST: Self-Supervised Audio Spectrogram Transformer
Viaarxiv icon