Holger Schwenk

NLLB Team

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

Mar 17, 2026
Via arXiv

Omnilingual MT: Machine Translation for 1,600 Languages

Mar 17, 2026
Via arXiv

Unified Vision-Language Modeling via Concept Space Alignment

Mar 01, 2026
Via arXiv

LCFO: Long Context and Long Form Output Dataset and Benchmarking

Dec 12, 2024
Via arXiv

Large Concept Models: Language Modeling in a Sentence Representation Space

Dec 11, 2024
Via arXiv

Seamless: Multilingual Expressive and Streaming Speech Translation

Dec 08, 2023
Via arXiv

Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer

Oct 05, 2023
Via arXiv

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

Aug 23, 2023
Via arXiv

SONAR: Sentence-Level Multimodal and Language-Agnostic Representations

Aug 23, 2023
Via arXiv

xSIM++: An Improved Proxy to Bitext Mining Performance for Low-Resource Languages

Jun 22, 2023
Via arXiv