Juan Pino

SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation

Dec 24, 2025

SpidR: Learning Fast and Stable Linguistic Units for Spoken Language Models Without Supervision

Dec 23, 2025

LongTail-Swap: benchmarking language models' abilities on rare words

Oct 05, 2025

XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

Mar 21, 2024

SpiRit-LM: Interleaved Spoken and Written Language Model

Feb 08, 2024

Seamless: Multilingual Expressive and Streaming Speech Translation

Dec 08, 2023

SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

Aug 23, 2023

Multilingual Speech-to-Speech Translation into Multiple Target Languages

Jul 17, 2023

Exploration on HuBERT with Multiple Resolutions

Jun 22, 2023

Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks

May 04, 2023