Ankur Bapna

Dima

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Mar 03, 2023
Via arXiv

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

Mar 03, 2023
Via arXiv

Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models

Dec 19, 2022
Via arXiv

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

Oct 27, 2022
Via arXiv

Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR

Oct 18, 2022
Via arXiv

JOIST: A Joint Speech and Text Streaming Model For ASR

Oct 13, 2022
Via arXiv

SQuId: Measuring Speech Naturalness in Many Languages

Oct 12, 2022
Via arXiv

FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech

May 25, 2022
Via arXiv

Building Machine Translation Systems for the Next Thousand Languages

May 16, 2022
Via arXiv

XTREME-S: Evaluating Cross-lingual Speech Representations

Apr 13, 2022
Via arXiv