Ankur Bapna

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Mar 02, 2023
Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara Sainath, Pedro Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu

Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models

Dec 19, 2022
Yong Cheng, Yu Zhang, Melvin Johnson, Wolfgang Macherey, Ankur Bapna

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

Oct 27, 2022
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR

Oct 18, 2022
Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro Moreno, Nanxin Chen

JOIST: A Joint Speech and Text Streaming Model For ASR

Oct 13, 2022
Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman

SQuId: Measuring Speech Naturalness in Many Languages

Oct 12, 2022
Thibault Sellam, Ankur Bapna, Joshua Camp, Diana Mackinnon, Ankur P. Parikh, Jason Riesa

FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech

May 25, 2022
Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna

Building Machine Translation Systems for the Next Thousand Languages

May 16, 2022
Ankur Bapna, Isaac Caswell, Julia Kreutzer, Orhan Firat, Daan van Esch, Aditya Siddhant, Mengmeng Niu, Pallavi Baljekar, Xavier Garcia, Wolfgang Macherey, Theresa Breiner, Vera Axelrod, Jason Riesa, Yuan Cao, Mia Xu Chen, Klaus Macherey, Maxim Krikun, Pidong Wang, Alexander Gutkin, Apurva Shah, Yanping Huang, Zhifeng Chen, Yonghui Wu, Macduff Hughes

XTREME-S: Evaluating Cross-lingual Speech Representations

Apr 13, 2022
Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan Van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson

MAESTRO: Matched Speech Text Representations through Modality Matching

Apr 07, 2022
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro Moreno, Ankur Bapna, Heiga Zen
