Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

Modular Hybrid Autoregressive Transducer


Oct 31, 2022
Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno

Add code

* 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar 
* 8 pages, 1 figure, SLT 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech


Oct 27, 2022
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

Add code

* Submitted to ICASSP 2023 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

G-Augment: Searching For The Meta-Structure Of Data Augmentation Policies For ASR


Oct 19, 2022
Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park

Add code

* 6 pages, accepted at SLT 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR


Oct 18, 2022
Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro Moreno, Nanxin Chen

Add code

* Accepted by SLT 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Non-Parallel Voice Conversion for ASR Augmentation


Sep 15, 2022
Gary Wang, Andrew Rosenberg, Bhuvana Ramabhadran, Fadi Biadsy, Yinghui Huang, Jesse Emond, Pedro Moreno Mengibar

Add code

* Accepted by Interspeech 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition


Sep 13, 2022
Kartik Audhkhasi, Yinghui Huang, Bhuvana Ramabhadran, Pedro J. Moreno

Add code

* Accepted for publication in Interspeech 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data


May 16, 2022
Alëna Aksënova, Zhehuai Chen, Chung-Cheng Chiu, Daan van Esch, Pavel Golik, Wei Han, Levi King, Bhuvana Ramabhadran, Andrew Rosenberg, Suzan Schwartz, Gary Wang

Add code

* 5 pages, 3 tables 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Improving Rare Word Recognition with LM-aware MWER Training


Apr 15, 2022
Weiran Wang, Tongzhou Chen, Tara N. Sainath, Ehsan Variani, Rohit Prabhavalkar, Ronny Huang, Bhuvana Ramabhadran, Neeraj Gaur, Sepand Mavandadi, Cal Peyser, Trevor Strohman, Yanzhang He, David Rybach

Add code

* In submission to INTERSPEECH 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

MAESTRO: Matched Speech Text Representations through Modality Matching


Apr 07, 2022
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro Moreno, Ankur Bapna, Heiga Zen

Add code

* Submitted to Interspeech 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Ask2Mask: Guided Data Selection for Masked Speech Modeling


Feb 24, 2022
Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro Moreno

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email
1
2
3
4
>>