Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation


Mar 14, 2023
Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin J. Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro

Add code

* Presentation accepted at ICASSP 2023 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator


Feb 27, 2023
Vladimir Bataev, Roman Korostik, Evgeny Shabalin, Vitaly Lavrukhin, Boris Ginsburg

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations


Feb 16, 2023
Shehzeen Hussain, Paarth Neekhara, Jocelyn Huang, Jason Li, Boris Ginsburg

Add code

* Published as a conference paper at ICASSP 2023 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition


Dec 16, 2022
Aleksandr Laptev, Boris Ginsburg

Add code

* To appear in Proc. SLT 2022, Jan 09-12, 2023, Doha, Qatar. 8 pages, 4 figures, 4 tables 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models


Nov 09, 2022
Travis M. Bartley, Fei Jia, Krishna C. Puvvada, Samuel Kriman, Boris Ginsburg

Add code

* Submitted to ICASSP 2023 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Multi-blank Transducers for Speech Recognition


Nov 04, 2022
Hainan Xu, Fei Jia, Somshubra Majumdar, Shinji Watanabe, Boris Ginsburg

Add code

* Submitted to ICASSP 2023 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers


Nov 01, 2022
Cheng-Ping Hsieh, Subhankar Ghosh, Boris Ginsburg

Add code

* Submitted to ICASSP 2023 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

AmberNet: A Compact End-to-End Model for Spoken Language Identification


Oct 27, 2022
Fei Jia, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg

Add code

* Submitted to ICASSP 2023 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition


Oct 06, 2022
Somshubra Majumdar, Shantanu Acharya, Vitaly Lavrukhin, Boris Ginsburg

Add code

* To appear in Proc. SLT 2022, Jan 09-12, 2023, Doha, Qatar 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Thutmose Tagger: Single-pass neural model for Inverse Text Normalization


Jul 29, 2022
Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email
1
2
3
4
>>