Shinji Watanabe

SUPERB: Speech processing Universal PERformance Benchmark


May 03, 2021
Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee

* Submitted to Interspeech 2021 


Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks


May 02, 2021
Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe

* NAACL 2021. All code and models are released as part of the ESPnet toolkit: https://github.com/espnet/espnet 


EAT: Enhanced ASR-TTS for Self-supervised Speech Recognition


Apr 13, 2021
Murali Karthick Baskar, Lukáš Burget, Shinji Watanabe, Ramon Fernandez Astudillo, Jan "Honza" Černocký



Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation


Apr 13, 2021
Hirofumi Inaguma, Tatsuya Kawahara, Shinji Watanabe

* Accepted at NAACL-HLT 2021 (short paper) 


SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition


Apr 06, 2021
Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko

* 5 pages, 1 figure. Submitted to INTERSPEECH 2021 


INTERSPEECH 2021 ConferencingSpeech Challenge: Towards Far-field Multi-Channel Speech Enhancement for Video Conferencing


Apr 02, 2021
Wei Rao, Yihui Fu, Yanxin Hu, Xin Xu, Yvkai Jv, Jiangyu Han, Zhongjie Jiang, Lei Xie, Yannan Wang, Shinji Watanabe, Zheng-Hua Tan, Hui Bu, Tao Yu, Shidong Shang

* 5 pages, submitted to INTERSPEECH 2021 


Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yoloxóchitl Mixtec


Feb 26, 2021
Jiatong Shi, Jonathan D. Amith, Rey Castillo García, Esteban Guadalupe Sierra, Kevin Duh, Shinji Watanabe

* Accepted by EACL2021 


Dual-Path Modeling for Long Recording Speech Separation in Meetings


Feb 23, 2021
Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian

* Accepted by ICASSP 2021 


End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend


Feb 23, 2021
Wangyou Zhang, Christoph Boeddeker, Shinji Watanabe, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Naoyuki Kamo, Reinhold Haeb-Umbach, Yanmin Qian

* 5 pages, 1 figure, accepted by ICASSP 2021 


Gaussian Kernelized Self-Attention for Long Sequence Data and Its Application to CTC-based Speech Recognition


Feb 18, 2021
Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe

* Accepted to ICASSP2021 


Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition


Feb 16, 2021
Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu

* Submitted to Computer Speech & Language 


Intermediate Loss Regularization for CTC-based Speech Recognition


Feb 05, 2021
Jaesong Lee, Shinji Watanabe

* Accepted at ICASSP 2021 


The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap


Feb 02, 2021
Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur



A Review of Speaker Diarization: Recent Advances with Deep Learning


Jan 24, 2021
Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J. Han, Shinji Watanabe, Shrikanth Narayanan



Online End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers


Jan 21, 2021
Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Paola Garcia, Kenji Nagamatsu



Arabic Speech Recognition by End-to-End, Modular Systems and Human


Jan 21, 2021
Amir Hussein, Shinji Watanabe, Ahmed Ali



The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans


Dec 23, 2020
Shinji Watanabe, Florian Boyer, Xuankai Chang, Pengcheng Guo, Tomoki Hayashi, Yosuke Higuchi, Takaaki Hori, Wen-Chin Huang, Hirofumi Inaguma, Naoyuki Kamo, Shigeki Karita, Chenda Li, Jing Shi, Aswin Shanmugam Subramanian, Wangyou Zhang



End-to-End Speaker Diarization as Post-Processing


Dec 23, 2020
Shota Horiguchi, Paola Garcia, Yusuke Fujita, Shinji Watanabe, Kenji Nagamatsu



Orthros: Non-autoregressive End-to-end Speech Translation with Dual-decoder


Nov 06, 2020
Hirofumi Inaguma, Yosuke Higuchi, Kevin Duh, Tatsuya Kawahara, Shinji Watanabe



Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization


Oct 30, 2020
Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu

* submitted to ICASSP 2021 


Improved Mask-CTC for Non-Autoregressive End-to-End ASR


Oct 26, 2020
Yosuke Higuchi, Hirofumi Inaguma, Shinji Watanabe, Tetsuji Ogawa, Tetsunori Kobayashi

* Submitted to ICASSP2021 


Learning Speaker Embedding from Text-to-Speech


Oct 21, 2020
Jaejin Cho, Piotr Zelasko, Jesus Villalba, Shinji Watanabe, Najim Dehak



The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS


Oct 06, 2020
Wen-Chin Huang, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda

* Accepted to the ISCA Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 


Augmentation adversarial training for unsupervised speaker recognition


Aug 09, 2020
Jaesung Huh, Hee Soo Heo, Jingu Kang, Shinji Watanabe, Joon Son Chung



Neural Speaker Diarization with Speaker-Wise Chain Rule


Jun 02, 2020
Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Jing Shi, Kenji Nagamatsu

* Submitted to Interspeech 2020 


End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors


May 20, 2020
Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Kenji Nagamatsu

