
ESPnet2-TTS: Extending the Edge of TTS Research


Oct 15, 2021
Tomoki Hayashi, Ryuichi Yamamoto, Takenori Yoshimura, Peter Wu, Jiatong Shi, Takaaki Saeki, Yooncheol Ju, Yusuke Yasuda, Shinnosuke Takamichi, Shinji Watanabe

* Submitted to ICASSP2022. Demo HP: https://espnet.github.io/icassp2022-tts/ 


S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations


Oct 12, 2021
Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-Yi Lee, Shinji Watanabe, Tomoki Toda

* Submitted to ICASSP 2022. Code available at: https://github.com/s3prl/s3prl/tree/master/s3prl/downstream/a2o-vc-vcc2020 


SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition


Oct 11, 2021
Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Han, Shinji Watanabe



A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation


Oct 11, 2021
Yosuke Higuchi, Nanxin Chen, Yuya Fujita, Hirofumi Inaguma, Tatsuya Komatsu, Jaesong Lee, Jumon Nozaki, Tianzi Wang, Shinji Watanabe

* Accepted to ASRU2021 


Multi-Channel End-to-End Neural Diarization with Distributed Microphones


Oct 10, 2021
Shota Horiguchi, Yuki Takashima, Paola Garcia, Shinji Watanabe, Yohei Kawaguchi



An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition


Oct 09, 2021
Xuankai Chang, Takashi Maekaku, Pengcheng Guo, Jing Shi, Yen-Ju Lu, Aswin Shanmugam Subramanian, Tianzi Wang, Shu-wen Yang, Yu Tsao, Hung-yi Lee, Shinji Watanabe

* To appear in ASRU2021 


Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates


Sep 27, 2021
Hirofumi Inaguma, Siddharth Dalmia, Brian Yan, Shinji Watanabe

* Accepted at IEEE ASRU 2021 


Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring


Sep 09, 2021
Hirofumi Inaguma, Yosuke Higuchi, Kevin Duh, Tatsuya Kawahara, Shinji Watanabe



Target-speaker Voice Activity Detection with Improved I-Vector Estimation for Unknown Number of Speaker


Aug 07, 2021
Maokui He, Desh Raj, Zili Huang, Jun Du, Zhuo Chen, Shinji Watanabe



A Study on Speech Enhancement Based on Diffusion Probabilistic Model


Jul 25, 2021
Yen-Ju Lu, Yu Tsao, Shinji Watanabe

* Submitted to APSIPA 2021 


Differentiable Allophone Graphs for Language-Universal Speech Recognition


Jul 24, 2021
Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe

* INTERSPEECH 2021. Contains additional studies on phone recognition for unseen languages 


On Prosody Modeling for ASR+TTS based Voice Conversion


Jul 20, 2021
Wen-Chin Huang, Tomoki Hayashi, Xinjian Li, Shinji Watanabe, Tomoki Toda

* Submitted to ASRU2021. Under review 


Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models


Jul 20, 2021
Tianzi Wang, Yuya Fujita, Xuankai Chang, Shinji Watanabe

* 5 pages, 1 figure, Interspeech 2021 conference 


Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021


Jul 13, 2021
Takashi Maekaku, Xuankai Chang, Yuya Fujita, Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky



ESPnet-ST IWSLT 2021 Offline Speech Translation System


Jul 06, 2021
Hirofumi Inaguma, Brian Yan, Siddharth Dalmia, Pengcheng Guo, Jiatong Shi, Kevin Duh, Shinji Watanabe

* IWSLT 2021 


Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors


Jul 04, 2021
Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yawen Xue, Yuki Takashima, Yohei Kawaguchi



Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding


Jun 29, 2021
Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W Black

* INTERSPEECH 2021 


Encoder-Decoder Based Attractor Calculation for End-to-End Neural Diarization


Jun 20, 2021
Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Paola Garcia

* Submitted to IEEE TASLP. This article is based on our previous conference paper arxiv:2005.09921 


Multi-mode Transformer Transducer with Stochastic Future Context


Jun 17, 2021
Kwangyoun Kim, Felix Wu, Prashant Sridhar, Kyu J. Han, Shinji Watanabe

* Accepted to Interspeech 2021 


Layer Pruning on Demand with Intermediate CTC


Jun 17, 2021
Jaesong Lee, Jingu Kang, Shinji Watanabe

* Interspeech 2021 


Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain


Jun 16, 2021
Pengcheng Guo, Xuankai Chang, Shinji Watanabe, Lei Xie

* Accepted by Interspeech 2021 


GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio


Jun 13, 2021
Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan



Leveraging Pre-trained Language Model for Speech Sentiment Analysis


Jun 11, 2021
Suwon Shon, Pablo Brusco, Jing Pan, Kyu J. Han, Shinji Watanabe

* To appear in Interspeech 2021 


Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization


Jun 09, 2021
Yuki Takashima, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Paola García, Kenji Nagamatsu

* Accepted for Interspeech 2021 


End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection


Jun 08, 2021
Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola García, Kenji Nagamatsu

* IEEE Spoken Language Technology Workshop (SLT), 2021, pp. 849-856 
* Accepted for SLT 2021 


Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios


Jun 07, 2021
Emiru Tsunoo, Kentaro Shibata, Chaitanya Narisetty, Yosuke Kashiwagi, Shinji Watanabe

* Accepted for Interspeech 2021 


Self-Guided Curriculum Learning for Neural Machine Translation


May 15, 2021
Lei Zhou, Liang Ding, Kevin Duh, Shinji Watanabe, Ryohei Sasano, Koichi Takeda

* Work in progress 


End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings


May 05, 2021
Soumi Maiti, Hakan Erdogan, Kevin Wilson, Scott Wisdom, Shinji Watanabe, John R. Hershey

* ICASSP 2021, SPE-54.1 
* 5 pages, 2 figures, ICASSP 2021 


SUPERB: Speech processing Universal PERformance Benchmark


May 03, 2021
Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee

* Submitted to Interspeech 2021 


Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks


May 02, 2021
Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe

* NAACL 2021. All code and models are released as part of the ESPnet toolkit: https://github.com/espnet/espnet 
