Augmentation adversarial training for unsupervised speaker recognition

Aug 09, 2020
Jaesung Huh, Hee Soo Heo, Jingu Kang, Shinji Watanabe, Joon Son Chung


Neural Speaker Diarization with Speaker-Wise Chain Rule

Jun 02, 2020
Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Jing Shi, Kenji Nagamatsu

* Submitted to Interspeech 2020 

End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors

May 20, 2020
Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Kenji Nagamatsu


DiscreTalk: Text-to-Speech as a Machine Translation Problem

May 12, 2020
Tomoki Hayashi, Shinji Watanabe

* Submitted to INTERSPEECH 2020. A demo is available at https://kan-bayashi.github.io/DiscreTalk/

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

May 02, 2020
Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant


ESPnet-ST: All-in-One Speech Translation Toolkit

Apr 21, 2020
Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Enrique Yalta Soplin, Tomoki Hayashi, Shinji Watanabe

* Accepted at ACL 2020 System Demonstration 

End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification

Feb 24, 2020
Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu

* Submitted to IEEE TASLP. This article draws on our previous conference papers: arXiv:1909.06247 and arXiv:1909.05952

End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection

Feb 14, 2020
Takenori Yoshimura, Tomoki Hayashi, Kazuya Takeda, Shinji Watanabe

* Submitted to ICASSP 2020 

End-to-End Multi-speaker Speech Recognition with Transformer

Feb 13, 2020
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe

* To appear in ICASSP 2020 

Non-Autoregressive Transformer Automatic Speech Recognition

Nov 10, 2019
Nanxin Chen, Shinji Watanabe, Jesús Villalba, Najim Dehak


Multilingual End-to-End Speech Translation

Oct 31, 2019
Hirofumi Inaguma, Kevin Duh, Tatsuya Kawahara, Shinji Watanabe

* Accepted to ASRU 2019 

Towards Online End-to-end Transformer Automatic Speech Recognition

Oct 25, 2019
Emiru Tsunoo, Yosuke Kashiwagi, Toshiyuki Kumakura, Shinji Watanabe

* arXiv admin note: text overlap with arXiv:1910.07204 

ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit

Oct 24, 2019
Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan

* Submitted to ICASSP 2020. Demo page: https://espnet.github.io/icassp2020-tts/

A practical two-stage training strategy for multi-stream end-to-end speech recognition

Oct 23, 2019
Ruizhi Li, Gregory Sell, Xiaofei Wang, Shinji Watanabe, Hynek Hermansky

* Submitted to ICASSP 2019

Transformer ASR with Contextual Block Processing

Oct 16, 2019
Emiru Tsunoo, Yosuke Kashiwagi, Toshiyuki Kumakura, Shinji Watanabe

* Accepted for ASRU 2019 

MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition

Oct 16, 2019
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe

* Accepted at ASRU 2019 

Espresso: A Fast End-to-end Neural Speech Recognition Toolkit

Oct 15, 2019
Yiming Wang, Tongfei Chen, Hainan Xu, Shuoyang Ding, Hang Lv, Yiwen Shao, Nanyun Peng, Lei Xie, Shinji Watanabe, Sanjeev Khudanpur

* Accepted to ASRU 2019 

A Comparative Study on Transformer vs RNN in Speech Applications

Sep 28, 2019
Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang

* Accepted at ASRU 2019 (IEEE Automatic Speech Recognition and Understanding Workshop 2019)

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models

Sep 17, 2019
Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe

* Accepted to ASRU 2019 

End-to-End Neural Speaker Diarization with Self-attention

Sep 13, 2019
Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe

* Accepted for ASRU 2019 

End-to-End Neural Speaker Diarization with Permutation-Free Objectives

Sep 12, 2019
Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, Shinji Watanabe

* Accepted to INTERSPEECH 2019 

Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition

Jun 26, 2019
Naoyuki Kanda, Shota Horiguchi, Ryoichi Takashima, Yusuke Fujita, Kenji Nagamatsu, Shinji Watanabe

* Accepted to INTERSPEECH 2019 

Multi-Stream End-to-End Speech Recognition

Jun 17, 2019
Ruizhi Li, Xiaofei Wang, Sri Harish Mallidi, Shinji Watanabe, Takaaki Hori, Hynek Hermansky

* Submitted to IEEE TASLP. arXiv admin note: substantial text overlap with arXiv:1811.04897, arXiv:1811.04903
