Dual-Path Modeling for Long Recording Speech Separation in Meetings

Feb 23, 2021
Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian

* Accepted by ICASSP 2021 

End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend

Feb 23, 2021
Wangyou Zhang, Christoph Boeddeker, Shinji Watanabe, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Naoyuki Kamo, Reinhold Haeb-Umbach, Yanmin Qian

* 5 pages, 1 figure, accepted by ICASSP 2021 

Gaussian Kernelized Self-Attention for Long Sequence Data and Its Application to CTC-based Speech Recognition

Feb 18, 2021
Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe

* Accepted to ICASSP2021 

Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition

Feb 16, 2021
Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu

* Submitted to Computer Speech & Language 

Intermediate Loss Regularization for CTC-based Speech Recognition

Feb 05, 2021
Jaesong Lee, Shinji Watanabe

* Accepted at ICASSP 2021 

The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap

Feb 02, 2021
Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur


Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yoloxóchitl Mixtec

Jan 26, 2021
Jiatong Shi, Jonathan D. Amith, Rey Castillo García, Esteban Guadalupe Sierra, Kevin Duh, Shinji Watanabe

* Accepted by EACL2021 

A Review of Speaker Diarization: Recent Advances with Deep Learning

Jan 24, 2021
Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J. Han, Shinji Watanabe, Shrikanth Narayanan


Online End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers

Jan 21, 2021
Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Paola Garcia, Kenji Nagamatsu


Arabic Speech Recognition by End-to-End, Modular Systems and Human

Jan 21, 2021
Amir Hussein, Shinji Watanabe, Ahmed Ali


The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans

Dec 23, 2020
Shinji Watanabe, Florian Boyer, Xuankai Chang, Pengcheng Guo, Tomoki Hayashi, Yosuke Higuchi, Takaaki Hori, Wen-Chin Huang, Hirofumi Inaguma, Naoyuki Kamo, Shigeki Karita, Chenda Li, Jing Shi, Aswin Shanmugam Subramanian, Wangyou Zhang


End-to-End Speaker Diarization as Post-Processing

Dec 23, 2020
Shota Horiguchi, Paola Garcia, Yusuke Fujita, Shinji Watanabe, Kenji Nagamatsu


Orthros: Non-autoregressive End-to-end Speech Translation with Dual-decoder

Nov 06, 2020
Hirofumi Inaguma, Yosuke Higuchi, Kevin Duh, Tatsuya Kawahara, Shinji Watanabe


Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization

Oct 30, 2020
Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu

* submitted to ICASSP 2021 

Improved Mask-CTC for Non-Autoregressive End-to-End ASR

Oct 26, 2020
Yosuke Higuchi, Hirofumi Inaguma, Shinji Watanabe, Tetsuji Ogawa, Tetsunori Kobayashi

* Submitted to ICASSP2021 

Learning Speaker Embedding from Text-to-Speech

Oct 21, 2020
Jaejin Cho, Piotr Zelasko, Jesus Villalba, Shinji Watanabe, Najim Dehak


The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS

Oct 06, 2020
Wen-Chin Huang, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda

* Accepted to the ISCA Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 

Augmentation adversarial training for unsupervised speaker recognition

Aug 09, 2020
Jaesung Huh, Hee Soo Heo, Jingu Kang, Shinji Watanabe, Joon Son Chung


Neural Speaker Diarization with Speaker-Wise Chain Rule

Jun 02, 2020
Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Jing Shi, Kenji Nagamatsu

* Submitted to Interspeech 2020 

End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors

May 20, 2020
Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Kenji Nagamatsu


DiscreTalk: Text-to-Speech as a Machine Translation Problem

May 12, 2020
Tomoki Hayashi, Shinji Watanabe

* Submitted to INTERSPEECH 2020. The demo is available on https://kan-bayashi.github.io/DiscreTalk/ 

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

May 02, 2020
Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant


ESPnet-ST: All-in-One Speech Translation Toolkit

Apr 21, 2020
Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Enrique Yalta Soplin, Tomoki Hayashi, Shinji Watanabe

* Accepted at ACL 2020 System Demonstration 

End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification

Feb 24, 2020
Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu

* Submission to IEEE TASLP. This article draws from our previous conference papers: arxiv:1909.06247 and arxiv:1909.05952 

End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection

Feb 14, 2020
Takenori Yoshimura, Tomoki Hayashi, Kazuya Takeda, Shinji Watanabe

* Submitted to ICASSP 2020 

End-to-End Multi-speaker Speech Recognition with Transformer

Feb 13, 2020
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe

* To appear in ICASSP 2020 
