Shinji Watanabe

Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models

Jan 26, 2022
Keqi Deng, Zehui Yang, Shinji Watanabe, Yosuke Higuchi, Gaofeng Cheng, Pengyuan Zhang

Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR

Jan 25, 2022
Emiru Tsunoo, Chaitanya Narisetty, Michael Hentschel, Yosuke Kashiwagi, Shinji Watanabe

A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies

Jan 14, 2022
Florian Boyer, Yusuke Shinohara, Takaaki Ishii, Hirofumi Inaguma, Shinji Watanabe

Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem

Jan 09, 2022
Jing Shi, Xuankai Chang, Tomoki Hayashi, Yen-Ju Lu, Shinji Watanabe, Bo Xu

JTubeSpeech: corpus of Japanese speech collected from YouTube for speech recognition and speaker verification

Dec 17, 2021
Shinnosuke Takamichi, Ludwig Kürzinger, Takaaki Saeki, Sayaka Shiota, Shinji Watanabe

Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization

Nov 29, 2021
Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu

ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

Nov 29, 2021
Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay Kumar, Karthik Ganesan, Brian Yan, Ngoc Thang Vu, Alan W Black, Shinji Watanabe

Attention-based Multi-hypothesis Fusion for Speech Summarization

Nov 16, 2021
Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe
