Shinji Watanabe

A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding

Nov 10, 2022
Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, Karthik Ganesan, Siddharth Dalmia, Xuankai Chang, Shinji Watanabe

Towards Zero-Shot Code-Switched Speech Recognition

Nov 09, 2022
Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe

Bridging Speech and Textual Pre-trained Models with Unsupervised ASR

Nov 06, 2022
Jiatong Shi, Chan-Jan Hsu, Holam Chung, Dongji Gao, Paola Garcia, Shinji Watanabe, Ann Lee, Hung-yi Lee

Multi-blank Transducers for Speech Recognition

Nov 04, 2022
Hainan Xu, Fei Jia, Somshubra Majumdar, Shinji Watanabe, Boris Ginsburg

Minimum Latency Training of Sequence Transducers for Streaming End-to-End Speech Recognition

Nov 04, 2022
Yusuke Shinohara, Shinji Watanabe

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

Avoid Overthinking in Self-Supervised Models for Speech Recognition

Nov 01, 2022
Dan Berrebbi, Brian Yan, Shinji Watanabe

BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model

Oct 29, 2022
Yosuke Higuchi, Brian Yan, Siddhant Arora, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe
