Alert button
Picture for Yosuke Higuchi

Yosuke Higuchi

Alert button

Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference

Oct 01, 2023
Masao Someki, Nicholas Eng, Yosuke Higuchi, Shinji Watanabe

Viaarxiv icon

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition

Sep 19, 2023
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi

Figure 1 for Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition
Figure 2 for Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition
Figure 3 for Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition
Figure 4 for Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition
Viaarxiv icon

Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition

Sep 09, 2023
Huaibo Zhao, Yosuke Higuchi, Yusuke Kida, Tetsuji Ogawa, Tetsunori Kobayashi

Figure 1 for Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Figure 2 for Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Figure 3 for Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Figure 4 for Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Viaarxiv icon

A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding

Nov 10, 2022
Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, Karthik Ganesan, Siddharth Dalmia, Xuankai Chang, Shinji Watanabe

Figure 1 for A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Figure 2 for A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Figure 3 for A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Figure 4 for A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Viaarxiv icon

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

Figure 1 for InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss
Figure 2 for InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss
Figure 3 for InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss
Figure 4 for InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss
Viaarxiv icon

BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

Figure 1 for BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Figure 2 for BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Figure 3 for BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Figure 4 for BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Viaarxiv icon

BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model

Oct 29, 2022
Yosuke Higuchi, Brian Yan, Siddhant Arora, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

Figure 1 for BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Figure 2 for BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Figure 3 for BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Figure 4 for BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Viaarxiv icon

CTC Alignments Improve Autoregressive Translation

Oct 11, 2022
Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W Black, Shinji Watanabe

Figure 1 for CTC Alignments Improve Autoregressive Translation
Figure 2 for CTC Alignments Improve Autoregressive Translation
Figure 3 for CTC Alignments Improve Autoregressive Translation
Figure 4 for CTC Alignments Improve Autoregressive Translation
Viaarxiv icon

ESPnet-ONNX: Bridging a Gap Between Research and Production

Sep 20, 2022
Masao Someki, Yosuke Higuchi, Tomoki Hayashi, Shinji Watanabe

Figure 1 for ESPnet-ONNX: Bridging a Gap Between Research and Production
Figure 2 for ESPnet-ONNX: Bridging a Gap Between Research and Production
Figure 3 for ESPnet-ONNX: Bridging a Gap Between Research and Production
Figure 4 for ESPnet-ONNX: Bridging a Gap Between Research and Production
Viaarxiv icon

Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models

Jan 26, 2022
Keqi Deng, Zehui Yang, Shinji Watanabe, Yosuke Higuchi, Gaofeng Cheng, Pengyuan Zhang

Figure 1 for Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models
Figure 2 for Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models
Figure 3 for Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models
Figure 4 for Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models
Viaarxiv icon