Tetsunori Kobayashi

A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction

Oct 12, 2023
Kohei Saijo, Wangyou Zhang, Zhong-Qiu Wang, Shinji Watanabe, Tetsunori Kobayashi, Tetsuji Ogawa

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition

Sep 19, 2023
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi

Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition

Sep 09, 2023
Huaibo Zhao, Yosuke Higuchi, Yusuke Kida, Tetsuji Ogawa, Tetsunori Kobayashi

Conversation-oriented ASR with multi-look-ahead CBS architecture

Nov 02, 2022
Huaibo Zhao, Shinya Fujie, Tetsuji Ogawa, Jin Sakuma, Yusuke Kida, Tetsunori Kobayashi

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model

Oct 29, 2022
Yosuke Higuchi, Brian Yan, Siddhant Arora, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR

Oct 20, 2021
Huaibo Zhao, Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi

Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units

Oct 08, 2021
Yosuke Higuchi, Keita Karube, Tetsuji Ogawa, Tetsunori Kobayashi

Improved Mask-CTC for Non-Autoregressive End-to-End ASR

Oct 26, 2020
Yosuke Higuchi, Hirofumi Inaguma, Shinji Watanabe, Tetsuji Ogawa, Tetsunori Kobayashi
