
Shigeki Karita


Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency

Jun 07, 2023
Shigeki Karita, Richard Sproat, Haruko Ishikawa


LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus

May 30, 2023
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna


Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

Mar 03, 2023
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani


Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers

Feb 16, 2022
Yotaro Kubo, Shigeki Karita, Michiel Bacchiani


SNRi Target Training for Joint Speech Enhancement and Recognition

Nov 01, 2021
Yuma Koizumi, Shigeki Karita, Arun Narayanan, Sankaran Panchapagesan, Michiel Bacchiani


DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement

Jun 30, 2021
Yuma Koizumi, Shigeki Karita, Scott Wisdom, Hakan Erdogan, John R. Hershey, Llion Jones, Michiel Bacchiani


A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition

Jun 09, 2021
Shigeki Karita, Yotaro Kubo, Michiel Adriaan Unico Bacchiani, Llion Jones


The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans

Dec 23, 2020
Shinji Watanabe, Florian Boyer, Xuankai Chang, Pengcheng Guo, Tomoki Hayashi, Yosuke Higuchi, Takaaki Hori, Wen-Chin Huang, Hirofumi Inaguma, Naoyuki Kamo, Shigeki Karita, Chenda Li, Jing Shi, Aswin Shanmugam Subramanian, Wangyou Zhang


Unsupervised Learning of Disentangled Speech Content and Style Representation

Oct 24, 2020
Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita


ESPnet-ST: All-in-One Speech Translation Toolkit

Apr 21, 2020
Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Enrique Yalta Soplin, Tomoki Hayashi, Shinji Watanabe
