Shigeki Karita

Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency

Jun 07, 2023
Via arXiv

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus

May 30, 2023
Via arXiv

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

Mar 03, 2023
Via arXiv

Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers

Feb 16, 2022
Via arXiv

SNRi Target Training for Joint Speech Enhancement and Recognition

Nov 01, 2021
Via arXiv

DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement

Jun 30, 2021
Via arXiv

A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition

Jun 09, 2021
Via arXiv

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans

Dec 23, 2020
Via arXiv

Unsupervised Learning of Disentangled Speech Content and Style Representation

Oct 24, 2020
Via arXiv

ESPnet-ST: All-in-One Speech Translation Toolkit

Apr 21, 2020
Via arXiv