"speech": models, code, and papers

Non-Standard Vietnamese Word Detection and Normalization for Text-to-Speech

Sep 07, 2022
Huu-Tien Dang, Thi-Hai-Yen Vuong, Xuan-Hieu Phan

Wav2vec-Switch: Contrastive Learning from Original-noisy Speech Pairs for Robust Speech Recognition

Oct 11, 2021
Yiming Wang, Jinyu Li, Heming Wang, Yao Qian, Chengyi Wang, Yu Wu

Towards Error-Resilient Neural Speech Coding

Jul 03, 2022
Huaying Xue, Xiulian Peng, Xue Jiang, Yan Lu

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Apr 11, 2022
Jian Xue, Peidong Wang, Jinyu Li, Matt Post, Yashesh Gaur

SpeechSplit 2.0: Unsupervised speech disentanglement for voice conversion Without tuning autoencoder Bottlenecks

Mar 26, 2022
Chak Ho Chan, Kaizhi Qian, Yang Zhang, Mark Hasegawa-Johnson

An Investigation of Indian Native Language Phonemic Influences on L2 English Pronunciations

Dec 19, 2022
Shelly Jain, Priyanshi Pal, Anil Vuppala, Prasanta Ghosh, Chiranjeevi Yarra

Dysfluencies Seldom Come Alone -- Detection as a Multi-Label Problem

Oct 28, 2022
Sebastian P. Bayerl, Dominik Wagner, Florian Hönig, Tobias Bocklet, Elmar Nöth, Korbinian Riedhammer

TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder

Jun 30, 2022
Eunwoo Song, Ryuichi Yamamoto, Ohsung Kwon, Chan-Ho Song, Min-Jae Hwang, Suhyeon Oh, Hyun-Wook Yoon, Jin-Seob Kim, Jae-Min Kim

Practical cognitive speech compression

Mar 08, 2022
Reza Lotfidereshgi, Philippe Gournay

GigaST: A 10,000-hour Pseudo Speech Translation Corpus

Apr 08, 2022
Rong Ye, Chengqi Zhao, Tom Ko, Chutong Meng, Tao Wang, Mingxuan Wang, Jun Cao
