"speech": models, code, and papers

Creating Personalized Synthetic Voices from Post-Glossectomy Speech with Guided Diffusion Models

May 27, 2023
Yusheng Tian, Guangyan Zhang, Tan Lee

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition

Jun 18, 2023
Yuchen Hu, Ruizhe Li, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng

Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model

Apr 24, 2023
Kenichi Fujita, Takanori Ashihara, Hiroki Kanagawa, Takafumi Moriya, Yusuke Ijima

N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition

Jun 05, 2023
Bashar Talafha, Abdul Waheed, Muhammad Abdul-Mageed

DUB: Discrete Unit Back-translation for Speech Translation

May 19, 2023
Dong Zhang, Rong Ye, Tom Ko, Mingxuan Wang, Yaqian Zhou

Beyond Fairness: Age-Harmless Parkinson's Detection via Voice

Sep 23, 2023
Yicheng Wang, Xiaotian Han, Leisheng Yu, Na Zou

Towards generalisable and calibrated synthetic speech detection with self-supervised representations

Sep 11, 2023
Dan Oneata, Adriana Stan, Octavian Pascu, Elisabeta Oneata, Horia Cucu

An Information-Theoretic Analysis of Self-supervised Discrete Representations of Speech

Jun 04, 2023
Badr M. Abdullah, Mohammed Maqsood Shaik, Bernd Möbius, Dietrich Klakow

Open-vocabulary Keyword-spotting with Adaptive Instance Normalization

Sep 13, 2023
Aviv Navon, Aviv Shamsian, Neta Glazer, Gill Hetz, Joseph Keshet

VoicePAT: An Efficient Open-source Evaluation Toolkit for Voice Privacy Research

Sep 14, 2023
Sarina Meyer, Xiaoxiao Miao, Ngoc Thang Vu
