"speech": models, code, and papers

ICASSP 2023 Speech Signal Improvement Challenge

Apr 02, 2023
Ross Cutler, Ando Saabas, Babak Naderi, Nicolae-Cătălin Ristea, Sebastian Braun, Solomiya Branets

Interpretable Style Transfer for Text-to-Speech with ControlVAE and Diffusion Bridge

Jun 07, 2023
Wenhao Guan, Tao Li, Yishuang Li, Hukai Huang, Qingyang Hong, Lin Li

A Comparative Study of Pre-trained Speech and Audio Embeddings for Speech Emotion Recognition

Apr 22, 2023
Orchid Chetia Phukan, Arun Balaji Buduru, Rajesh Sharma

A Unified Front-End Framework for English Text-to-Speech Synthesis

May 18, 2023
Zelin Ying, Chen Li, Yu Dong, Qiuqiang Kong, YuanYuan Huo, Yuping Wang, Yuxuan Wang

AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining

Aug 10, 2023
Haohe Liu, Qiao Tian, Yi Yuan, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Yuping Wang, Wenwu Wang, Yuxuan Wang, Mark D. Plumbley

CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center

May 23, 2023
Yuki Saito, Eiji Iimori, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari

Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili

Jun 01, 2023
Christiaan Jacobs, Nathanaël Carraz Rakotonirina, Everlyn Asiko Chimoto, Bruce A. Bassett, Herman Kamper

RAMP: Retrieval-Augmented MOS Prediction via Confidence-based Dynamic Weighting

Aug 31, 2023
Hui Wang, Shiwan Zhao, Xiguang Zheng, Yong Qin

Fusion-S2iGan: An Efficient and Effective Single-Stage Framework for Speech-to-Image Generation

May 17, 2023
Zhenxing Zhang, Lambert Schomaker

Confidence-based Ensembles of End-to-End Speech Recognition Models

Jun 27, 2023
Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg
