"speech": models, code, and papers
VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System

Oct 25, 2023
Abdul Waheed, Bashar Talafha, Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed

Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning

Sep 29, 2023
Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen

RTFS-Net: Recurrent time-frequency modelling for efficient audio-visual speech separation

Sep 29, 2023
Samuel Pegg, Kai Li, Xiaolin Hu

One-Class Knowledge Distillation for Spoofing Speech Detection

Sep 15, 2023
Jingze Lu, Yuxiang Zhang, Wenchao Wang, Zengqiang Shang, Pengyuan Zhang

High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models

Sep 27, 2023
Chunyu Qiang, Hao Li, Yixin Tian, Yi Zhao, Ying Zhang, Longbiao Wang, Jianwu Dang

Active Learning Based Fine-Tuning Framework for Speech Emotion Recognition

Sep 30, 2023
Dongyuan Li, Yusong Wang, Kotaro Funakoshi, Manabu Okumura

ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech

Sep 29, 2023
Wenhao Guan, Qi Su, Haodong Zhou, Shiyu Miao, Xingjia Xie, Lin Li, Qingyang Hong

Speech collage: code-switched audio generation by collaging monolingual corpora

Sep 27, 2023
Amir Hussein, Dorsa Zeinali, Ondřej Klejch, Matthew Wiesner, Brian Yan, Shammur Chowdhury, Ahmed Ali, Shinji Watanabe, Sanjeev Khudanpur

Electrolaryngeal Speech Intelligibility Enhancement Through Robust Linguistic Encoders

Sep 18, 2023
Lester Phillip Violeta, Wen-Chin Huang, Ding Ma, Ryuichi Yamamoto, Kazuhiro Kobayashi, Tomoki Toda

Cross-Utterance Conditioned VAE for Speech Generation

Sep 08, 2023
Yang Li, Cheng Yu, Guangzhi Sun, Weiqin Zu, Zheng Tian, Ying Wen, Wei Pan, Chao Zhang, Jun Wang, Yang Yang, Fanglei Sun
