"speech": models, code, and papers

TemporalAugmenter: An Ensemble Recurrent Based Deep Learning Approach for Signal Classification

Jan 13, 2024
Nelly Elsayed, Constantinos L. Zekios, Navid Asadizanjani, Zag ElSayed

XLS-R Deep Learning Model for Multilingual ASR on Low-Resource Languages: Indonesian, Javanese, and Sundanese

Jan 12, 2024
Panji Arisaputra, Alif Tri Handoyo, Amalia Zahra

FAT-HuBERT: Front-end Adaptive Training of Hidden-unit BERT for Distortion-Invariant Robust Speech Recognition

Nov 29, 2023
Dongning Yang, Wei Wang, Yanmin Qian

An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis

Dec 08, 2023
Via Nielson, Steven Hillis

Maximum-Entropy Adversarial Audio Augmentation for Keyword Spotting

Jan 12, 2024
Zuzhao Ye, Gregory Ciccarelli, Brian Kulis

Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling

Dec 19, 2023
Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li

Single-Microphone Speaker Separation and Voice Activity Detection in Noisy and Reverberant Environments

Jan 07, 2024
Renana Opochinsky, Mordehay Moradi, Sharon Gannot

Learning Disentangled Speech Representations

Nov 04, 2023
Yusuf Brima, Ulf Krumnack, Simone Pika, Gunther Heidemann

Automatic Restoration of Diacritics for Speech Data Sets

Nov 15, 2023
Sara Shatnawi, Sawsan Alqahtani, Hanan Aldarmaki

DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser

Nov 28, 2023
Peng Chen, Xiaobao Wei, Ming Lu, Yitong Zhu, Naiming Yao, Xingyu Xiao, Hui Chen
