"speech recognition": models, code, and papers

Mi-Go: Test Framework which uses YouTube as Data Source for Evaluating Speech Recognition Models like OpenAI's Whisper

Sep 01, 2023
Tomasz Wojnar, Jaroslaw Hryszko, Adam Roman

FlowMur: A Stealthy and Practical Audio Backdoor Attack with Limited Knowledge

Dec 15, 2023
Jiahe Lan, Jie Wang, Baochen Yan, Zheng Yan, Elisa Bertino

Bigger is not Always Better: The Effect of Context Size on Speech Pre-Training

Dec 03, 2023
Sean Robertson, Ewan Dunbar

Leveraging Data Collection and Unsupervised Learning for Code-switched Tunisian Arabic Automatic Speech Recognition

Sep 20, 2023
Ahmed Amine Ben Abdallah, Ata Kabboudi, Amir Kanoun, Salah Zaiem

End-to-end Transfer Learning for Speaker-independent Cross-language Speech Emotion Recognition

Nov 22, 2023
Duowei Tang, Peter Kuppens, Luc Geurts, Toon van Waterschoot

SER_AMPEL: A multi-source dataset for SER of Italian older adults

Nov 24, 2023
Alessandra Grossi, Francesca Gasparini

Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project

Sep 26, 2023
Khai Le-Duc

Federated Learning with Differential Privacy for End-to-End Speech Recognition

Sep 29, 2023
Martin Pelikan, Sheikh Shams Azam, Vitaly Feldman, Jan "Honza" Silovsky, Kunal Talwar, Tatiana Likhomanenko

Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models

Dec 06, 2023
Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

CDSD: Chinese Dysarthria Speech Database

Oct 24, 2023
Mengyi Sun, Ming Gao, Xinchen Kang, Shiru Wang, Jun Du, Dengfeng Yao, Su-Jing Wang
