"speech recognition": models, code, and papers

LanSER: Language-Model Supported Speech Emotion Recognition

Sep 07, 2023
Taesik Gong, Josh Belanich, Krishna Somandepalli, Arsha Nagrani, Brian Eoff, Brendan Jou

Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models

Dec 06, 2023
Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

1-step Speech Processing and Understanding Using CTC Loss

Nov 08, 2023
Karan Singla, Shahab Jalalvand, Yeon-Jun Kim, Antonio Moreno Daniel, Srinivas Bangalore, Andrej Ljolje, Ben Stern

PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features

Dec 05, 2023
Tianshun Han, Shengnan Gui, Yiqing Huang, Baihui Li, Lijian Liu, Benjia Zhou, Ning Jiang, Quan Lu, Ruicong Zhi, Yanyan Liang, Du Zhang, Jun Wan

1SPU: 1-step Speech Processing Unit

Nov 10, 2023
Karan Singla, Shahab Jalalvand, Yeon-Jun Kim, Antonio Moreno Daniel, Srinivas Bangalore, Andrej Ljolje, Ben Stern

Training dynamic models using early exits for automatic speech recognition on resource-constrained devices

Sep 18, 2023
George August Wright, Umberto Cappellazzo, Salah Zaiem, Desh Raj, Lucas Ondel Yang, Daniele Falavigna, Alessio Brutti

FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning

Dec 07, 2023
Hossein Fereidooni, Alessandro Pegoraro, Phillip Rieger, Alexandra Dmitrienko, Ahmad-Reza Sadeghi

Convoifilter: A case study of doing cocktail party speech recognition

Aug 22, 2023
Thai-Binh Nguyen, Alexander Waibel

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

Oct 18, 2023
Tae Jin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg

Indonesian Automatic Speech Recognition with XLSR-53

Aug 20, 2023
Panji Arisaputra, Amalia Zahra
