"speech recognition": models, code, and papers

ZAEBUC-Spoken: A Multilingual Multidialectal Arabic-English Speech Corpus

Mar 27, 2024
Injy Hamed, Fadhl Eryani, David Palfreyman, Nizar Habash

Via arXiv

emoDARTS: Joint Optimisation of CNN & Sequential Neural Network Architectures for Superior Speech Emotion Recognition

Mar 21, 2024
Thejan Rajapakshe, Rajib Rana, Sara Khalifa, Berrak Sisman, Bjorn W. Schuller, Carlos Busso

Via arXiv

A Multimodal Approach to Device-Directed Speech Detection with Large Language Models

Mar 26, 2024
Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

Via arXiv

Hierarchical Recurrent Adapters for Efficient Multi-Task Adaptation of Large Speech Models

Mar 25, 2024
Tsendsuren Munkhdalai, Youzheng Chen, Khe Chai Sim, Fadi Biadsy, Tara Sainath, Pedro Moreno Mengibar

Via arXiv

Grammatical vs Spelling Error Correction: An Investigation into the Responsiveness of Transformer-based Language Models using BART and MarianMT

Mar 25, 2024
Rohit Raju, Peeta Basa Pati, SA Gandheesh, Gayatri Sanjana Sannala, Suriya KS

Via arXiv

Accuracy enhancement method for speech emotion recognition from spectrogram using temporal frequency correlation and positional information learning through knowledge transfer

Mar 26, 2024
Jeong-Yoon Kim, Seung-Ho Lee

Via arXiv

An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement

Feb 27, 2024
Tzu-Ting Yang, Hsin-Wei Wang, Yi-Cheng Wang, Chi-Han Lin, Berlin Chen

Via arXiv

Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems

Feb 29, 2024
Quentin Raymondaud, Mickael Rouvier, Richard Dufour

Via arXiv

A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition

Mar 02, 2024
Tyler Benster, Guy Wilson, Reshef Elisha, Francis R Willett, Shaul Druckmann

Via arXiv

Speech Emotion Recognition Via CNN-Transforemr and Multidimensional Attention Mechanism

Mar 07, 2024
Xiaoyu Tang, Yixin Lin, Ting Dang, Yuanfang Zhang, Jintao Cheng

Via arXiv