"speech recognition": models, code, and papers

Adaptive Feature Fusion: Enhancing Generalization in Deep Learning Models

Apr 04, 2023
Neelesh Mungoli

Explaining the Attention Mechanism of End-to-End Speech Recognition Using Decision Trees

Oct 08, 2021
Yuanchao Wang, Wenji Du, Chenghao Cai, Yanyan Xu

DSARSR: Deep Stacked Auto-encoders Enhanced Robust Speaker Recognition

Jul 06, 2023
Zhifeng Wang, Chunyan Zeng, Surong Duan, Hongjie Ouyang, Hongmin Xu

Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study

Mar 12, 2023
Salah Zaiem, Robin Algayres, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Improving the Intent Classification accuracy in Noisy Environment

Mar 12, 2023
Mohamed Nabih Ali, Alessio Brutti, Daniele Falavigna

A Conformer Based Acoustic Model for Robust Automatic Speech Recognition

Mar 20, 2022
Yufeng Yang, Peidong Wang, DeLiang Wang

Modulation spectral features for speech emotion recognition using deep neural networks

Jan 14, 2023
Premjeet Singh, Md Sahidullah, Goutam Saha

Pushing the Limits of Non-Autoregressive Speech Recognition

Apr 12, 2021
Edwin G. Ng, Chung-Cheng Chiu, Yu Zhang, William Chan

Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition

Oct 11, 2021
Yuchen Hu, Nana Hou, Chen Chen, Eng Siong Chng

QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion

Feb 23, 2023
Houjian Guo, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro
