"speech recognition": models, code, and papers

Efficient Speech Emotion Recognition Using Multi-Scale CNN and Attention

Jun 08, 2021
Zixuan Peng, Yu Lu, Shengfeng Pan, Yunfeng Liu

Via arXiv

Self-Supervised Speech Representations Preserve Speech Characteristics while Anonymizing Voices

Apr 04, 2022
Abner Hernandez, Paula Andrea Pérez-Toro, Juan Camilo Vásquez-Correa, Juan Rafael Orozco-Arroyave, Andreas Maier, Seung Hee Yang

Via arXiv

Automatic Spoken Language Identification using a Time-Delay Neural Network

May 19, 2022
Benjamin Kepecs, Homayoon Beigi

Via arXiv

Multi-layer Attention Mechanism for Speech Keyword Recognition

Jul 10, 2019
Ruisen Luo, Tianran Sun, Chen Wang, Miao Du, Zuodong Tang, Kai Zhou, Xiaofeng Gong, Xiaomei Yang

Via arXiv

Privacy against Real-Time Speech Emotion Detection via Acoustic Adversarial Evasion of Machine Learning

Nov 17, 2022
Brian Testa, Yi Xiao, Avery Gump, Asif Salekin

Via arXiv

Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition

Oct 31, 2016
Hagen Soltau, Hank Liao, Hasim Sak

Via arXiv

Improving RNN Transducer Modeling for End-to-End Speech Recognition

Sep 26, 2019
Jinyu Li, Rui Zhao, Hu Hu, Yifan Gong

Via arXiv

Speaker-adaptive Lip Reading with User-dependent Padding

Aug 09, 2022
Minsu Kim, Hyunjun Kim, Yong Man Ro

Via arXiv

Bimodal Speech Emotion Recognition Using Pre-Trained Language Models

Nov 29, 2019
Verena Heusser, Niklas Freymuth, Stefan Constantin, Alex Waibel

Via arXiv

Should we hard-code the recurrence concept or learn it instead ? Exploring the Transformer architecture for Audio-Visual Speech Recognition

May 19, 2020
George Sterpu, Christian Saam, Naomi Harte

Via arXiv