"speech recognition": models, code, and papers

Improved training for online end-to-end speech recognition systems

Aug 30, 2018
Suyoun Kim, Michael L. Seltzer, Jinyu Li, Rui Zhao

Via arXiv

Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation

Aug 05, 2021
Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram

Via arXiv

Robustness Testing of Data and Knowledge Driven Anomaly Detection in Cyber-Physical Systems

Apr 20, 2022
Xugui Zhou, Maxfield Kouzel, Homa Alemzadeh

Via arXiv

Learning-Based Data Storage [Vision] (Technical Report)

Jun 12, 2022
Xiang Lian, Xiaofei Zhang

Via arXiv

Disentangled Speech Representation Learning Based on Factorized Hierarchical Variational Autoencoder with Self-Supervised Objective

Apr 05, 2022
Yuying Xie, Thomas Arildsen, Zheng-Hua Tan

Via arXiv

Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation

Apr 19, 2022
Keqi Deng, Shinji Watanabe, Jiatong Shi, Siddhant Arora

Via arXiv

Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning

Jan 31, 2017
Suyoun Kim, Takaaki Hori, Shinji Watanabe

Via arXiv

Cascaded CNN-resBiLSTM-CTC: An End-to-End Acoustic Model For Speech Recognition

Oct 30, 2018
Xinpei Zhou, Jiwei Li, Xi Zhou

Via arXiv

On The Compensation Between Magnitude and Phase in Speech Separation

Aug 11, 2021
Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux

Via arXiv

Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition

Oct 25, 2018
Ke Wang, Junbo Zhang, Sining Sun, Yujun Wang, Fei Xiang, Lei Xie

Via arXiv