"speech recognition": models, code, and papers

Visualizing Deep Neural Networks for Speech Recognition with Learned Topographic Filter Maps

Dec 06, 2019
Andreas Krug, Sebastian Stober

End-to-end model for named entity recognition from speech without paired training data

Apr 02, 2022
Salima Mdhaffar, Jarod Duret, Titouan Parcollet, Yannick Estève

Memory Visualization for Gated Recurrent Neural Networks in Speech Recognition

Feb 27, 2017
Zhiyuan Tang, Ying Shi, Dong Wang, Yang Feng, Shiyue Zhang

Synthetic Dataset Generation for Privacy-Preserving Machine Learning

Oct 10, 2022
Efstathia Soufleri, Gobinda Saha, Kaushik Roy

An Investigation Into On-device Personalization of End-to-end Automatic Speech Recognition Models

Sep 14, 2019
Khe Chai Sim, Petr Zadrazil, Françoise Beaufays

Self-Supervised Attention Networks and Uncertainty Loss Weighting for Multi-Task Emotion Recognition on Vocal Bursts

Sep 27, 2022
Vincent Karas, Andreas Triantafyllopoulos, Meishu Song, Björn W. Schuller

Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition

Dec 10, 2020
Binbin Zhang, Di Wu, Zhuoyuan Yao, Xiong Wang, Fan Yu, Chao Yang, Liyong Guo, Yaguang Hu, Lei Xie, Xin Lei

Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition

Oct 29, 2020
Yangyang Shi, Yongqiang Wang, Chunyang Wu, Ching-Feng Yeh, Julian Chan, Frank Zhang, Duc Le, Mike Seltzer

Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition

Oct 23, 2020
Qiujia Li, David Qiu, Yu Zhang, Bo Li, Yanzhang He, Philip C. Woodland, Liangliang Cao, Trevor Strohman

Improving speech recognition by revising gated recurrent units

Sep 29, 2017
Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio
