"speech recognition": models, code, and papers

Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN

Jul 24, 2023
Muhammad Danyal Khan, Raheem Ali, Arshad Aziz

Mixture Encoder for Joint Speech Separation and Recognition

Jun 21, 2023
Simon Berger, Peter Vieting, Christoph Boeddeker, Ralf Schlüter, Reinhold Haeb-Umbach

Leveraging Visemes for Better Visual Speech Representation and Lip Reading

Jul 19, 2023
Javad Peymanfard, Vahid Saeedi, Mohammad Reza Mohammadi, Hossein Zeinali, Nasser Mozayani

Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition

Apr 05, 2023
Saumya Y. Sahai, Jing Liu, Thejaswi Muniyappa, Kanthashree M. Sathyendra, Anastasios Alexandridis, Grant P. Strimel, Ross McGowan, Ariya Rastrow, Feng-Ju Chang, Athanasios Mouchtaris, Siegfried Kunzmann

Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models

Mar 15, 2023
Steven M. Hernandez, Ding Zhao, Shaojin Ding, Antoine Bruguier, Rohit Prabhavalkar, Tara N. Sainath, Yanzhang He, Ian McGraw

Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition

Feb 02, 2023
Minglun Han, Qingyu Wang, Tielin Zhang, Yi Wang, Duzhen Zhang, Bo Xu

Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional Context for Continuous Speech Recognition

Jan 10, 2023
Piyush Behre, Sharman Tan, Padma Varadharajan, Shuangyu Chang

Capturing Spectral and Long-term Contextual Information for Speech Emotion Recognition Using Deep Learning Techniques

Aug 04, 2023
Samiul Islam, Md. Maksudul Haque, Abu Jobayer Md. Sadat

Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition

Apr 24, 2023
Mohan Li, Rama Doddipatla, Catalin Zorila
