"speech recognition": models, code, and papers

Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition

Feb 05, 2014
Haşim Sak, Andrew Senior, Françoise Beaufays

CMGAN: Conformer-based Metric GAN for Speech Enhancement

Mar 28, 2022
Ruizhe Cao, Sherif Abdulatif, Bin Yang

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

Apr 20, 2022
Shaojin Ding, Weiran Wang, Ding Zhao, Tara N. Sainath, Yanzhang He, Robert David, Rami Botros, Xin Wang, Rina Panigrahy, Qiao Liang, Dongseong Hwang, Ian McGraw, Rohit Prabhavalkar, Trevor Strohman

Deep scattering network for speech emotion recognition

May 11, 2021
Premjeet Singh, Goutam Saha, Md Sahidullah

Multistage linguistic conditioning of convolutional layers for speech emotion recognition

Oct 13, 2021
Andreas Triantafyllopoulos, Uwe Reichel, Shuo Liu, Stephan Huber, Florian Eyben, Björn W. Schuller

CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR

Mar 31, 2022
Keyu An, Huahuan Zheng, Zhijian Ou, Hongyu Xiang, Ke Ding, Guanglu Wan

Assessing the Tolerance of Neural Machine Translation Systems Against Speech Recognition Errors

Apr 24, 2019
Nicholas Ruiz, Mattia Antonino Di Gangi, Nicola Bertoldi, Marcello Federico

Stream attention-based multi-array end-to-end speech recognition

Nov 12, 2018
Xiaofei Wang, Ruizhi Li, Sri Harish Mallidi, Takaaki Hori, Shinji Watanabe, Hynek Hermansky

Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation

Apr 16, 2019
Chia-Hsuan Lee, Yun-Nung Chen, Hung-Yi Lee

Scaling ASR Improves Zero and Few Shot Learning

Nov 13, 2021
Alex Xiao, Weiyi Zheng, Gil Keren, Duc Le, Frank Zhang, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Abdelrahman Mohamed
