"speech recognition": models, code, and papers

An Adaptive Psychoacoustic Model for Automatic Speech Recognition

Sep 14, 2016
Peng Dai, Xue Teng, Frank Rudzicz, Ing Yann Soon

Via arXiv

A practical two-stage training strategy for multi-stream end-to-end speech recognition

Oct 23, 2019
Ruizhi Li, Gregory Sell, Xiaofei Wang, Shinji Watanabe, Hynek Hermansky

Via arXiv

Star Temporal Classification: Sequence Classification with Partially Labeled Data

Jan 28, 2022
Vineel Pratap, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

Via arXiv

Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data

Jul 15, 2019
Kunal Dhawan, Ganji Sreeram, Kumar Priyadarshi, Rohit Sinha

Via arXiv

Graph based manifold regularized deep neural networks for automatic speech recognition

Jun 19, 2016
Vikrant Singh Tomar, Richard C. Rose

Via arXiv

A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition

Sep 22, 2014
Roland Maas, Christian Huemmer, Armin Sehr, Walter Kellermann

Via arXiv

Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners

May 29, 2022
Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji

Via arXiv

LegoNN: Building Modular Encoder-Decoder Models

Jun 07, 2022
Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed

Via arXiv

Calibration of Phone Likelihoods in Automatic Speech Recognition

Jun 14, 2016
David A. van Leeuwen, Joost van Doremalen

Via arXiv

Hard Sample Mining for the Improved Retraining of Automatic Speech Recognition

Apr 17, 2019
Jiabin Xue, Jiqing Han, Tieran Zheng, Jiaxing Guo, Boyong Wu

Via arXiv