"speech": models, code, and papers

Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition

Jun 30, 2022
Kai Zhen, Hieu Duy Nguyen, Raviteja Chinta, Nathan Susanj, Athanasios Mouchtaris, Tariq Afzal, Ariya Rastrow

Bangla hate speech detection on social media using attention-based recurrent neural network

Mar 31, 2022
Amit Kumar Das, Abdullah Al Asif, Anik Paul, Md. Nur Hossain

FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition

Jun 30, 2022
Szu-Jui Chen, Jiamin Xie, John H. L. Hansen

Lisan: Yemeni, Iraqi, Libyan, and Sudanese Arabic Dialect Corpora with Morphological Annotations

Dec 13, 2022
Mustafa Jarrar, Fadi A Zaraket, Tymaa Hammouda, Daanish Masood Alavi, Martin Waahlisch

How Much Does Prosody Help Turn-taking? Investigations using Voice Activity Projection Models

Sep 12, 2022
Erik Ekstedt, Gabriel Skantze

Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition

Apr 08, 2022
Qianying Liu, Yuhang Yang, Zhuo Gong, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Sadao Kurohashi

Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening

Mar 31, 2022
Ayako Yamamoto, Toshio Irino, Shoko Araki, Kenichi Arai, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani

Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning

Oct 18, 2021
Yi-Chen Chen, Shu-wen Yang, Cheng-Kuang Lee, Simon See, Hung-yi Lee

Annealing Double-Head: An Architecture for Online Calibration of Deep Neural Networks

Dec 27, 2022
Erdong Guo, David Draper, Maria De Iorio

Supervised Attention in Sequence-to-Sequence Models for Speech Recognition

Apr 25, 2022
Gene-Ping Yang, Hao Tang
