"speech": models, code, and papers

Sub-band Knowledge Distillation Framework for Speech Enhancement

May 29, 2020
Xiang Hao, Shixue Wen, Xiangdong Su, Yun Liu, Guanglai Gao, Xiaofei Li

Deep-Learning-Based Audio-Visual Speech Enhancement in Presence of Lombard Effect

May 29, 2019
Daniel Michelsanti, Zheng-Hua Tan, Sigurdur Sigurdsson, Jesper Jensen

3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition

Apr 14, 2022
Zhao You, Shulin Feng, Dan Su, Dong Yu

WEMAC: Women and Emotion Multi-modal Affective Computing dataset

Mar 01, 2022
Jose A. Miranda, Esther Rituerto-González, Laura Gutiérrez-Martín, Clara Luis-Mingueza, Manuel F. Canabal, Alberto Ramírez Bárcenas, Jose M. Lanza-Gutiérrez, Carmen Peláez-Moreno, Celia López-Ongil

DNNAbacus: Toward Accurate Computational Cost Prediction for Deep Neural Networks

May 24, 2022
Lu Bai, Weixing Ji, Qinyuan Li, Xilai Yao, Wei Xin, Wanyi Zhu

Personalization Strategies for End-to-End Speech Recognition Systems

Feb 15, 2021
Aditya Gourav, Linda Liu, Ankur Gandhe, Yile Gu, Guitang Lan, Xiangyang Huang, Shashank Kalmane, Gautam Tiwari, Denis Filimonov, Ariya Rastrow, Andreas Stolcke, Ivan Bulyko

A Multi Purpose and Large Scale Speech Corpus in Persian and English for Speaker and Speech Recognition: the DeepMine Database

Dec 08, 2019
Hossein Zeinali, Lukáš Burget, Jan "Honza" Černocký

Detecting Unintended Memorization in Language-Model-Fused ASR

Apr 20, 2022
W. Ronny Huang, Steve Chien, Om Thakkar, Rajiv Mathews

SpeedySpeech: Efficient Neural Speech Synthesis

Aug 09, 2020
Jan Vainer, Ondřej Dušek

Enhancing Segment-Based Speech Emotion Recognition by Deep Self-Learning

Mar 30, 2021
Shuiyang Mao, P. C. Ching, Tan Lee
