"speech recognition": models, code, and papers

Mixed Precision DNN Quantization for Overlapped Speech Separation and Recognition

Nov 29, 2021
Junhao Xu, Jianwei Yu, Xunying Liu, Helen Meng

Attention-based Multi-hypothesis Fusion for Speech Summarization

Nov 16, 2021
Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe

Loss Landscape Dependent Self-Adjusting Learning Rates in Decentralized Stochastic Gradient Descent

Dec 02, 2021
Wei Zhang, Mingrui Liu, Yu Feng, Xiaodong Cui, Brian Kingsbury, Yuhai Tu

Binary classification of spoken words with passive elastic metastructures

Nov 14, 2021
Tena Dubček, Daniel Moreno-Garcia, Thomas Haag, Henrik R. Thomsen, Theodor S. Becker, Christoph Bärlocher, Fredrik Andersson, Sebastian D. Huber, Dirk-Jan van Manen, Luis Guillermo Villanueva, Johan O. A. Robertsson, Marc Serra-Garcia

Dynamic Layer Customization for Noise Robust Speech Emotion Recognition in Heterogeneous Condition Training

Oct 21, 2020
Alex Wilf, Emily Mower Provost

BSTC: A Large-Scale Chinese-English Speech Translation Dataset

Apr 27, 2021
Ruiqing Zhang, Xiyang Wang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang, Ying Chen, Qinfei Li

speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment

Apr 03, 2021
Junbo Zhang, Zhiwen Zhang, Yongqing Wang, Zhiyong Yan, Qiong Song, Yukai Huang, Ke Li, Daniel Povey, Yujun Wang

Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning

Jul 03, 2020
Pavel Denisov, Ngoc Thang Vu

Knowing What to Listen to: Early Attention for Deep Speech Representation Learning

Sep 03, 2020
Amirhossein Hajavi, Ali Etemad

Speech Emotion Recognition with Dual-Sequence LSTM Architecture

Oct 20, 2019
Jianyou Wang, Michael Xue, Ryan Culhane, Enmao Diao, Jie Ding, Vahid Tarokh
