"speech": models, code, and papers

Deep Graph Random Process for Relational-Thinking-Based Speech Recognition

Jul 08, 2020
Hengguan Huang, Fuzhao Xue, Hao Wang, Ye Wang

Communication conditions in virtual acoustic scenes in an underground station

Jun 30, 2021
Ľuboš Hládek, Stephan D. Ewert, Bernhard U. Seeber

TOD-DA: Towards Boosting the Robustness of Task-oriented Dialogue Modeling on Spoken Conversations

Dec 23, 2021
Xin Tian, Xinxian Huang, Dongfeng He, Yingzhan Lin, Siqi Bao, Huang He, Liankai Huang, Qiang Ju, Xiyuan Zhang, Jian Xie, Shuqi Sun, Fan Wang, Hua Wu, Haifeng Wang

Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache (Speech Synthesis: State of the Art in English and German)

Jun 11, 2021
René Peinl

InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition

Dec 23, 2021
Andreea Glavan, Estefania Talavera

Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition

Oct 22, 2020
Yangyang Shi, Yongqiang Wang, Chunyang Wu, Ching-Feng Yeh, Julian Chan, Frank Zhang, Duc Le, Mike Seltzer

Improved Noisy Student Training for Automatic Speech Recognition

May 19, 2020
Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le

Detecting Parkinson's Disease from Speech-task in an accessible and interpretable manner

Sep 02, 2020
Wasifur Rahman, Sangwu Lee, Md. Saiful Islam, Abdullah Al Mamun, Victor Antony, Harshil Ratnu, Mohammad Rafayet Ali, Ehsan Hoque

Speaker diarization assisted ASR for multi-speaker conversations

Apr 05, 2021
Srikanth Raj Chetupalli, Sriram Ganapathy

Captcha Attack: Turning Captchas Against Humanity

Jan 13, 2022
Mauro Conti, Luca Pajola, Pier Paolo Tricomi
