"speech recognition": models, code, and papers

Semantic-WER: A Unified Metric for the Evaluation of ASR Transcript for End Usability

Jun 03, 2021
Somnath Roy

Via arXiv

EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding

Oct 18, 2015
Yajie Miao, Mohammad Gowayyed, Florian Metze

Via arXiv

ISyNet: Convolutional Neural Networks design for AI accelerator

Sep 04, 2021
Alexey Letunovskiy, Vladimir Korviakov, Vladimir Polovnikov, Anastasiia Kargapoltseva, Ivan Mazurenko, Yepan Xiong

Via arXiv

Time-Domain Mapping Based Single-Channel Speech Separation With Hierarchical Constraint Training

Oct 20, 2021
Chenyang Gao, Yue Gu, Ivan Marsic

Via arXiv

Byakto Speech: Real-time long speech synthesis with convolutional neural network: Transfer learning from English to Bangla

May 31, 2021
Zabir Al Nazi, Sayed Mohammed Tasmimul Huda

Via arXiv

BSTC: A Large-Scale Chinese-English Speech Translation Dataset

Apr 19, 2021
Ruiqing Zhang, Xiyang Wang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang, Ying Chen, Qinfei Li

Via arXiv

Improving callsign recognition with air-surveillance data in air-traffic communication

Aug 27, 2021
Iuliia Nigmatulina, Rudolf Braun, Juan Zuluaga-Gomez, Petr Motlicek

Via arXiv

Novel Dual-Channel Long Short-Term Memory Compressed Capsule Networks for Emotion Recognition

Dec 26, 2021
Ismail Shahin, Noor Hindawi, Ali Bou Nassif, Adi Alhudhaif, Kemal Polat

Via arXiv

Blind and neural network-guided convolutional beamformer for joint denoising, dereverberation, and source separation

Aug 04, 2021
Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Shoko Araki

Via arXiv

A Review of Speaker Diarization: Recent Advances with Deep Learning

Jan 24, 2021
Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J. Han, Shinji Watanabe, Shrikanth Narayanan

Via arXiv