"speech recognition": models, code, and papers

End to End ASR System with Automatic Punctuation Insertion

Dec 03, 2020
Yushi Guan

A Decidability-Based Loss Function

Sep 12, 2021
Pedro Silva, Gladston Moreira, Vander Freitas, Rodrigo Silva, David Menotti, Eduardo Luz

Novel Dual-Channel Long Short-Term Memory Compressed Capsule Networks for Emotion Recognition

Dec 26, 2021
Ismail Shahin, Noor Hindawi, Ali Bou Nassif, Adi Alhudhaif, Kemal Polat

Time-Domain Mapping Based Single-Channel Speech Separation With Hierarchical Constraint Training

Oct 20, 2021
Chenyang Gao, Yue Gu, Ivan Marsic

ISyNet: Convolutional Neural Networks design for AI accelerator

Sep 04, 2021
Alexey Letunovskiy, Vladimir Korviakov, Vladimir Polovnikov, Anastasiia Kargapoltseva, Ivan Mazurenko, Yepan Xiong

End-to-end speech-to-dialog-act recognition

Apr 23, 2020
Viet-Trung Dang, Tianyu Zhao, Sei Ueno, Hirofumi Inaguma, Tatsuya Kawahara

A Study on Lip Localization Techniques used for Lip reading from a Video

Sep 28, 2020
S. D. Lalitha, K. K. Thyagharajan

BSTC: A Large-Scale Chinese-English Speech Translation Dataset

Apr 27, 2021
Ruiqing Zhang, Xiyang Wang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang, Ying Chen, Qinfei Li

A Comprehensive Study of Deep Bidirectional LSTM RNNs for Acoustic Modeling in Speech Recognition

Mar 29, 2017
Albert Zeyer, Patrick Doetsch, Paul Voigtlaender, Ralf Schlüter, Hermann Ney

Improving callsign recognition with air-surveillance data in air-traffic communication

Aug 27, 2021
Iuliia Nigmatulina, Rudolf Braun, Juan Zuluaga-Gomez, Petr Motlicek
