"speech recognition": models, code, and papers

Robustness Analysis of Deep Learning Frameworks on Mobile Platforms

Sep 20, 2021
Amin Eslami Abyane, Hadi Hemmati

Cross-domain Single-channel Speech Enhancement Model with Bi-projection Fusion Module for Noise-robust ASR

Aug 26, 2021
Fu-An Chao, Jeih-weih Hung, Berlin Chen

MeetDot: Videoconferencing with Live Translation Captions

Sep 20, 2021
Arkady Arkhangorodsky, Christopher Chu, Scot Fang, Yiqi Huang, Denglin Jiang, Ajay Nagesh, Boliang Zhang, Kevin Knight

All-neural beamformer for continuous speech separation

Oct 13, 2021
Zhuohuang Zhang, Takuya Yoshioka, Naoyuki Kanda, Zhuo Chen, Xiaofei Wang, Dongmei Wang, Sefik Emre Eskimez

Hybrid Fusion Based Interpretable Multimodal Emotion Recognition with Insufficient Labelled Data

Aug 24, 2022
Puneet Kumar, Sarthak Malik, Balasubramanian Raman

CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition

Jun 14, 2021
Rupak Vignesh Swaminathan, Brian King, Grant P. Strimel, Jasha Droppo, Athanasios Mouchtaris

Improving Character Error Rate Is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-box Acoustic Models

Oct 12, 2021
Ryosuke Sawata, Yosuke Kashiwagi, Shusuke Takahashi

BERTraffic: A Robust BERT-Based Approach for Speaker Change Detection and Role Identification of Air-Traffic Communications

Oct 12, 2021
Juan Zuluaga-Gomez, Seyyed Saeed Sarfjoo, Amrutha Prasad, Iuliia Nigmatulina, Petr Motlicek, Oliver Ohneiser, Hartmut Helmke

UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training

Oct 12, 2021
Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu

Multitask-Based Joint Learning Approach To Robust ASR For Radio Communication Speech

Jul 22, 2021
Duo Ma, Nana Hou, Van Tung Pham, Haihua Xu, Eng Siong Chng
