"speech recognition": models, code, and papers

The timing bottleneck: Why timing and overlap are mission-critical for conversational user interfaces, speech recognition and dialogue systems

Jul 28, 2023
Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse

Globally Normalising the Transducer for Streaming Speech Recognition

Jul 20, 2023
Rogier van Dalen

End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation

Nov 01, 2023
Juan Zuluaga-Gomez, Zhaocheng Huang, Xing Niu, Rohit Paturi, Sundararajan Srinivasan, Prashant Mathur, Brian Thompson, Marcello Federico

Noise robust speech emotion recognition with signal-to-noise ratio adapting speech enhancement

Sep 03, 2023
Yu-Wen Chen, Julia Hirschberg, Yu Tsao

Towards Stealthy Backdoor Attacks against Speech Recognition via Elements of Sound

Jul 17, 2023
Hanbo Cai, Pengcheng Zhang, Hai Dong, Yan Xiao, Stefanos Koffas, Yiming Li

Differential Evolution Algorithm based Hyper-Parameters Selection of Convolutional Neural Network for Speech Command Recognition

Oct 13, 2023
Sandipan Dhar, Anuvab Sen, Aritra Bandyopadhyay, Nanda Dulal Jana, Arjun Ghosh, Zahra Sarayloo

SparseVSR: Lightweight and Noise Robust Visual Speech Recognition

Jul 10, 2023
Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma, Alexandros Haliassos, Stavros Petridis, Maja Pantic

OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment

Jun 10, 2023
Xize Cheng, Tao Jin, Linjun Li, Wang Lin, Xinyu Duan, Zhou Zhao

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition

Jun 18, 2023
Yuchen Hu, Ruizhe Li, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond

Oct 09, 2023
Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe
