"speech recognition": models, code, and papers

Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

Jan 05, 2024
Kevin Everson, Yile Gu, Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-yi Lee, Ariya Rastrow, Andreas Stolcke

The Art of Deception: Robust Backdoor Attack using Dynamic Stacking of Triggers

Jan 03, 2024
Orson Mengara

High-precision Voice Search Query Correction via Retrievable Speech-text Embedings

Jan 08, 2024
Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, Andrew Rosenberg, Zhehuai Chen, Zelin Wu, Leonid Velikovich, Pat Rondon, Diamantino Caseiro, Petar Aleksic

Towards Online Sign Language Recognition and Translation

Jan 10, 2024
Ronglai Zuo, Fangyun Wei, Brian Mak

MS-SENet: Enhancing Speech Emotion Recognition Through Multi-scale Feature Fusion With Squeeze-and-excitation Blocks

Dec 25, 2023
Mengbo Li, Yuanzhong Zheng, Dichucheng Li, Yulun Wu, Yaoxuan Wang, Haojun Fei

Pseudo-Labeling for Domain-Agnostic Bangla Automatic Speech Recognition

Nov 06, 2023
Rabindra Nath Nandi, Mehadi Hasan Menon, Tareq Al Muntasir, Sagor Sarker, Quazi Sarwar Muhtaseem, Md. Tariqul Islam, Shammur Absar Chowdhury, Firoj Alam

Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation

Nov 09, 2023
Zhaofeng Lin, Tanvina Patel, Odette Scharenborg

Self-Supervised Adaptive AV Fusion Module for Pre-Trained ASR Models

Dec 21, 2023
Christopher Simic, Tobias Bocklet

TeLeS: Temporal Lexeme Similarity Score to Estimate Confidence in End-to-End ASR

Jan 06, 2024
Nagarathna Ravi, Thishyan Raj T, Vipul Arora

Back Transcription as a Method for Evaluating Robustness of Natural Language Understanding Models to Speech Recognition Errors

Oct 25, 2023
Marek Kubis, Paweł Skórzewski, Marcin Sowański, Tomasz Ziętkiewicz
