"speech recognition": models, code, and papers

Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

Jan 05, 2024
Kevin Everson, Yile Gu, Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-yi Lee, Ariya Rastrow, Andreas Stolcke

The Art of Deception: Robust Backdoor Attack using Dynamic Stacking of Triggers

Jan 03, 2024
Orson Mengara

ED-TTS: Multi-Scale Emotion Modeling using Cross-Domain Emotion Diarization for Emotional Speech Synthesis

Jan 16, 2024
Haobin Tang, Xulong Zhang, Ning Cheng, Jing Xiao, Jianzong Wang

End-to-End Speech Recognition Contextualization with Large Language Models

Sep 19, 2023
Egor Lakomkin, Chunyang Wu, Yassir Fathullah, Ozlem Kalinli, Michael L. Seltzer, Christian Fuegen

Gaussian Adaptive Attention is All You Need: Robust Contextual Representations Across Multiple Modalities

Jan 31, 2024
Georgios Ioannides, Aman Chadha, Aaron Elkins

High-precision Voice Search Query Correction via Retrievable Speech-text Embedings

Jan 08, 2024
Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, Andrew Rosenberg, Zhehuai Chen, Zelin Wu, Leonid Velikovich, Pat Rondon, Diamantino Caseiro, Petar Aleksic

Towards Online Sign Language Recognition and Translation

Jan 10, 2024
Ronglai Zuo, Fangyun Wei, Brian Mak

Multimodal Speech Emotion Recognition Using Modality-specific Self-Supervised Frameworks

Dec 04, 2023
Rutherford Agbeshi Patamia, Paulo E. Santos, Kingsley Nketia Acheampong, Favour Ekong, Kwabena Sarpong, She Kun

Keyword spotting -- Detecting commands in speech using deep learning

Dec 09, 2023
Sumedha Rai, Tong Li, Bella Lyu

SpokesBiz -- an Open Corpus of Conversational Polish

Dec 19, 2023
Piotr Pęzik, Sylwia Karasińska, Anna Cichosz, Łukasz Jałowiecki, Konrad Kaczyński, Małgorzata Krawentek, Karolina Walkusz, Paweł Wilk, Mariusz Kleć, Krzysztof Szklanny, Szymon Marszałkowski
