"speech recognition": models, code, and papers

A cross-talk robust multichannel VAD model for multiparty agent interactions trained using synthetic re-recordings

Feb 15, 2024
Hyewon Han, Naveen Kumar

VLSP 2023 -- LTER: A Summary of the Challenge on Legal Textual Entailment Recognition

Mar 06, 2024
Vu Tran, Ha-Thanh Nguyen, Trung Vo, Son T. Luu, Hoang-Anh Dang, Ngoc-Cam Le, Thi-Thuy Le, Minh-Tien Nguyen, Truong-Son Nguyen, Le-Minh Nguyen

Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition

Feb 04, 2024
Alexandra Saliba, Yuanchao Li, Ramon Sanabria, Catherine Lai

Digits micro-model for accurate and secure transactions

Feb 02, 2024
Chirag Chhablani, Nikhita Sharma, Jordan Hosier, Vijay K. Gurbani

Ain't Misbehavin' -- Using LLMs to Generate Expressive Robot Behavior in Conversations with the Tabletop Robot Haru

Feb 18, 2024
Zining Wang, Paul Reisert, Eric Nichols, Randy Gomez

STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition

Feb 02, 2024
Yi Chang, Zhao Ren, Zixing Zhang, Xin Jing, Kun Qian, Xi Shao, Bin Hu, Tanja Schultz, Björn W. Schuller

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity

Feb 13, 2024
Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen

Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition

Jan 19, 2024
Ismail Rasim Ulgen, Zongyang Du, Carlos Busso, Berrak Sisman

Leveraging Language ID to Calculate Intermediate CTC Loss for Enhanced Code-Switching Speech Recognition

Dec 15, 2023
Tzu-Ting Yang, Hsin-Wei Wang, Berlin Chen

Streaming Sequence Transduction through Dynamic Compression

Feb 02, 2024
Weiting Tan, Yunmo Chen, Tongfei Chen, Guanghui Qin, Haoran Xu, Heidi C. Zhang, Benjamin Van Durme, Philipp Koehn
