"speech recognition": models, code, and papers

STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition

Feb 02, 2024
Yi Chang, Zhao Ren, Zixing Zhang, Xin Jing, Kun Qian, Xi Shao, Bin Hu, Tanja Schultz, Björn W. Schuller

Via arXiv

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

Jan 03, 2024
Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Shivani Agrawal, Zhonglin Han, Jian Li, Amir Yazdanbakhsh

Via arXiv

Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition

Jan 19, 2024
Ismail Rasim Ulgen, Zongyang Du, Carlos Busso, Berrak Sisman

Via arXiv

VLSP 2023 -- LTER: A Summary of the Challenge on Legal Textual Entailment Recognition

Mar 06, 2024
Vu Tran, Ha-Thanh Nguyen, Trung Vo, Son T. Luu, Hoang-Anh Dang, Ngoc-Cam Le, Thi-Thuy Le, Minh-Tien Nguyen, Truong-Son Nguyen, Le-Minh Nguyen

Via arXiv

Ain't Misbehavin' -- Using LLMs to Generate Expressive Robot Behavior in Conversations with the Tabletop Robot Haru

Feb 18, 2024
Zining Wang, Paul Reisert, Eric Nichols, Randy Gomez

Via arXiv

BANSpEmo: A Bangla Emotional Speech Recognition Dataset

Dec 21, 2023
Md Gulzar Hussain, Mahmuda Rahman, Babe Sultana, Ye Shiren

Via arXiv

Streaming Sequence Transduction through Dynamic Compression

Feb 02, 2024
Weiting Tan, Yunmo Chen, Tongfei Chen, Guanghui Qin, Haoran Xu, Heidi C. Zhang, Benjamin Van Durme, Philipp Koehn

Via arXiv

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity

Feb 13, 2024
Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen

Via arXiv

Significance of Chirp MFCC as a Feature in Speech and Audio Applications

Feb 19, 2024
S. Johanan Joysingh, P. Vijayalakshmi, T. Nagarajan

Via arXiv

Syllable based DNN-HMM Cantonese Speech to Text System

Feb 13, 2024
Timothy Wong, Claire Li, Sam Lam, Billy Chiu, Qin Lu, Minglei Li, Dan Xiong, Roy Shing Yu, Vincent T. Y. Ng

Via arXiv