"speech recognition": models, code, and papers

Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation

Jan 18, 2024
Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Björn Schuller, Wenming Zheng

Via arXiv

Hourglass-AVSR: Down-Up Sampling-based Computational Efficiency Model for Audio-Visual Speech Recognition

Dec 14, 2023
Fan Yu, Haoxu Wang, Ziyang Ma, Shiliang Zhang

Via arXiv

Punctuation Restoration Improves Structure Understanding without Supervision

Feb 21, 2024
Junghyun Min, Minho Lee, Woochul Lee, Yeonsoo Lee

Via arXiv

Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition

Dec 20, 2023
Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

Via arXiv

Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege

Jan 28, 2024
Peng Huang, Yao Wei, Peng Cheng, Zhongjie Ba, Li Lu, Feng Lin, Yang Wang, Kui Ren

Via arXiv

On Speaker Attribution with SURT

Jan 28, 2024
Desh Raj, Matthew Wiesner, Matthew Maciejewski, Leibny Paola Garcia-Perera, Daniel Povey, Sanjeev Khudanpur

Via arXiv

Personalized Large Language Models

Feb 14, 2024
Stanisław Woźniak, Bartłomiej Koptyra, Arkadiusz Janz, Przemysław Kazienko, Jan Kocoń

Via arXiv

Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases

Feb 01, 2024
Giulio Zhou, Tsz Kin Lam, Alexandra Birch, Barry Haddow

Via arXiv

Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation

Jan 07, 2024
Qiushi Zhu, Jie Zhang, Yu Gu, Yuchen Hu, Lirong Dai

Via arXiv

CNN architecture extraction on edge GPU

Jan 24, 2024
Peter Horvath, Lukasz Chmielewski, Leo Weissbart, Lejla Batina, Yuval Yarom

Via arXiv