"speech": models, code, and papers

Useful Blunders: Can Automated Speech Recognition Errors Improve Downstream Dementia Classification?

Jan 10, 2024
Changye Li, Weizhe Xu, Trevor Cohen, Serguei Pakhomov

Alternative Interfaces for Human-initiated Natural Language Communication and Robot-initiated Haptic Feedback: Towards Better Situational Awareness in Human-Robot Collaboration

Jan 25, 2024
Callum Bennie, Bridget Casey, Cecile Paris, Dana Kulic, Brendan Tidd, Nicholas Lawrance, Alex Pitt, Fletcher Talbot, Jason Williams, David Howard, Pavan Sikka, Hashini Senaratne

Efficiency-oriented approaches for self-supervised speech representation learning

Dec 18, 2023
Luis Lugo, Valentin Vielzeuf

Distinguishing Fictional Voices: a Study of Authorship Verification Models for Quotation Attribution

Jan 30, 2024
Gaspard Michel, Elena V. Epure, Romain Hennequin, Christophe Cerisara

SELM: Speech Enhancement Using Discrete Tokens and Language Models

Dec 15, 2023
Ziqian Wang, Xinfa Zhu, Zihan Zhang, YuanJun Lv, Ning Jiang, Guoqing Zhao, Lei Xie

Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study

Jan 23, 2024
W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath

EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Masked Audio Gesture Modeling

Jan 02, 2024
Haiyang Liu, Zihao Zhu, Giorgio Becherini, Yichen Peng, Mingyang Su, You Zhou, Naoya Iwamoto, Bo Zheng, Michael J. Black

Word-Level ASR Quality Estimation for Efficient Corpus Sampling and Post-Editing through Analyzing Attentions of a Reference-Free Metric

Feb 02, 2024
Golara Javadi, Kamer Ali Yuksel, Yunsu Kim, Thiago Castro Ferreira, Mohamed Al-Badrashiny

Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction

Dec 16, 2023
Zhaoxi Mu, Xinyu Yang, Sining Sun, Qing Yang

Modeling of learning curves with applications to POS tagging

Feb 04, 2024
Manuel Vilares Ferro, Victor M. Darriba Bilbao, Francisco J. Ribadas Pena
