"speech": models, code, and papers

Scaling Up Video Summarization Pretraining with Large Language Models

Apr 04, 2024
Dawit Mureja Argaw, Seunghyun Yoon, Fabian Caba Heilbron, Hanieh Deilamsalehy, Trung Bui, Zhaowen Wang, Franck Dernoncourt, Joon Son Chung

Boosting keyword spotting through on-device learnable user speech characteristics

Mar 12, 2024
Cristian Cioflan, Lukas Cavigelli, Luca Benini

Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks

Mar 08, 2024
Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen, Patrick A. Naylor

CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models

Apr 03, 2024
Zaid Sheikh, Antonios Anastasopoulos, Shruti Rijhwani, Lindia Tjuatja, Robbie Jimerson, Graham Neubig

M3TCM: Multi-modal Multi-task Context Model for Utterance Classification in Motivational Interviews

Apr 04, 2024
Sayed Muddashir Hossain, Jan Alexandersson, Philipp Müller

Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community

Apr 01, 2024
Casey Kennington, Malihe Alikhani, Heather Pon-Barry, Katherine Atwell, Yonatan Bisk, Daniel Fried, Felix Gervits, Zhao Han, Mert Inan, Michael Johnston, Raj Korpan, Diane Litman, Matthew Marge, Cynthia Matuszek, Ross Mead, Shiwali Mohan, Raymond Mooney, Natalie Parde, Jivko Sinapov, Angela Stewart, Matthew Stone, Stefanie Tellex, Tom Williams

Encoding of lexical tone in self-supervised models of spoken language

Apr 03, 2024
Gaofei Shen, Michaela Watkins, Afra Alishahi, Arianna Bisazza, Grzegorz Chrupała

EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech

Mar 13, 2024
Ziqi Liang, Haoxiang Shi, Jiawei Wang, Keda Lu

Preuve de concept d'un bot vocal dialoguant en wolof

Apr 02, 2024
Elodie Gauthier, Papa-Séga Wade, Thierry Moudenc, Patrice Collen, Emilie De Neef, Oumar Ba, Ndeye Khoyane Cama, Cheikh Ahmadou Bamba Kebe, Ndeye Aissatou Gningue, Thomas Mendo'o Aristide

Chinese Offensive Language Detection: Current Status and Future Directions

Mar 29, 2024
Yunze Xiao, Houda Bouamor, Wajdi Zaghouani