"speech": models, code, and papers

A RAG-based Question Answering System Proposal for Understanding Islam: MufassirQAS LLM

Feb 01, 2024
Ahmet Yusuf Alan, Enis Karaarslan, Ömer Aydin

ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks

Jan 29, 2024
Bolei Ma, Ercong Nie, Shuzhou Yuan, Helmut Schmid, Michael Färber, Frauke Kreuter, Hinrich Schütze

Cascaded Cross-Modal Transformer for Audio-Textual Classification

Jan 15, 2024
Nicolae-Catalin Ristea, Andrei Anghel, Radu Tudor Ionescu

Attention-Guided Adaptation for Code-Switching Speech Recognition

Dec 14, 2023
Bobbi Aditya, Mahdin Rohmatillah, Liang-Hsuan Tai, Jen-Tzung Chien

cantnlp@LT-EDI-2024: Automatic Detection of Anti-LGBTQ+ Hate Speech in Under-resourced Languages

Jan 28, 2024
Sidney G.-J. Wong, Matthew Durward

Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue

Jan 17, 2024
Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-yi Lee, Ivan Bulyko

Acoustic models of Brazilian Portuguese Speech based on Neural Transformers

Dec 14, 2023
Marcelo Matheus Gauy, Marcelo Finger

Decoding Envelope and Frequency-Following EEG Responses to Continuous Speech Using Deep Neural Networks

Dec 15, 2023
Mike Thornton, Danilo Mandic, Tobias Reichenbach

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

Dec 22, 2023
Anirudh S. Sundar, Chao-Han Huck Yang, David M. Chan, Shalini Ghosh, Venkatesh Ravichandran, Phani Sankar Nidadavolu

Large Language Models for Multi-Modal Human-Robot Interaction

Jan 26, 2024
Chao Wang, Stephan Hasler, Daniel Tanneberg, Felix Ocker, Frank Joublin, Antonello Ceravola, Joerg Deigmoeller, Michael Gienger
