"speech": models, code, and papers
SpiRit-LM: Interleaved Spoken and Written Language Model

Feb 08, 2024
Tu Anh Nguyen, Benjamin Muller, Bokai Yu, Marta R. Costa-jussa, Maha Elbayad, Sravya Popuri, Paul-Ambroise Duquenne, Robin Algayres, Ruslan Mavlyutov, Itai Gat, Gabriel Synnaeve, Juan Pino, Benoit Sagot, Emmanuel Dupoux

Soft-Weighted CrossEntropy Loss for Continous Alzheimer's Disease Detection

Feb 19, 2024
Xiaohui Zhang, Wenjie Fu, Mangui Liang

A Phoneme-Scale Assessment of Multichannel Speech Enhancement Algorithms

Jan 24, 2024
Nasser-Eddine Monir, Paul Magron, Romain Serizel

Exploring the Adversarial Capabilities of Large Language Models

Feb 15, 2024
Lukas Struppek, Minh Hieu Le, Dominik Hintersdorf, Kristian Kersting

HINT: High-quality INPainting Transformer with Mask-Aware Encoding and Enhanced Attention

Feb 22, 2024
Shuang Chen, Amir Atapour-Abarghouei, Hubert P. H. Shum

Contrastive Learning of Shared Spatiotemporal EEG Representations Across Individuals for Naturalistic Neuroscience

Feb 22, 2024
Xinke Shen, Lingyi Tao, Xuyang Chen, Sen Song, Quanying Liu, Dan Zhang

MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms

Feb 21, 2024
Yiqiao Jin, Minje Choi, Gaurav Verma, Jindong Wang, Srijan Kumar

Backdoor Attacks on Dense Passage Retrievers for Disseminating Misinformation

Feb 21, 2024
Quanyu Long, Yue Deng, LeiLei Gan, Wenya Wang, Sinno Jialin Pan

The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese

Feb 12, 2024
Ajinkya Kulkarni, Anna Tokareva, Rameez Qureshi, Miguel Couceiro

Robust Dual-Modal Speech Keyword Spotting for XR Headsets

Jan 26, 2024
Zhuojiang Cai, Yuhan Ma, Feng Lu