speech


DiBA: Diagonal and Binary Matrix Approximation for Neural Network Weight Compression

May 07, 2026
Via arXiv

Minimizing Modality Gap from the Input Side: Your Speech LLM Can Be a Prosody-Aware Text LLM

May 07, 2026
Via arXiv

The Pinocchio Dimension: Phenomenality of Experience as the Primary Axis of LLM Psychometric Differences

May 06, 2026
Via arXiv

JASTIN: Aligning LLMs for Zero-Shot Audio and Speech Evaluation via Natural Language Instructions

May 06, 2026
Via arXiv

A Comparative Study of PyCaret AutoML and CNN-BiLSTM for Binary Hate Speech Detection in Indonesian Twitter

May 06, 2026
Via arXiv

Spatial-Magnifier: Spatial upsampling for multichannel speech enhancement

May 06, 2026
Via arXiv

TajikNLP: An Open-Source Toolkit for Comprehensive Text Processing of Tajik (Cyrillic Script)

May 06, 2026
Via arXiv

Benchmarking POS Tagging for the Tajik Language: A Comparative Study of Neural Architectures on the TajPersParallel Corpus

May 06, 2026
Via arXiv

Audio-Visual Intelligence in Large Foundation Models

May 05, 2026
Via arXiv

MiniMind-O Technical Report: An Open Small-Scale Speech-Native Omni Model

May 05, 2026
Via arXiv