speech


Is Text All You Need? Text as a Universal Information Bottleneck for Speech LLMs

Jun 08, 2026
Via arXiv

Rethinking Depth: A study of the Recursive-Transformer for Speech Recognition

Jun 08, 2026
Via arXiv

A study on the impact of region specific data on the performance of Indic ASR

Jun 08, 2026
Via arXiv

Factors affecting ASR performance: A study using state of the art ASR models in Indic Languages

Jun 08, 2026
Via arXiv

A Comparative Study of Pre-trained Speech Encoders and Training Objectives for Large-Scale Indic Spoken Language Identification

Jun 08, 2026
Via arXiv

NüshuVoice: Reviving the Voice of Endangered Nüshu with Pitch-Aware Text-to-Speech

Jun 08, 2026
Via arXiv

Multi-View Speech Representation Learning for Parkinson's Disease Detection Using Context-guided Cross-modal Attention

Jun 08, 2026
Via arXiv

End-to-End Training for Discrete Token LLM based TTS System

Jun 08, 2026
Via arXiv

HoliDubber: Holistic Video Dubbing for Complex Acoustic Scenes via Text-Guided Audio Synthesis

Jun 08, 2026
Via arXiv

BareWave: Waveform-Native Flow-Matching Text-to-Speech

Jun 08, 2026
Via arXiv