speech


Positional Encoding in the Context of Memristor-Based Analog Computation for Automatic Speech Recognition

Jun 11, 2026
Via arXiv

Towards Personalized Federated Learning for Dysarthric Speech Recognition

Jun 11, 2026
Via arXiv

Generating Training Targets for Real-World Speech Enhancement via Close-to-Distant Microphone Projection

Jun 11, 2026
Via arXiv

Balancing ASR and diarization in end-to-end LLMs for multi-talker speech recognition

Jun 11, 2026
Via arXiv

Predicting Cognitive Load from Speech and Interaction Dynamics in Dyadic Conversations

Jun 11, 2026
Via arXiv

Self-Guidance: Enhancing Neural Codecs via Decoder Manifold Alignment

Jun 11, 2026
Via arXiv

PiDA: Phonetically-Informed Data Augmentation for Robust Vietnamese Speech Translation

Jun 11, 2026
Via arXiv

TimeLens: On-Device Artifact Recognition with Retrieval-Augmented Question Answering for the Grand Egyptian Museum

Jun 11, 2026
Via arXiv

ReFree: Towards Realistic Co-Speech Video Generation via Reward-Free RL and Multilevel Speech Guidance

Jun 11, 2026
Via arXiv

From Tokens to Faces: Investigating Discrete Speech Representations for 3D Facial Animation

Jun 11, 2026
Via arXiv