speech


Human-1 by Josh Talks: A Full-Duplex Conversational Modeling Framework in Hindi using Real-World Conversations

Apr 25, 2026

Au-M-ol: A Unified Model for Medical Audio and Language Understanding

Apr 25, 2026

Spectro-Temporal Modulation Representation Framework for Human-Imitated Speech Detection

Apr 25, 2026

Measuring Temporal Linguistic Emergence in Diffusion Language Models

Apr 25, 2026

Inter-Stance: A Dyadic Multimodal Corpus for Conversational Stance Analysis

Apr 24, 2026

Identifying and typifying demographic unfairness in phoneme-level embeddings of self-supervised speech recognition models

Apr 24, 2026

DM-ASR: Diarization-aware Multi-speaker ASR with Large Language Models

Apr 24, 2026

A Brain-Inspired Deep Separation Network for Single Channel Raman Spectra Unmixing

Apr 24, 2026

TTS-PRISM: A Perceptual Reasoning and Interpretable Speech Model for Fine-Grained Diagnosis

Apr 24, 2026

UniSonate: A Unified Model for Speech, Music, and Sound Effect Generation with Text Instructions

Apr 24, 2026