speech


Beyond Language: Grounding Referring Expressions with Hand Pointing in Egocentric Vision

Mar 27, 2026

Evaluating Interactive 2D Visualization as a Sample Selection Strategy for Biomedical Time-Series Data Annotation

Mar 27, 2026

JAL-Turn: Joint Acoustic-Linguistic Modeling for Real-Time and Robust Turn-Taking Detection in Full-Duplex Spoken Dialogue Systems

Mar 27, 2026

A Power-Weighted Noncentral Complex Gaussian Distribution

Mar 27, 2026

HolisticSemGes: Semantic Grounding of Holistic Co-Speech Gesture Generation with Contrastive Flow-Matching

Mar 27, 2026

Cinematic Audio Source Separation Using Visual Cues

Mar 27, 2026

findsylls: A Language-Agnostic Toolkit for Syllable-Level Speech Tokenization and Embedding

Mar 27, 2026

Distilling Conversations: Abstract Compression of Conversational Audio Context for LLM-based ASR

Mar 27, 2026

Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan

Mar 27, 2026

Analysing Calls to Order in German Parliamentary Debates

Mar 27, 2026