speech


Beyond Descriptions: A Generative Scene2Audio Framework for Blind and Low-Vision Users to Experience Vista Landscapes

Mar 28, 2026
Via arXiv

SCOPE: Tree-based Self-Correcting Online Log Parsing via Syntactic-Semantic Collaboration

Mar 28, 2026
Via arXiv

Pashto Common Voice: Building the First Open Speech Corpus for a 60-Million-Speaker Low-Resource Language

Mar 27, 2026
Via arXiv

Introducing MELI: the Mandarin-English Language Interview Corpus

Mar 27, 2026
Via arXiv

Beyond Language: Grounding Referring Expressions with Hand Pointing in Egocentric Vision

Mar 27, 2026
Via arXiv

A Power-Weighted Noncentral Complex Gaussian Distribution

Mar 27, 2026
Via arXiv

HolisticSemGes: Semantic Grounding of Holistic Co-Speech Gesture Generation with Contrastive Flow-Matching

Mar 27, 2026
Via arXiv

Cinematic Audio Source Separation Using Visual Cues

Mar 27, 2026
Via arXiv

Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan

Mar 27, 2026
Via arXiv

Evaluating Interactive 2D Visualization as a Sample Selection Strategy for Biomedical Time-Series Data Annotation

Mar 27, 2026
Via arXiv