speech


The 2025 PNPL Competition: Speech Detection and Phoneme Classification in the LibriBrain Dataset

Jun 11, 2025
Via arXiv

FedMLAC: Mutual Learning Driven Heterogeneous Federated Audio Classification

Jun 11, 2025
Via arXiv

A Study on Speech Assessment with Visual Cues

Jun 11, 2025
Via arXiv

OWSM-Biasing: Contextualizing Open Whisper-Style Speech Models for Automatic Speech Recognition with Dynamic Vocabulary

Jun 11, 2025
Via arXiv

Ming-Omni: A Unified Multimodal Model for Perception and Generation

Jun 11, 2025
Via arXiv

Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction

Jun 11, 2025
Via arXiv

Recognizing Every Voice: Towards Inclusive ASR for Rural Bhojpuri Women

Jun 11, 2025
Via arXiv

Improved in-car sound pick-up using multichannel Wiener filter

Jun 11, 2025
Via arXiv

Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval

Jun 11, 2025
Via arXiv

ToxSyn-PT: A Large-Scale Synthetic Dataset for Hate Speech Detection in Portuguese

Jun 11, 2025
Via arXiv