Speaker Identification


POLY-SIM: Polyglot Speaker Identification with Missing Modality Grand Challenge 2026 Evaluation Plan

Mar 25, 2026
Via arXiv

HELIX: Scaling Raw Audio Understanding with Hybrid Mamba-Attention Beyond the Quadratic Limit

Mar 22, 2026
Via arXiv

Evaluating a Multi-Agent Voice-Enabled Smart Speaker for Care Homes: A Safety-Focused Framework

Mar 24, 2026
Via arXiv

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Mar 17, 2026
Via arXiv

Quantifying Cross-Lingual Transfer in Paralinguistic Speech Tasks

Mar 09, 2026
Via arXiv

Visual-Informed Speech Enhancement Using Attention-Based Beamforming

Mar 05, 2026
Via arXiv

Benchmarking Speech Systems for Frontline Health Conversations: The DISPLACE-M Challenge

Mar 05, 2026
Via arXiv

D-ORCA: Dialogue-Centric Optimization for Robust Audio-Visual Captioning

Feb 08, 2026
Via arXiv

I can tell whether you are a Native Hawlêri Speaker! How ANN, CNN, and RNN perform in NLI-Native Language Identification

Feb 11, 2026
Via arXiv

Hermes the Polyglot: A Unified Framework to Enhance Expressiveness for Multimodal Interlingual Subtitling

Jan 31, 2026
Via arXiv