Wenwu Wang

Explainable AI in Speaker Recognition -- Attention Map Visualisation and Evaluation

Jun 22, 2026
Via arXiv

Hybrid Diffusion Transformer for Instruction-Guided Audio Editing via Rectified Flow

Jun 18, 2026
Via arXiv

COMET: Concept Space Dissection of the Modality Gap in Audio-Text Multimodal Contrastive Embeddings

May 28, 2026
Via arXiv

ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity

May 05, 2026
Via arXiv

Explainable AI in Speaker Recognition -- Making Latent Representations Understandable

Apr 25, 2026
Via arXiv

Out of Context: Reliability in Multimodal Anomaly Detection Requires Contextual Inference

Apr 14, 2026
Via arXiv

FoleyDesigner: Immersive Stereo Foley Generation with Precise Spatio-Temporal Alignment for Film Clips

Apr 07, 2026
Via arXiv

The Interspeech 2026 Audio Encoder Capability Challenge for Large Audio Language Models

Mar 24, 2026
Via arXiv

BioDCASE 2026 Challenge Baseline for Cross-Domain Mosquito Species Classification

Mar 20, 2026
Via arXiv

RFM-Editing: Rectified Flow Matching for Text-guided Audio Editing

Sep 17, 2025
Via arXiv