Wenwu Wang

ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity

May 05, 2026
Via arXiv

Explainable AI in Speaker Recognition -- Making Latent Representations Understandable

Apr 25, 2026
Via arXiv

Out of Context: Reliability in Multimodal Anomaly Detection Requires Contextual Inference

Apr 14, 2026
Via arXiv

FoleyDesigner: Immersive Stereo Foley Generation with Precise Spatio-Temporal Alignment for Film Clips

Apr 07, 2026
Via arXiv

The Interspeech 2026 Audio Encoder Capability Challenge for Large Audio Language Models

Mar 24, 2026
Via arXiv

BioDCASE 2026 Challenge Baseline for Cross-Domain Mosquito Species Classification

Mar 20, 2026
Via arXiv

RFM-Editing: Rectified Flow Matching for Text-guided Audio Editing

Sep 17, 2025
Via arXiv

Teacher-Guided Pseudo Supervision and Cross-Modal Alignment for Audio-Visual Video Parsing

Sep 17, 2025
Via arXiv

Region-Specific Audio Tagging for Spatial Sound

Sep 11, 2025
Via arXiv

TEn-CATS: Text-Enriched Audio-Visual Video Parsing with Multi-Scale Category-Aware Temporal Graph

Sep 04, 2025
Via arXiv