Wenwu Wang

The Interspeech 2026 Audio Encoder Capability Challenge for Large Audio Language Models

Mar 24, 2026

BioDCASE 2026 Challenge Baseline for Cross-Domain Mosquito Species Classification

Mar 20, 2026

Teacher-Guided Pseudo Supervision and Cross-Modal Alignment for Audio-Visual Video Parsing

Sep 17, 2025

RFM-Editing: Rectified Flow Matching for Text-guided Audio Editing

Sep 17, 2025

Region-Specific Audio Tagging for Spatial Sound

Sep 11, 2025

TEn-CATS: Text-Enriched Audio-Visual Video Parsing with Multi-Scale Category-Aware Temporal Graph

Sep 04, 2025

AudioTurbo: Fast Text-to-Audio Generation with Rectified Diffusion

May 28, 2025

EnvSDD: Benchmarking Environmental Sound Deepfake Detection

May 25, 2025

From Aesthetics to Human Preferences: Comparative Perspectives of Evaluating Text-to-Music Systems

Apr 30, 2025

Exploring the User Experience of AI-Assisted Sound Searching Systems for Creative Workflows

Apr 22, 2025