
Erik Visser

Spatial Audio Question Answering and Reasoning on Dynamic Source Movements

Feb 18, 2026
Via arXiv

Proactive Conversational Assistant for a Procedural Manual Task based on Audio and IMU

Feb 17, 2026
Via arXiv

LongAudio-RAG: Event-Grounded Question Answering over Multi-Hour Long Audio

Feb 16, 2026
Via arXiv

Mitigating Intra-Speaker Variability in Diarization with Style-Controllable Speech Augmentation

Sep 18, 2025
Via arXiv

Aligning Audio Captions with Human Preferences

Sep 18, 2025
Via arXiv

Spatial Audio Motion Understanding and Reasoning

Sep 18, 2025
Via arXiv

Voice-ENHANCE: Speech Restoration using a Diffusion-based Voice Conversion Framework

May 21, 2025
Via arXiv

Comprehensive Audio Query Handling System with Integrated Expert Models and Contextual Understanding

Dec 05, 2024
Via arXiv

Confidence Calibration for Audio Captioning Models

Sep 13, 2024
Via arXiv

Enhancing Temporal Understanding in Audio Question Answering for Large Audio Language Models

Sep 10, 2024
Via arXiv