Romain Serizel

MULTISPEECH

Posterior Transition Modeling for Unsupervised Diffusion-Based Speech Enhancement

Jul 03, 2025

Description and Discussion on DCASE 2025 Challenge Task 4: Spatial Semantic Segmentation of Sound Scenes

Jun 12, 2025

Tracking of Intermittent and Moving Speakers : Dataset and Metrics

Jun 11, 2025

Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models

May 12, 2025

Angular Distance Distribution Loss for Audio Classification

Oct 31, 2024

A decade of DCASE: Achievements, practices, evaluations and future challenges

Oct 07, 2024

Diffusion-based Unsupervised Audio-visual Speech Enhancement

Oct 04, 2024

Domain-Invariant Representation Learning of Bird Sounds

Sep 16, 2024

Energy Consumption Trends in Sound Event Detection Systems

Sep 13, 2024

Normalizing Energy Consumption for Hardware-Independent Evaluation

Sep 09, 2024