Diego Caviedes-Nozal

Speech Enhancement Based on Drifting Models

Apr 27, 2026

Liang Xu, Diego Caviedes-Nozal, Bastiaan Kleijn, Longfei Felix Yan, Rasmus Kongsgaard Olsson

Abstract: We propose Speech Enhancement based on Drifting Models (DriftSE), a novel generative framework that formulates denoising as an equilibrium problem. Rather than relying on iterative sampling, DriftSE natively achieves one-step inference by evolving the pushforward distribution of a mapping function to directly match the clean speech distribution. This evolution is driven by a Drifting Field, a learned correction vector that guides samples toward the high-density regions of the clean distribution. Because the method matches distributions rather than paired samples, it naturally facilitates training on unpaired data. We investigate the framework under two formulations: a direct mapping from the noisy observation, and a stochastic conditional generative model from a Gaussian prior. Experiments on the VoiceBank-DEMAND benchmark demonstrate that DriftSE achieves high-fidelity enhancement in a single step, outperforming multi-step diffusion baselines and establishing a new paradigm for speech enhancement.

* 6 pages, 2 figures

Room impulse response reconstruction with physics-informed deep learning

Jan 02, 2024

Xenofon Karakonstantis, Diego Caviedes-Nozal, Antoine Richard, Efren Fernandez-Grande

Abstract: A method is presented for estimating and reconstructing the sound field within a room using physics-informed neural networks. By incorporating a limited set of experimental room impulse responses as training data, this approach combines neural network processing capabilities with the underlying physics of sound propagation, as articulated by the wave equation. The network's ability to estimate particle velocity and intensity, in addition to sound pressure, demonstrates its capacity to represent the flow of acoustic energy and completely characterise the sound field with only a few measurements. Additionally, the potential of this network as a tool for improving acoustic simulations is investigated, motivated by its proficiency in offering grid-free sound field mappings with minimal inference time. Furthermore, the proposed approach is compared against current approaches for sound field reconstruction, specifically both data-driven techniques and elementary wave-based regression methods. The results demonstrate that the physics-informed neural network stands out when reconstructing the early part of the room impulse response, while simultaneously allowing for complete sound field characterisation in the time domain.
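
For readers unfamiliar with the quantities the abstract names, the standard linear-acoustics relations behind them can be written as follows. This is a sketch under stated assumptions, not the paper's formulation: the symbols ($p$ for sound pressure, $\mathbf{u}$ for particle velocity, $\mathbf{I}$ for instantaneous intensity, $c$ for the speed of sound, $\rho_0$ for the ambient density) and the generic physics-informed loss in the last line, including the weight $\lambda$ and the sample counts $N$ and $M$, are illustrative choices that the abstract does not specify.

```latex
\begin{align}
% Homogeneous wave equation governing the sound pressure p
\nabla^2 p(\mathbf{r},t) - \frac{1}{c^2}\,\frac{\partial^2 p(\mathbf{r},t)}{\partial t^2} &= 0 \\
% Linearised Euler (momentum) equation: particle velocity from the pressure gradient
\rho_0\,\frac{\partial \mathbf{u}(\mathbf{r},t)}{\partial t} &= -\nabla p(\mathbf{r},t) \\
% Instantaneous acoustic intensity: the flow of acoustic energy
\mathbf{I}(\mathbf{r},t) &= p(\mathbf{r},t)\,\mathbf{u}(\mathbf{r},t) \\
% Generic physics-informed loss (assumed form): data misfit at N measured
% points plus the wave-equation residual at M collocation points
\mathcal{L}(\theta) &= \frac{1}{N}\sum_{i=1}^{N}\bigl|p_\theta(\mathbf{r}_i,t_i) - p_i\bigr|^2
 + \frac{\lambda}{M}\sum_{j=1}^{M}\Bigl|\nabla^2 p_\theta(\mathbf{r}_j,t_j) - \frac{1}{c^2}\,\frac{\partial^2 p_\theta(\mathbf{r}_j,t_j)}{\partial t^2}\Bigr|^2
\end{align}
```

Under these relations, a network $p_\theta$ that is differentiable in space and time yields the pressure gradient directly, which is why particle velocity and intensity can be derived from the same model without a spatial grid.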

* Submitted to the Journal of the Acoustical Society of America (JASA)
