Title: FlowW2N: Whispered-to-Normal Speech Conversion via Flow-Matching

Mar 04, 2026

Fabian Ritter-Gutierrez, Md Asif Jalal, Pablo Peso Parada, Karthikeyan Saravanan, Yusun Shul, Minseung Kim, Gun-Woo Lee, Han-Gil Moon

Abstract: Whispered-to-normal (W2N) speech conversion aims to reconstruct missing phonation from whispered input while preserving content and speaker identity. This task is challenging due to temporal misalignment between whispered and voiced recordings and the lack of paired data. We propose FlowW2N, a conditional flow matching approach that trains exclusively on synthetic, time-aligned whisper-normal pairs and conditions on domain-invariant features. We exploit high-level ASR embeddings that exhibit strong invariance between synthetic and real whispered speech, enabling generalization to real whispers despite never observing them during training. We verify this invariance across ASR layers and propose a selection criterion that optimizes content informativeness and cross-domain invariance. Our method achieves state-of-the-art intelligibility on the CHAINS and wTIMIT datasets, reducing Word Error Rate by 26-46% relative to prior work while using only 10 steps at inference and requiring no real paired data.

* Submitted to Interspeech 2026
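
To make the "10 steps at inference" claim in the abstract concrete, here is a minimal sketch of fixed-step sampling from a conditional flow-matching model. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the ODE solver or the probability path, so the Euler integrator and the straight-line path are assumptions, and `euler_sample`, `toy_velocity`, and the feature shapes are hypothetical. A closed-form velocity field stands in for the trained network, and a plain array stands in for the ASR-embedding conditioning.

```python
import numpy as np


def euler_sample(velocity_fn, x0, cond, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t, cond) from t=0 to t=1 with fixed-step Euler."""
    x = np.array(x0, dtype=np.float64)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # time at the start of the step, always < 1
        x = x + dt * velocity_fn(x, t, cond)
    return x


def toy_velocity(x, t, cond):
    # Stand-in for a trained network (assumption): the exact velocity of a
    # straight-line path that ends at `cond`. A real model would predict the
    # velocity from x, t, and conditioning features such as ASR embeddings.
    return (cond - x) / (1.0 - t)


rng = np.random.default_rng(0)
target = rng.normal(size=(4, 8))  # hypothetical target features (frames x dims)
noise = rng.normal(size=(4, 8))   # starting sample drawn from the prior
out = euler_sample(toy_velocity, noise, target, num_steps=10)
print(np.allclose(out, target))
```

With a learned velocity field the result is only approximate, and the number of steps trades inference cost against integration error.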
