TS-RIR: Translated synthetic room impulse responses for speech augmentation

Mar 31, 2021
Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha


We propose a method for improving the quality of synthetic room impulse responses generated using acoustic simulators for far-field speech recognition tasks. We bridge the gap between synthetic and real room impulse responses using our novel one-dimensional CycleGAN architecture. We pass a synthetic room impulse response, in the form of raw-waveform audio, to our one-dimensional CycleGAN and translate it into a real room impulse response. We also perform sub-band room equalization on the translated room impulse response to further improve its quality. We artificially create far-field speech by convolving clean speech from the LibriSpeech dataset [1] with room impulse responses and adding background noise. We show that far-field speech simulated with room impulse responses improved by our approach reduces the word error rate by up to 19.9% compared to the unmodified room impulse responses on the Kaldi LibriSpeech far-field automatic speech recognition benchmark [2].
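The far-field simulation step described above (convolve clean speech with a room impulse response, then add background noise) can be sketched as follows. This is only a minimal illustration, not the authors' pipeline: the function names, the 20 dB signal-to-noise ratio, the toy sine-wave "speech", the exponentially decaying random impulse response, and the Gaussian noise are all assumptions made for the example. It uses a direct-form convolution from the standard library for self-containment; for real audio an FFT-based routine such as `scipy.signal.fftconvolve` would likely be preferable.

```python
import math
import random


def convolve(signal, rir):
    """Full linear convolution of two sequences (direct form, O(N*M))."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


def rms(x):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def simulate_far_field(clean, rir, noise, snr_db):
    """Reverberate `clean` with `rir`, then add `noise` scaled to `snr_db`.

    Hypothetical helper: `noise` must be at least as long as `clean`.
    """
    # Truncate the reverberant tail so the output matches the input length.
    reverberant = convolve(clean, rir)[:len(clean)]
    noise = noise[:len(reverberant)]
    # Choose the gain so that
    # 20 * log10(rms(reverberant) / rms(gain * noise)) == snr_db.
    gain = rms(reverberant) / (rms(noise) * 10 ** (snr_db / 20))
    return [r + gain * n for r, n in zip(reverberant, noise)]


# Toy inputs (assumed for illustration only).
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
rir = [math.exp(-t / 50) * rng.uniform(-1, 1) for t in range(200)]
rir[0] = 1.0  # direct-path component
noise = [rng.gauss(0, 1) for _ in range(1600)]

far_field = simulate_far_field(clean, rir, noise, snr_db=20)
```

In this sketch, swapping in a different impulse response (for example a synthetic one versus a translated one) changes only the `rir` argument, which is the point at which the quality of the impulse response affects the simulated training data.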
