Alert button
Picture for Jonathan Le Roux

Jonathan Le Roux

Alert button

MERL

Why does music source separation benefit from cacophony?

Feb 28, 2024
Chang-Bin Jeon, Gordon Wichern, François G. Germain, Jonathan Le Roux

Viaarxiv icon

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

Feb 27, 2024
Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux

Viaarxiv icon

GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model

Feb 09, 2024
Haocheng Liu, Teysir Baoueb, Mathieu Fontaine, Jonathan Le Roux, Gael Richard

Viaarxiv icon

SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis

Jan 30, 2024
Teysir Baoueb, Haocheng Liu, Mathieu Fontaine, Jonathan Le Roux, Gael Richard

Viaarxiv icon

NeuroHeed+: Improving Neuro-steered Speaker Extraction with Joint Auditory Attention Detection

Dec 12, 2023
Zexu Pan, Gordon Wichern, Francois G. Germain, Sameer Khurana, Jonathan Le Roux

Viaarxiv icon

Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction

Oct 30, 2023
Zexu Pan, Gordon Wichern, Yoshiki Masuyama, Francois G. Germain, Sameer Khurana, Chiori Hori, Jonathan Le Roux

Viaarxiv icon

Generation or Replication: Auscultating Audio Latent Diffusion Models

Oct 16, 2023
Dimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux

Viaarxiv icon

Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation

Sep 29, 2023
Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François Germain, Jonathan Le Roux, Shinji Watanabe

Figure 1 for Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation
Figure 2 for Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation
Figure 3 for Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation
Figure 4 for Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation
Viaarxiv icon

The Sound Demixing Challenge 2023 $\unicode{x2013}$ Cinematic Demixing Track

Aug 14, 2023
Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji

Figure 1 for The Sound Demixing Challenge 2023 $\unicode{x2013}$ Cinematic Demixing Track
Figure 2 for The Sound Demixing Challenge 2023 $\unicode{x2013}$ Cinematic Demixing Track
Figure 3 for The Sound Demixing Challenge 2023 $\unicode{x2013}$ Cinematic Demixing Track
Figure 4 for The Sound Demixing Challenge 2023 $\unicode{x2013}$ Cinematic Demixing Track
Viaarxiv icon

Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos

Jun 27, 2023
Chiori Hori, Puyuan Peng, David Harwath, Xinyu Liu, Kei Ota, Siddarth Jain, Radu Corcodel, Devesh Jha, Diego Romeres, Jonathan Le Roux

Figure 1 for Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos
Figure 2 for Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos
Figure 3 for Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos
Figure 4 for Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos
Viaarxiv icon