Jonathan Le Roux

The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks


Oct 19, 2021
Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux

* Submitted to ICASSP2022. For resources and examples, see https://cocktail-fork.github.io 


Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning


Oct 13, 2021
Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori

* https://dstc10.dstc.community/home and https://github.com/dialogtekgeek/AVSD-DSTC10_Official/ 


Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy


Oct 11, 2021
Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori

* Submitted to ICASSP2022 


Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement


Oct 01, 2021
Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux

* in submission 


Visual Scene Graphs for Audio Source Separation


Sep 24, 2021
Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja, Anoop Cherian

* Accepted at ICCV 2021 


Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation


Aug 16, 2021
Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux

* 16 pages, 4 figures 


Convolutive Prediction for Reverberant Speech Separation


Aug 16, 2021
Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux

* in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021 


On The Compensation Between Magnitude and Phase in Speech Separation


Aug 11, 2021
Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux

* in submission to IEEE Signal Processing Letters 


Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers


Aug 04, 2021
Chiori Hori, Takaaki Hori, Jonathan Le Roux

* Accepted to Interspeech 2021 


Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition


Jul 02, 2021
Niko Moritz, Takaaki Hori, Jonathan Le Roux

* Accepted to Interspeech 2021 


Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition


Jun 16, 2021
Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori

* Accepted to Interspeech 2021 


Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers


Apr 19, 2021
Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux

* Submitted to INTERSPEECH 2021 


Capturing Multi-Resolution Context by Dilated Self-Attention


Apr 07, 2021
Niko Moritz, Takaaki Hori, Jonathan Le Roux

* In Proc. ICASSP 2021 


Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training


Nov 26, 2020
Sameer Khurana, Niko Moritz, Takaaki Hori, Jonathan Le Roux

* Submitted to ICASSP 2021 


Semi-Supervised Speech Recognition via Graph-based Temporal Classification


Oct 29, 2020
Niko Moritz, Takaaki Hori, Jonathan Le Roux

* Submitted to ICASSP 2021 


Transcription Is All You Need: Learning to Separate Musical Mixtures with Score as Supervision


Oct 22, 2020
Yun-Ning Hung, Gordon Wichern, Jonathan Le Roux



Multi-Pass Transformer for Machine Translation


Sep 23, 2020
Peng Gao, Chiori Hori, Shijie Geng, Takaaki Hori, Jonathan Le Roux

* 10 pages, 5 figures and 2 tables 


AutoClip: Adaptive Gradient Clipping for Source Separation Networks


Jul 25, 2020
Prem Seetharaman, Gordon Wichern, Bryan Pardo, Jonathan Le Roux

* Accepted at 2020 IEEE International Workshop on Machine Learning for Signal Processing, Sept. 21-24, 2020, Espoo, Finland 


Spatio-Temporal Scene Graphs for Video Dialog


Jul 08, 2020
Shijie Geng, Peng Gao, Chiori Hori, Jonathan Le Roux, Anoop Cherian



Detecting Audio Attacks on ASR Systems with Dropout Uncertainty


Jun 02, 2020
Tejas Jayashankar, Jonathan Le Roux, Pierre Moulin



Unsupervised Speaker Adaptation using Attention-based Speaker Memory for End-to-End ASR


Feb 14, 2020
Leda Sarı, Niko Moritz, Takaaki Hori, Jonathan Le Roux

* To appear in Proc. ICASSP 2020 


End-to-End Multi-speaker Speech Recognition with Transformer


Feb 13, 2020
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe

* To appear in ICASSP 2020 


Streaming automatic speech recognition with the transformer model


Jan 09, 2020
Niko Moritz, Takaaki Hori, Jonathan Le Roux



Finding Strength in Weakness: Learning to Separate Sounds with Weak Supervision


Nov 06, 2019
Fatemeh Pishdadian, Gordon Wichern, Jonathan Le Roux



Bootstrapping deep music separation from primitive auditory grouping principles


Oct 23, 2019
Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo



MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition


Oct 16, 2019
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe

* Accepted at ASRU 2019 


Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity


Sep 18, 2019
Ethan Manilow, Gordon Wichern, Prem Seetharaman, Jonathan Le Roux

* Accepted for publication at WASPAA 2019 


WHAM!: Extending Speech Separation to Noisy Environments


Jul 02, 2019
Gordon Wichern, Joe Antognini, Michael Flynn, Licheng Richard Zhu, Emmett McQuinn, Dwight Crow, Ethan Manilow, Jonathan Le Roux

* Accepted for publication at Interspeech 2019 


Universal Sound Separation


May 08, 2019
Ilya Kavalerov, Scott Wisdom, Hakan Erdogan, Brian Patton, Kevin Wilson, Jonathan Le Roux, John R. Hershey

* 5 pages, submitted to WASPAA 2019 
