Jonathan Le Roux

Heterogeneous Target Speech Separation

Apr 07, 2022
Efthymios Tzinis, Gordon Wichern, Aswin Subramanian, Paris Smaragdis, Jonathan Le Roux

Locate This, Not That: Class-Conditioned Sound Event DOA Estimation

Mar 08, 2022
Olga Slizovskaia, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux

Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR

Mar 01, 2022
Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux

(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering

Feb 18, 2022
Anoop Cherian, Chiori Hori, Tim K. Marks, Jonathan Le Roux

Sequence Transduction with Graph-based Supervision

Nov 01, 2021
Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux

The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks

Oct 19, 2021
Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux

Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning

Oct 13, 2021
Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy

Oct 11, 2021
Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori

Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement

Oct 01, 2021
Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux

Visual Scene Graphs for Audio Source Separation

Sep 24, 2021
Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja, Anoop Cherian
