Shusuke Takahashi

Zero- and Few-shot Sound Event Localization and Detection

Sep 17, 2023
Kazuki Shimada, Kengo Uchida, Yuichiro Koyama, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji, Tatsuya Kawahara

The Sound Demixing Challenge 2023 – Cinematic Demixing Track

Aug 14, 2023
Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji

STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events

Jun 15, 2023
Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders

May 18, 2023
Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji

The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation

May 13, 2023
Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji

Diffusion-based Signal Refiner for Speech Separation

May 12, 2023
Masato Hirano, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji

Extending Audio Masked Autoencoders Toward Audio Restoration

May 11, 2023
Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji

An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification

Feb 16, 2023
Zhi Zhong, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Shusuke Takahashi, Yuki Mitsufuji

A Versatile Diffusion-based Generative Refiner for Speech Enhancement

Oct 27, 2022
Ryosuke Sawata, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji

DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability

Oct 11, 2022
Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji
