Alert button
Picture for Tomohiro Nakatani

Tomohiro Nakatani

Alert button

Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers

Feb 05, 2024
Marvin Tammen, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki, Simon Doclo

Viaarxiv icon

Neural network-based virtual microphone estimation with virtual microphone and beamformer-level multi-task loss

Nov 20, 2023
Hanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada, Shoji Makino

Viaarxiv icon

Target Speech Extraction with Conditional Diffusion Model

Aug 17, 2023
Naoyuki Kamo, Marc Delcroix, Tomohiro Nakatani

Figure 1 for Target Speech Extraction with Conditional Diffusion Model
Figure 2 for Target Speech Extraction with Conditional Diffusion Model
Figure 3 for Target Speech Extraction with Conditional Diffusion Model
Figure 4 for Target Speech Extraction with Conditional Diffusion Model
Viaarxiv icon

Modified Parametric Multichannel Wiener Filter \\for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers

Jun 29, 2023
Ning Guo, Tomohiro Nakatani, Shoko Araki, Takehiro Moriya

Figure 1 for Modified Parametric Multichannel Wiener Filter \\for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers
Figure 2 for Modified Parametric Multichannel Wiener Filter \\for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers
Figure 3 for Modified Parametric Multichannel Wiener Filter \\for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers
Figure 4 for Modified Parametric Multichannel Wiener Filter \\for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers
Viaarxiv icon

NoisyILRMA: Diffuse-Noise-Aware Independent Low-Rank Matrix Analysis for Fast Blind Source Extraction

Jun 22, 2023
Koki Nishida, Norihiro Takamune, Rintaro Ikeshita, Daichi Kitamura, Hiroshi Saruwatari, Tomohiro Nakatani

Figure 1 for NoisyILRMA: Diffuse-Noise-Aware Independent Low-Rank Matrix Analysis for Fast Blind Source Extraction
Figure 2 for NoisyILRMA: Diffuse-Noise-Aware Independent Low-Rank Matrix Analysis for Fast Blind Source Extraction
Figure 3 for NoisyILRMA: Diffuse-Noise-Aware Independent Low-Rank Matrix Analysis for Fast Blind Source Extraction
Viaarxiv icon

Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization

May 23, 2023
Marc Delcroix, Naohiro Tawara, Mireia Diez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukas Burget, Shoko Araki

Figure 1 for Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization
Figure 2 for Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization
Figure 3 for Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization
Viaarxiv icon

Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking

May 07, 2022
Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki

Figure 1 for Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking
Figure 2 for Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking
Figure 3 for Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking
Figure 4 for Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking
Viaarxiv icon

Listen only to me! How well can target speech extraction handle false alarms?

Apr 11, 2022
Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Katerina Zmolikova, Hiroshi Sato, Tomohiro Nakatani

Figure 1 for Listen only to me! How well can target speech extraction handle false alarms?
Figure 2 for Listen only to me! How well can target speech extraction handle false alarms?
Figure 3 for Listen only to me! How well can target speech extraction handle false alarms?
Figure 4 for Listen only to me! How well can target speech extraction handle false alarms?
Viaarxiv icon

Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening

Mar 31, 2022
Ayako Yamamoto, Toshio Irino, Shoko Araki, Kenichi Arai, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani

Figure 1 for Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening
Figure 2 for Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening
Figure 3 for Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening
Figure 4 for Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening
Viaarxiv icon

$\text{ISS}_2$: An Extension of Iterative Source Steering Algorithm for Majorization-Minimization-Based Independent Vector Analysis

Feb 02, 2022
Rintaro Ikeshita, Tomohiro Nakatani

Figure 1 for $\text{ISS}_2$: An Extension of Iterative Source Steering Algorithm for Majorization-Minimization-Based Independent Vector Analysis
Viaarxiv icon