Tomohiro Nakatani

MOVER: Combining Multiple Meeting Recognition Systems

Aug 07, 2025
Via arXiv

Description and Discussion on DCASE 2025 Challenge Task 4: Spatial Semantic Segmentation of Sound Scenes

Jun 12, 2025
Via arXiv

Microphone Array Signal Processing and Deep Learning for Speech Enhancement

Jan 13, 2025
Via arXiv

NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge

Sep 09, 2024
Via arXiv

Interaural time difference loss for binaural target sound extraction

Aug 01, 2024
Via arXiv

Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers

Feb 05, 2024
Via arXiv

Neural network-based virtual microphone estimation with virtual microphone and beamformer-level multi-task loss

Nov 20, 2023
Via arXiv

Target Speech Extraction with Conditional Diffusion Model

Aug 17, 2023
Via arXiv

Modified Parametric Multichannel Wiener Filter for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers

Jun 29, 2023
Via arXiv

NoisyILRMA: Diffuse-Noise-Aware Independent Low-Rank Matrix Analysis for Fast Blind Source Extraction

Jun 22, 2023
Via arXiv