Picture for Yoshiki Masuyama

Yoshiki Masuyama

Exploring the Capability of Mamba in Speech Applications

Add code
Jun 24, 2024
Viaarxiv icon

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

Add code
Feb 27, 2024
Viaarxiv icon

Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction

Add code
Oct 30, 2023
Viaarxiv icon

Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase

Add code
Jul 23, 2023
Figure 1 for Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase
Figure 2 for Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase
Figure 3 for Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase
Figure 4 for Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase
Viaarxiv icon

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

Add code
Jul 23, 2023
Figure 1 for Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation
Figure 2 for Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation
Figure 3 for Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation
Viaarxiv icon

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

Add code
Jul 14, 2023
Figure 1 for The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Figure 2 for The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Figure 3 for The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Figure 4 for The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Viaarxiv icon

Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation

Add code
Jun 17, 2023
Figure 1 for Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation
Figure 2 for Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation
Figure 3 for Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation
Viaarxiv icon

Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge

Add code
Feb 15, 2023
Figure 1 for Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge
Figure 2 for Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge
Figure 3 for Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge
Viaarxiv icon

Online Phase Reconstruction via DNN-based Phase Differences Estimation

Add code
Nov 12, 2022
Figure 1 for Online Phase Reconstruction via DNN-based Phase Differences Estimation
Figure 2 for Online Phase Reconstruction via DNN-based Phase Differences Estimation
Figure 3 for Online Phase Reconstruction via DNN-based Phase Differences Estimation
Figure 4 for Online Phase Reconstruction via DNN-based Phase Differences Estimation
Viaarxiv icon

End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation

Add code
Oct 19, 2022
Figure 1 for End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Figure 2 for End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Figure 3 for End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Figure 4 for End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Viaarxiv icon