Yoshiki Masuyama

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

Feb 27, 2024
Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux

Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction

Oct 30, 2023
Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François G. Germain, Sameer Khurana, Chiori Hori, Jonathan Le Roux

Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase

Jul 23, 2023
Yoshiki Masuyama, Natsuki Ueno, Nobutaka Ono

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

Jul 23, 2023
Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

Jul 14, 2023
Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola Garcia, Matthew Maciejewski, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur

Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation

Jun 17, 2023
Yoshiaki Bando, Yoshiki Masuyama, Aditya Arie Nugraha, Kazuyoshi Yoshii

Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge

Feb 15, 2023
Samuele Cornell, Zhong-Qiu Wang, Yoshiki Masuyama, Shinji Watanabe, Manuel Pariente, Nobutaka Ono

Online Phase Reconstruction via DNN-based Phase Differences Estimation

Nov 12, 2022
Yoshiki Masuyama, Kohei Yatabe, Kento Nagatomo, Yasuhiro Oikawa

End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation

Oct 19, 2022
Yoshiki Masuyama, Xuankai Chang, Samuele Cornell, Shinji Watanabe, Nobutaka Ono
