Tomohiro Nakatani

Multimodal Attention Fusion for Target Speaker Extraction

Feb 02, 2021
Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki

A Joint Diagonalization Based Efficient Approach to Underdetermined Blind Audio Source Separation Using the Multichannel Wiener Filter

Jan 21, 2021
Nobutaka Ito, Rintaro Ikeshita, Hiroshi Sawada, Tomohiro Nakatani

Speaker activity driven neural speech extraction

Jan 14, 2021
Marc Delcroix, Katerina Zmolikova, Tsubasa Ochiai, Keisuke Kinoshita, Tomohiro Nakatani

Neural Network-based Virtual Microphone Estimator

Jan 12, 2021
Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki

Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR

Jun 04, 2020
Thilo von Neumann, Christoph Boeddeker, Lukas Drude, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach

Improving noise robust automatic speech recognition with single-channel time-domain enhancement network

Mar 09, 2020
Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani

Tackling real noisy reverberant meetings with all-neural source separation, counting, and diarization system

Mar 09, 2020
Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani

Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam

Jan 23, 2020
Marc Delcroix, Tsubasa Ochiai, Katerina Zmolikova, Keisuke Kinoshita, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki

End-to-end training of time domain audio separation and recognition

Dec 25, 2019
Thilo von Neumann, Keisuke Kinoshita, Lukas Drude, Christoph Boeddeker, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach

