Shinji Watanabe

STFT-Domain Neural Speech Enhancement with Very Low Algorithmic Latency

Apr 21, 2022
Zhong-Qiu Wang, Gordon Wichern, Shinji Watanabe, Jonathan Le Roux

Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation

Apr 19, 2022
Keqi Deng, Shinji Watanabe, Jiatong Shi, Siddhant Arora

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

Apr 18, 2022
Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel Lopez-Francisco, Jonathan D. Amith, Shinji Watanabe

Improving Frame-Online Neural Speech Enhancement with Overlapped-Frame Prediction

Apr 15, 2022
Zhong-Qiu Wang, Shinji Watanabe

End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation

Apr 01, 2022
Xuankai Chang, Takashi Maekaku, Yuya Fujita, Shinji Watanabe

End-to-End Multi-speaker ASR with Independent Vector Analysis

Apr 01, 2022
Robin Scheibler, Wangyou Zhang, Xuankai Chang, Shinji Watanabe, Yanmin Qian

Better Intermediates Improve CTC Inference

Apr 01, 2022
Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee, Lukas Lee, Shinji Watanabe, Yusuke Kida

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers

Mar 31, 2022
Yushi Ueda, Soumi Maiti, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong Xu

SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy

Mar 31, 2022
Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, Qin Jin
