
Shengkui Zhao

MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation

Dec 19, 2023
Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang, Trung Hieu Nguyen, Kun Zhou, Jiaqi Yip, Dianwen Ng, Bin Ma

SPGM: Prioritizing Local Features for Enhanced Speech Separation Performance

Sep 22, 2023
Jia Qi Yip, Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang, Trung Hieu Nguyen, Kun Zhou, Dianwen Ng, Eng Siong Chng, Bin Ma

Are Soft Prompts Good Zero-shot Learners for Speech Recognition?

Sep 18, 2023
Dianwen Ng, Chong Zhang, Ruixi Zhang, Yukun Ma, Fabian Ritter-Gutierrez, Trung Hieu Nguyen, Chongjia Ni, Shengkui Zhao, Eng Siong Chng, Bin Ma

ACA-Net: Towards Lightweight Speaker Verification using Asymmetric Cross Attention

May 20, 2023
Jia Qi Yip, Tuan Truong, Dianwen Ng, Chong Zhang, Yukun Ma, Trung Hieu Nguyen, Chongjia Ni, Shengkui Zhao, Eng Siong Chng, Bin Ma

D2Former: A Fully Complex Dual-Path Dual-Decoder Conformer Network using Joint Complex Masking and Complex Spectral Mapping for Monaural Speech Enhancement

Feb 23, 2023
Shengkui Zhao, Bin Ma

MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions

Feb 23, 2023
Shengkui Zhao, Bin Ma

FRCRN: Boosting Feature Representation using Frequency Recurrence for Monaural Speech Enhancement

Jun 15, 2022
Shengkui Zhao, Bin Ma, Karn N. Watcharasupat, Woon-Seng Gan

End-to-End Complex-Valued Multidilated Convolutional Neural Network for Joint Acoustic Echo Cancellation and Noise Suppression

Oct 11, 2021
Karn N. Watcharasupat, Thi Ngoc Tho Nguyen, Woon-Seng Gan, Shengkui Zhao, Bin Ma

Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses

Feb 03, 2021
Shengkui Zhao, Trung Hieu Nguyen, Bin Ma

Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram

Feb 03, 2021
Shengkui Zhao, Hao Wang, Trung Hieu Nguyen, Bin Ma
