Eng Siong Chng

NTU-NPU System for Voice Privacy 2024 Challenge

Oct 03, 2024

Bridging Speech and Text: Enhancing ASR with Pinyin-to-Character Pre-training in LLMs

Sep 24, 2024

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Sep 17, 2024

Continual Learning Optimizations for Auto-regressive Decoder of Multilingual ASR systems

Jul 04, 2024

Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization

Jul 02, 2024

Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection

Jun 25, 2024

Towards Audio Codec-based Speech Separation

Jun 18, 2024

Dataset-Distillation Generative Model for Speech Emotion Recognition

Jun 05, 2024

Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback

Jun 02, 2024

Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

May 23, 2024