Eng Siong Chng

Continual Learning Optimizations for Auto-regressive Decoder of Multilingual ASR systems

Jul 04, 2024
Via arXiv

Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization

Jul 02, 2024
Via arXiv

Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection

Jun 25, 2024
Via arXiv

Towards Audio Codec-based Speech Separation

Jun 18, 2024
Via arXiv

Dataset-Distillation Generative Model for Speech Emotion Recognition

Jun 05, 2024
Via arXiv

Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback

Jun 02, 2024
Via arXiv

Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

May 23, 2024
Via arXiv

Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models

May 16, 2024
Via arXiv

Aligning Speech to Languages to Enhance Code-switching Speech Recognition

Mar 09, 2024
Via arXiv

Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model

Feb 16, 2024
Via arXiv