Eng Siong Chng

EASY: Emotion-aware Speaker Anonymization via Factorized Distillation

May 21, 2025

Distilling a speech and music encoder with task arithmetic

May 19, 2025

Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding

May 12, 2025

UniArray: Unified Spectral-Spatial Modeling for Array-Geometry-Agnostic Speech Separation

Mar 07, 2025

Speech Enhancement Using Continuous Embeddings of Neural Audio Codec

Feb 22, 2025

Audio Large Language Models Can Be Descriptive Speech Quality Evaluators

Jan 27, 2025

Continual Learning with Embedding Layer Surgery and Task-wise Beam Search using Whisper

Jan 14, 2025

Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model

Jan 13, 2025

An Investigation on the Potential of KAN in Speech Enhancement

Dec 23, 2024

Noro: A Noise-Robust One-shot Voice Conversion System with Hidden Speaker Representation Capabilities

Nov 29, 2024