Eng Siong Chng

Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding

May 12, 2025

UniArray: Unified Spectral-Spatial Modeling for Array-Geometry-Agnostic Speech Separation

Mar 07, 2025

Speech Enhancement Using Continuous Embeddings of Neural Audio Codec

Feb 22, 2025

Audio Large Language Models Can Be Descriptive Speech Quality Evaluators

Jan 27, 2025

Continual Learning with Embedding Layer Surgery and Task-wise Beam Search using Whisper

Jan 14, 2025

Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model

Jan 13, 2025

An Investigation on the Potential of KAN in Speech Enhancement

Dec 23, 2024

Noro: A Noise-Robust One-shot Voice Conversion System with Hidden Speaker Representation Capabilities

Nov 29, 2024

Speech Separation using Neural Audio Codecs with Embedding Loss

Nov 27, 2024

NTU-NPU System for Voice Privacy 2024 Challenge

Oct 03, 2024