Eng Siong Chng

The Silent Thought: Modeling Internal Cognition in Full-Duplex Spoken Dialogue Models via Latent Reasoning

Mar 18, 2026
Via arXiv

Privacy-Preserving End-to-End Full-Duplex Speech Dialogue Models

Mar 09, 2026
Via arXiv

Training-Free Intelligibility-Guided Observation Addition for Noisy ASR

Feb 24, 2026
Via arXiv

The Interspeech 2026 Audio Reasoning Challenge: Evaluating Reasoning Process Quality for Audio Reasoning Models and Agents

Feb 15, 2026
Via arXiv

Stream-Voice-Anon: Enhancing Utility of Real-Time Speaker Anonymization via Neural Audio Codec and Language Models

Jan 20, 2026
Via arXiv

SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing

Jan 14, 2026
Via arXiv

Improving Code-Switching Speech Recognition with TTS Data Augmentation

Jan 02, 2026
Via arXiv

DepFlow: Disentangled Speech Generation to Mitigate Semantic Bias in Depression Detection

Jan 01, 2026
Via arXiv

GenTSE: Enhancing Target Speaker Extraction via a Coarse-to-Fine Generative Language Model

Dec 24, 2025
Via arXiv

Next-Frame Feature Prediction for Multimodal Deepfake Detection and Temporal Localization

Nov 13, 2025
Via arXiv