Jia Qi Yip

Bona fide Cross Testing Reveals Weak Spot in Audio Deepfake Detection Systems

Sep 11, 2025
Via arXiv

Improving Synthetic Data Training for Contextual Biasing Models with a Keyword-Aware Cost Function

Sep 11, 2025
Via arXiv

Speechless: Speech Instruction Training Without Speech for Low Resource Languages

May 23, 2025
Via arXiv

Speech Enhancement Using Continuous Embeddings of Neural Audio Codec

Feb 22, 2025
Via arXiv

Continual Learning with Embedding Layer Surgery and Task-wise Beam Search using Whisper

Jan 14, 2025
Via arXiv

VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music

Dec 23, 2024
Via arXiv

Speech Separation using Neural Audio Codecs with Embedding Loss

Nov 27, 2024
Via arXiv

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Nov 08, 2024
Via arXiv

Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions

Sep 25, 2024
Via arXiv

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Sep 24, 2024
Via arXiv