Zhaoheng Ni

Unified Diffusion Refinement for Multi-Channel Speech Enhancement and Separation

Mar 25, 2026

ArrayDPS-Refine: Generative Refinement of Discriminative Multi-Channel Speech Enhancement

Mar 25, 2026

ICASSP 2026 URGENT Speech Enhancement Challenge

Jan 20, 2026

SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training

Jan 18, 2026

Interspeech 2025 URGENT Speech Enhancement Challenge

May 29, 2025

Adapting Whisper for Code-Switching through Encoding Refining and Language-Aware Decoding

Dec 24, 2024

Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition

Sep 01, 2024

High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching

Jul 04, 2024

URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

Jun 07, 2024

An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement

Jan 18, 2024