Zheng-Hua Tan

Aalborg University

Exploring Resolution-Wise Shared Attention in Hybrid Mamba-U-Nets for Improved Cross-Corpus Speech Enhancement

Oct 02, 2025
Via arXiv

DSpAST: Disentangled Representations for Spatial Audio Reasoning with Large Language Models

Sep 17, 2025
Via arXiv

Learning Robust Spatial Representations from Binaural Audio through Feature Distillation

Aug 28, 2025
Via arXiv

MambAttention: Mamba with Multi-Head Attention for Generalizable Single-Channel Speech Enhancement

Jul 01, 2025
Via arXiv

A Survey of Deep Learning for Complex Speech Spectrograms

May 13, 2025
Via arXiv

Handling Domain Shifts for Anomalous Sound Detection: A Review of DCASE-Related Work

Mar 13, 2025
Via arXiv

xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement

Jan 10, 2025
Via arXiv

Vocal Tract Length Warped Features for Spoken Keyword Spotting

Jan 07, 2025
Via arXiv

Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining

Jan 06, 2025
Via arXiv

BiSSL: Bilevel Optimization for Self-Supervised Pre-Training and Fine-Tuning

Oct 03, 2024
Via arXiv