Yanhua Long

Zipper-LoRA: Dynamic Parameter Decoupling for Speech-LLM based Multilingual Speech Recognition

Mar 19, 2026
Via arXiv

Enroll-on-Wakeup: A First Comparative Study of Target Speech Extraction for Seamless Interaction in Real Noisy Human-Machine Dialogue Scenarios

Feb 17, 2026
Via arXiv

Bridging the gap: A comparative exploration of Speech-LLM and end-to-end architecture for multilingual conversational ASR

Jan 04, 2026
Via arXiv

A Language-Agnostic Hierarchical LoRA-MoE Architecture for CTC-based Multilingual ASR

Jan 02, 2026
Via arXiv

Lightweight speech enhancement guided target speech extraction in noisy multi-speaker scenarios

Aug 27, 2025
Via arXiv

Unified Architecture and Unsupervised Speech Disentanglement for Speaker Embedding-Free Enrollment in Personalized Speech Enhancement

May 18, 2025
Via arXiv

Exploring the Potential of SSL Models for Sound Event Detection

May 17, 2025
Via arXiv

SEF-PNet: Speaker Encoder-Free Personalized Speech Enhancement with Local and Global Contexts Aggregation

Jan 20, 2025
Via arXiv

ICSD: An Open-source Dataset for Infant Cry and Snoring Detection

Aug 20, 2024
Via arXiv

Autoencoder with Group-based Decoder and Multi-task Optimization for Anomalous Sound Detection

Nov 15, 2023
Via arXiv