Yanhua Long

Unified Architecture and Unsupervised Speech Disentanglement for Speaker Embedding-Free Enrollment in Personalized Speech Enhancement

May 18, 2025
Via arXiv

Exploring the Potential of SSL Models for Sound Event Detection

May 17, 2025
Via arXiv

SEF-PNet: Speaker Encoder-Free Personalized Speech Enhancement with Local and Global Contexts Aggregation

Jan 20, 2025
Via arXiv

ICSD: An Open-source Dataset for Infant Cry and Snoring Detection

Aug 20, 2024
Via arXiv

Autoencoder with Group-based Decoder and Multi-task Optimization for Anomalous Sound Detection

Nov 15, 2023
Via arXiv

UNISOUND System for VoxCeleb Speaker Recognition Challenge 2023

Aug 24, 2023
Via arXiv

Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition

Jun 20, 2023
Via arXiv

Dynamic Acoustic Compensation and Adaptive Focal Training for Personalized Speech Enhancement

Nov 22, 2022
Via arXiv

Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system

Nov 03, 2022
Via arXiv

DiaCorrect: End-to-end error correction for speaker diarization

Oct 31, 2022
Via arXiv