Shi-Xiong Zhang

Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment

Jun 17, 2024

RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR

Oct 31, 2023

UniX-Encoder: A Universal $X$-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing

Oct 25, 2023

M3-AUDIODEC: Multi-channel multi-speaker multi-spatial audio codec

Sep 23, 2023

MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning

Mar 11, 2023

3D Neural Beamforming for Multi-channel Speech Separation Against Location Uncertainty

Feb 27, 2023

Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation

Dec 24, 2022

Deep Neural Mel-Subband Beamformer for In-car Speech Separation

Nov 22, 2022

NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement

May 20, 2022

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers

Mar 31, 2022