Xugang Lu

Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions

Dec 21, 2023

Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition

Dec 18, 2023

Neural domain alignment for spoken language recognition based on optimal transport

Oct 20, 2023

Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR

Sep 28, 2023

Cross-modal Alignment with Optimal Transport for CTC-based ASR

Sep 24, 2023

Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition

Jul 29, 2022

Transducer-based language embedding for spoken language identification

Apr 08, 2022

Perceptual Contrast Stretching on Target Feature for Speech Enhancement

Apr 01, 2022

Partial Coupling of Optimal Transport for Spoken Language Identification

Mar 31, 2022

TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding

Mar 17, 2022