Leyuan Qu

From Coarse to Nuanced: Cross-Modal Alignment of Fine-Grained Linguistic Cues and Visual Salient Regions for Dynamic Emotion Recognition

Jul 16, 2025
Via arXiv

Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition

Feb 21, 2023
Via arXiv

Disentangling Prosody Representations with Unsupervised Speech Reconstruction

Dec 14, 2022
Via arXiv

Data Augmentation with Unsupervised Speaking Style Transfer for Speech Emotion Recognition

Nov 16, 2022
Via arXiv

A Multimodal German Dataset for Automatic Lip Reading Systems and Transfer Learning

Feb 27, 2022
Via arXiv

LipSound2: Self-Supervised Pre-Training for Lip-to-Speech Reconstruction and Lip Reading

Dec 09, 2021
Via arXiv