Yuguang Yang

DecomCAM: Advancing Beyond Saliency Maps through Decomposition and Integration

May 29, 2024

GMP-ATL: Gender-augmented Multi-scale Pseudo-label Enhanced Adaptive Transfer Learning for Speech Emotion Recognition via HuBERT

May 03, 2024

PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System

Sep 28, 2023

PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts

Sep 17, 2023

GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Speech Emotion Recognition

Jun 16, 2023

Self-Enhancement Improves Text-Image Retrieval in Foundation Visual-Language Models

Jun 11, 2023

Decom-CAM: Tell Me What You See, In Details! Feature-Level Interpretation via Decomposition Class Activation Map

May 27, 2023

HYBRIDFORMER: improving SqueezeFormer with hybrid attention and NSR mechanism

Mar 15, 2023

LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition

Dec 05, 2022

Improving fairness in speaker verification via Group-adapted Fusion Network

Feb 23, 2022