Yinfeng Yu

Residual Cross-Modal Fusion Networks for Audio-Visual Navigation

Jan 11, 2026

DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation

Apr 30, 2025

DGFNet: End-to-End Audio-Visual Source Separation Based on Dynamic Gating Fusion

Apr 30, 2025

AMNet: An Acoustic Model Network for Enhanced Mandarin Speech Synthesis

Apr 12, 2025

Leveraging Label Potential for Enhanced Multimodal Emotion Recognition

Apr 07, 2025

Magnitude-Phase Dual-Path Speech Enhancement Network based on Self-Supervised Embedding and Perceptual Contrast Stretch Boosting

Mar 27, 2025

Modality-Invariant Bidirectional Temporal Representation Distillation Network for Missing Multimodal Sentiment Analysis

Jan 07, 2025

Heterogeneous Space Fusion and Dual-Dimension Attention: A New Paradigm for Speech Enhancement

Aug 13, 2024

VNet: A GAN-based Multi-Tier Discriminator Network for Speech Synthesis Vocoders

Aug 13, 2024

BSS-CFFMA: Cross-Domain Feature Fusion and Multi-Attention Speech Enhancement Network based on Self-Supervised Embedding

Aug 13, 2024