Shuwei He

MoE Adapter for Large Audio Language Models: Sparsity, Disentanglement, and Gradient-Conflict-Free

Jan 08, 2026, via arXiv

Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech

Dec 17, 2024, via arXiv

Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech

Oct 18, 2024, via arXiv