Shiliang Zhang

SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing

Jan 14, 2026
Via arXiv

MoCo: Motion-Consistent Human Video Generation via Structure-Appearance Decoupling

Aug 24, 2025
Via arXiv

NN-Former: Rethinking Graph Structure in Neural Architecture Representation

Jul 01, 2025
Via arXiv

MagCache: Fast Video Generation with Magnitude-Aware Cache

Jun 10, 2025
Via arXiv

Efficient Multi-modal Long Context Learning for Training-free Adaptation

May 26, 2025
Via arXiv

CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training

May 23, 2025
Via arXiv

Adaptive Fault-tolerant Control of Underwater Vehicles with Thruster Failures

Apr 22, 2025
Via arXiv

OmniAudio: Generating Spatial Audio from 360-Degree Video

Apr 21, 2025
Via arXiv

Evolved Hierarchical Masking for Self-Supervised Learning

Apr 12, 2025
Via arXiv

Multi-modal Reference Learning for Fine-grained Text-to-Image Retrieval

Apr 10, 2025
Via arXiv