Dengming Zhang

Learning to Hear by Seeing: It's Time for Vision Language Models to Understand Artistic Emotion from Sight and Sound

Nov 15, 2025
Via arXiv

Controllable Video-to-Music Generation with Multiple Time-Varying Conditions

Jul 28, 2025
Via arXiv

Personalized Dynamic Music Emotion Recognition with Dual-Scale Attention-Based Meta-Learning

Dec 26, 2024
Via arXiv

FonTS: Text Rendering with Typography and Style Controls

Nov 28, 2024
Via arXiv

UI Layers Group Detector: Grouping UI Layers via Text Fusion and Box Attention

Dec 07, 2022
Via arXiv