Fan Tang

Beyond Pixels: Visual Metaphor Transfer via Schema-Driven Agentic Reasoning

Feb 01, 2026

TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts

Jan 12, 2026

Sissi: Zero-shot Style-guided Image Synthesis via Semantic-style Integration

Jan 10, 2026

HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation

Jun 10, 2025

In-Context Brush: Zero-shot Customized Subject Insertion with Context-Aware Latent Space Manipulation

May 26, 2025

Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration

May 16, 2025

Multi-turn Consistent Image Editing

May 07, 2025

A Survey on Cross-Modal Interaction Between Music and Multimodal Data

Apr 17, 2025

Beyond Words: Augmenting Discriminative Richness via Diffusions in Unsupervised Prompt Learning

Apr 16, 2025

Z-Magic: Zero-shot Multiple Attributes Guided Image Creator

Mar 15, 2025