Di Zhang

GGTalker: Talking Head Synthesis with Generalizable Gaussian Priors and Identity-Specific Adaptation

Jun 26, 2025
Via arXiv

Kling-Foley: Multimodal Diffusion Transformer for High-Quality Video-to-Audio Generation

Jun 24, 2025
Via arXiv

FilMaster: Bridging Cinematic Principles and Generative AI for Automated Film Generation

Jun 23, 2025
Via arXiv

SELT: Self-Evaluation Tree Search for LLMs with Task Decomposition

Jun 09, 2025
Via arXiv

FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers

Jun 05, 2025
Via arXiv

UNIC: Unified In-Context Video Editing

Jun 04, 2025
Via arXiv

The Butterfly Effect in Pathology: Exploring Security in Pathology Foundation Models

May 30, 2025
Via arXiv

OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers

May 27, 2025
Via arXiv

Mod-Adapter: Tuning-Free and Versatile Multi-concept Personalization via Modulation Adapter

May 24, 2025
Via arXiv

MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback

May 23, 2025
Via arXiv