Quanlong Zheng

X-OmniClaw Technical Report: A Unified Mobile Agent for Multimodal Understanding and Interaction

May 07, 2026
Via arXiv

Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM

May 26, 2025
Via arXiv

H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding

Mar 31, 2025
Via arXiv