Zhengzhong Tu

Ben

FlowSteer: Conditioning Flow Field for Consistent Image Restoration

Dec 09, 2025

3D4D: An Interactive, Editable, 4D World Model via 3D Video Generation

Nov 11, 2025

Background Fades, Foreground Leads: Curriculum-Guided Background Pruning for Efficient Foreground-Centric Collaborative Perception

Oct 22, 2025

SuperGen: An Efficient Ultra-high-resolution Video Generation System with Sketching and Tiling

Aug 25, 2025

MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding

Jul 16, 2025

4KAgent: Agentic Any Image to 4K Super-Resolution

Jul 09, 2025

A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality

Jul 09, 2025

AirV2X: Unified Air-Ground Vehicle-to-Everything Collaboration

Jun 24, 2025

Demystifying the Visual Quality Paradox in Multimodal Large Language Models

Jun 18, 2025

SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems

Jun 09, 2025