Zhihang Zhong

GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing

Mar 12, 2026

Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports

Mar 10, 2026

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Mar 10, 2026

Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence

Mar 08, 2026

RISE-Video: Can Video Generators Decode Implicit World Rules?

Feb 05, 2026

Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

Dec 09, 2025

AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models

May 26, 2025

CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction

Mar 29, 2025

R3-Avatar: Record and Retrieve Temporal Codebook for Reconstructing Photorealistic Human Avatars

Mar 17, 2025

SGA-INTERACT: A 3D Skeleton-based Benchmark for Group Activity Understanding in Modern Basketball Tactic

Mar 09, 2025