Zhensong Zhang

ICo3D: An Interactive Conversational 3D Virtual Human

Jan 19, 2026

Map2Thought: Explicit 3D Spatial Reasoning via Metric Cognitive Maps

Jan 16, 2026

Optimizing Multimodal LLMs for Egocentric Video Understanding: A Solution for the HD-EPIC VQA Challenge

Jan 15, 2026

Off The Grid: Detection of Primitives for Feed-Forward 3D Gaussian Splatting

Dec 17, 2025

Charge: A Comprehensive Novel View Synthesis Benchmark and Dataset to Bind Them All

Dec 15, 2025

Plug-and-Play Clarifier: A Zero-Shot Multimodal Framework for Egocentric Intent Disambiguation

Nov 12, 2025

Human Motion Video Generation: A Survey

Sep 04, 2025

ViDAR: Video Diffusion-Aware 4D Reconstruction From Monocular Inputs

Jun 23, 2025

Better Together: Unified Motion Capture and 3D Avatar Reconstruction

Mar 12, 2025

GASPACHO: Gaussian Splatting for Controllable Humans and Objects

Mar 12, 2025