Ying Shan

Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels

Mar 05, 2026
Via arXiv

CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video

Mar 04, 2026
Via arXiv

MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE

Feb 09, 2026
Via arXiv

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control

Jan 08, 2026
Via arXiv

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Dec 23, 2025
Via arXiv

MMhops-R1: Multimodal Multi-hop Reasoning

Dec 16, 2025
Via arXiv

TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs

Dec 16, 2025
Via arXiv

ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries

Nov 18, 2025
Via arXiv

AudioStory: Generating Long-Form Narrative Audio with Large Language Models

Aug 27, 2025
Via arXiv

ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

Aug 14, 2025
Via arXiv