Yuying Ge

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Dec 23, 2025

TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs

Dec 16, 2025

ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries

Nov 18, 2025

AudioStory: Generating Long-Form Narrative Audio with Large Language Models

Aug 27, 2025

ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts

Jul 28, 2025

Aligning Latent Spaces with Flow Priors

Jun 05, 2025

Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?

May 27, 2025

TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation

May 08, 2025

AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction

Apr 01, 2025

Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1

Mar 31, 2025