Shoubin Yu

VisionCoach: Reinforcing Grounded Video Reasoning via Visual-Perception Prompting

Mar 15, 2026

Balancing Faithfulness and Performance in Reasoning via Multi-Listener Soft Execution

Feb 18, 2026

When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning

Feb 09, 2026

Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning

Jul 09, 2025

4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time

Jun 23, 2025

Movie Facts and Fibs (MF$^2$): A Benchmark for Long Movie Understanding

Jun 06, 2025

Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization

Apr 11, 2025

VEGGIE: Instructional Editing and Reasoning of Video Concepts with Grounded Generation

Mar 19, 2025

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel

Dec 11, 2024

Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level

Nov 15, 2024