Lingdong Kong

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

Dec 18, 2025

EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing

Dec 12, 2025

WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World

Dec 11, 2025

Learning to Remove Lens Flare in Event Camera

Dec 09, 2025

SEE4D: Pose-Free 4D Generation via Auto-Regressive Video Inpainting

Oct 30, 2025

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

Oct 23, 2025

VideoLucy: Deep Memory Backtracking for Long Video Understanding

Oct 14, 2025

RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

Oct 02, 2025

Visual Grounding from Event Cameras

Sep 11, 2025

Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras

Jul 23, 2025