Zhedong Zheng

UAVReason: A Unified, Large-Scale Benchmark for Multimodal Aerial Scene Reasoning and Generation

Apr 07, 2026
Via arXiv

Can Video Diffusion Models Predict Past Frames? Bidirectional Cycle Consistency for Reversible Interpolation

Apr 02, 2026
Via arXiv

Uncertainty-Aware Trajectory Prediction: A Unified Framework Harnessing Positional and Semantic Uncertainties

Mar 31, 2026
Via arXiv

Look, Compare and Draw: Differential Query Transformer for Automatic Oil Painting

Mar 29, 2026
Via arXiv

VSearcher: Long-Horizon Multimodal Search Agent via Reinforcement Learning

Mar 03, 2026
Via arXiv

Process Over Outcome: Cultivating Forensic Reasoning for Generalizable Multimodal Manipulation Detection

Mar 02, 2026
Via arXiv

From Instruction to Event: Sound-Triggered Mobile Manipulation

Jan 29, 2026
Via arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Jan 08, 2026
Via arXiv

SketchThinker-R1: Towards Efficient Sketch-Style Reasoning in Large Multimodal Models

Jan 06, 2026
Via arXiv

AnomalyLMM: Bridging Generative Knowledge and Discriminative Retrieval for Text-Based Person Anomaly Search

Sep 04, 2025
Via arXiv