Zhedong Zheng

VSearcher: Long-Horizon Multimodal Search Agent via Reinforcement Learning

Mar 03, 2026

Process Over Outcome: Cultivating Forensic Reasoning for Generalizable Multimodal Manipulation Detection

Mar 02, 2026

From Instruction to Event: Sound-Triggered Mobile Manipulation

Jan 29, 2026

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Jan 08, 2026

SketchThinker-R1: Towards Efficient Sketch-Style Reasoning in Large Multimodal Models

Jan 06, 2026

AnomalyLMM: Bridging Generative Knowledge and Discriminative Retrieval for Text-Based Person Anomaly Search

Sep 04, 2025

Uncertainty-o: One Model-agnostic Framework for Unveiling Uncertainty in Large Multimodal Models

Jun 09, 2025

Echo Planning for Autonomous Driving: From Current Observations to Future Trajectories and Back

May 25, 2025

CAMeL: Cross-modality Adaptive Meta-Learning for Text-based Person Retrieval

Apr 26, 2025

Every Painting Awakened: A Training-free Framework for Painting-to-Animation Generation

Mar 31, 2025