Zhedong Zheng

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Jan 08, 2026
Via arXiv

SketchThinker-R1: Towards Efficient Sketch-Style Reasoning in Large Multimodal Models

Jan 06, 2026
Via arXiv

AnomalyLMM: Bridging Generative Knowledge and Discriminative Retrieval for Text-Based Person Anomaly Search

Sep 04, 2025
Via arXiv

Uncertainty-o: One Model-agnostic Framework for Unveiling Uncertainty in Large Multimodal Models

Jun 09, 2025
Via arXiv

Echo Planning for Autonomous Driving: From Current Observations to Future Trajectories and Back

May 25, 2025
Via arXiv

CAMeL: Cross-modality Adaptive Meta-Learning for Text-based Person Retrieval

Apr 26, 2025
Via arXiv

Every Painting Awakened: A Training-free Framework for Painting-to-Animation Generation

Mar 31, 2025
Via arXiv

A Large-scale Interpretable Multi-modality Benchmark for Facial Image Forgery Localization

Dec 27, 2024
Via arXiv

Relative Distance Guided Dynamic Partition Learning for Scale-Invariant UAV-View Geo-Localization

Dec 23, 2024
Via arXiv

CLIP-SR: Collaborative Linguistic and Image Processing for Super-Resolution

Dec 16, 2024
Via arXiv