Meng Chu

SearchGym: Bootstrapping Real-World Search Agents via Cost-Effective and High-Fidelity Environment Simulation

Jan 21, 2026

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Jan 08, 2026

VisionDirector: Vision-Language Guided Closed-Loop Refinement for Generative Image Synthesis

Dec 22, 2025

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

Jun 12, 2025

TraveLLaMA: Facilitating Multi-modal Large Language Models to Understand Urban Scenes and Provide Travel Assistance

Apr 23, 2025

Understanding Long Videos via LLM-Powered Entity Relation Graphs

Jan 27, 2025

IRIS: Interactive Responsive Intelligent Segmentation for 3D Affordance Analysis

Sep 17, 2024

Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatially Relation Matching

Nov 21, 2023