Jiazhao Zhang

OctoNav: Towards Generalist Embodied Navigation

Jun 11, 2025

TrackVLA: Embodied Visual Tracking in the Wild

May 29, 2025

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning

Apr 26, 2025

OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging

Mar 03, 2025

SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation

Feb 18, 2025

Neural Observation Field Guided Hybrid Optimization of Camera Placement

Dec 11, 2024

CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs

Dec 11, 2024

Uni-NaVid: A Video-based Vision-Language-Action Model for Unifying Embodied Navigation Tasks

Dec 09, 2024

GAPartManip: A Large-scale Part-centric Dataset for Material-Agnostic Articulated Object Manipulation

Nov 27, 2024

InstruGen: Automatic Instruction Generation for Vision-and-Language Navigation Via Large Multimodal Models

Nov 18, 2024