Wenzhe Cai

Nimbus: A Unified Embodied Synthetic Data Generation Framework

Jan 29, 2026

LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry

Dec 23, 2025

Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation

Dec 09, 2025

NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance

May 13, 2025

ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination

Oct 13, 2024

MO-DDN: A Coarse-to-Fine Attribute-based Exploration Agent for Multi-object Demand-driven Navigation

Oct 04, 2024

InstructNav: Zero-shot System for Generic Instruction Navigation in Unexplored Environment

Jun 07, 2024

Empowering Large Language Models on Robotic Manipulation with Affordance Prompting

Apr 17, 2024

XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library

Dec 25, 2023

DGMem: Learning Visual Navigation Policy without Any Labels by Dynamic Graph Memory

Nov 30, 2023