
Peihao Chen

CoNav: A Benchmark for Human-Centered Collaborative Navigation

Jun 04, 2024

MAGIC: Map-Guided Few-Shot Audio-Visual Acoustics Modeling

May 22, 2024

3D-VLA: A 3D Vision-Language-Action Generative World Model

Mar 14, 2024

MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

Jan 16, 2024

A Simple Knowledge Distillation Framework for Open-world Object Detection

Dec 14, 2023

DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning

Dec 10, 2023

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding

Nov 06, 2023

FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation

Oct 11, 2023

$A^2$Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting Vision-and-Language Ability of Foundation Models

Aug 15, 2023

3D-LLM: Injecting the 3D World into Large Language Models

Jul 24, 2023