Peihao Chen

3D-VLA: A 3D Vision-Language-Action Generative World Model

Mar 14, 2024
Haoyu Zhen, Xiaowen Qiu, Peihao Chen, Jincheng Yang, Xin Yan, Yilun Du, Yining Hong, Chuang Gan

MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

Jan 16, 2024
Yining Hong, Zishuo Zheng, Peihao Chen, Yian Wang, Junyan Li, Chuang Gan

A Simple Knowledge Distillation Framework for Open-world Object Detection

Dec 14, 2023
Shuailei Ma, Yuefeng Wang, Ying Wei, Jiaqi Fan, Xinyu Sun, Peihao Chen, Enming Zhang

DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning

Dec 10, 2023
Kunyang Lin, Yufeng Wang, Peihao Chen, Runhao Zeng, Siyuan Zhou, Mingkui Tan, Chuang Gan

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding

Nov 06, 2023
Junyan Li, Delin Chen, Yining Hong, Zhenfang Chen, Peihao Chen, Yikang Shen, Chuang Gan

FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation

Oct 11, 2023
Xinyu Sun, Peihao Chen, Jugang Fan, Thomas H. Li, Jian Chen, Mingkui Tan

$A^2$Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting Vision-and-Language Ability of Foundation Models

Aug 15, 2023
Peihao Chen, Xinyu Sun, Hongyan Zhi, Runhao Zeng, Thomas H. Li, Gaowen Liu, Mingkui Tan, Chuang Gan

3D-LLM: Injecting the 3D World into Large Language Models

Jul 24, 2023
Yining Hong, Haoyu Zhen, Peihao Chen, Shuhong Zheng, Yilun Du, Zhenfang Chen, Chuang Gan

Learning Vision-and-Language Navigation from YouTube Videos

Jul 22, 2023
Kunyang Lin, Peihao Chen, Diwei Huang, Thomas H. Li, Mingkui Tan, Chuang Gan

Vesper: A Compact and Effective Pretrained Model for Speech Emotion Recognition

Jul 20, 2023
Weidong Chen, Xiaofen Xing, Peihao Chen, Xiangmin Xu
