Tao Kong

SInViG: A Self-Evolving Interactive Visual Agent for Human-Robot Interaction

Feb 20, 2024
Jie Xu, Hanbo Zhang, Xinghang Li, Huaping Liu, Xuguang Lan, Tao Kong

Towards Unified Interactive Visual Grounding in The Wild

Jan 30, 2024
Jie Xu, Hanbo Zhang, Qingyi Si, Yifeng Li, Xuguang Lan, Tao Kong

Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation

Dec 21, 2023
Hongtao Wu, Ya Jing, Chilam Cheang, Guangzeng Chen, Jiafeng Xu, Xinghang Li, Minghuan Liu, Hang Li, Tao Kong

Vision-Language Foundation Models as Effective Robot Imitators

Nov 06, 2023
Xinghang Li, Minghuan Liu, Hanbo Zhang, Cunjun Yu, Jie Xu, Hongtao Wu, Chilam Cheang, Ya Jing, Weinan Zhang, Huaping Liu, Hang Li, Tao Kong

InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions

Oct 18, 2023
Hanbo Zhang, Jie Xu, Yuchen Mo, Tao Kong

MOMA-Force: Visual-Force Imitation for Real-World Mobile Manipulation

Aug 07, 2023
Taozheng Yang, Ya Jing, Hongtao Wu, Jiafeng Xu, Kuankuan Sima, Guangzeng Chen, Qie Sima, Tao Kong

Exploring Visual Pre-training for Robot Manipulation: Datasets, Models and Methods

Aug 07, 2023
Ya Jing, Xuelin Zhu, Xingbin Liu, Qie Sima, Taozheng Yang, Yunhai Feng, Tao Kong

What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?

Jul 30, 2023
Yan Zeng, Hanbo Zhang, Jiani Zheng, Jiangnan Xia, Guoqiang Wei, Yang Wei, Yuchen Zhang, Tao Kong

ClickSeg: 3D Instance Segmentation with Click-Level Weak Annotations

Jul 19, 2023
Leyao Liu, Tao Kong, Minzhao Zhu, Jiashuo Fan, Lu Fang
