Baoxiong Jia

Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance

Mar 26, 2024
Zan Wang, Yixin Chen, Baoxiong Jia, Puhao Li, Jinlu Zhang, Jingze Zhang, Tengyu Liu, Yixin Zhu, Wei Liang, Siyuan Huang

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding

Jan 17, 2024
Baoxiong Jia, Yixin Chen, Huangyue Yu, Yan Wang, Xuesong Niu, Tengyu Liu, Qing Li, Siyuan Huang

An Embodied Generalist Agent in 3D World

Nov 18, 2023
Jiangyong Huang, Silong Yong, Xiaojian Ma, Xiongkun Linghu, Puhao Li, Yan Wang, Qing Li, Song-Chun Zhu, Baoxiong Jia, Siyuan Huang

ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab

Nov 01, 2023
Jieming Cui, Ziren Gong, Baoxiong Jia, Siyuan Huang, Zilong Zheng, Jianzhu Ma, Yixin Zhu

X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events

Aug 21, 2023
Bo Dai, Linge Wang, Baoxiong Jia, Zeyu Zhang, Song-Chun Zhu, Chi Zhang, Yixin Zhu

ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes

Apr 09, 2023
Ran Gong, Jiangyong Huang, Yizhou Zhao, Haoran Geng, Xiaofeng Gao, Qingyang Wu, Wensi Ai, Ziheng Zhou, Demetri Terzopoulos, Song-Chun Zhu, Baoxiong Jia, Siyuan Huang

Diffusion-based Generation, Optimization, and Planning in 3D Scenes

Jan 15, 2023
Siyuan Huang, Zan Wang, Puhao Li, Baoxiong Jia, Tengyu Liu, Yixin Zhu, Wei Liang, Song-Chun Zhu

Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation

Nov 28, 2022
Jiangyong Huang, William Yicheng Zhu, Baoxiong Jia, Zan Wang, Xiaojian Ma, Qing Li, Siyuan Huang

Unsupervised Object-Centric Learning with Bi-Level Optimized Query Slot Attention

Oct 17, 2022
Baoxiong Jia, Yu Liu, Siyuan Huang

EgoTaskQA: Understanding Human Tasks in Egocentric Videos

Oct 08, 2022
Baoxiong Jia, Ting Lei, Song-Chun Zhu, Siyuan Huang
