Baoxiong Jia

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding

Jan 17, 2024
Baoxiong Jia, Yixin Chen, Huangyue Yu, Yan Wang, Xuesong Niu, Tengyu Liu, Qing Li, Siyuan Huang

Via arXiv

An Embodied Generalist Agent in 3D World

Nov 18, 2023
Jiangyong Huang, Silong Yong, Xiaojian Ma, Xiongkun Linghu, Puhao Li, Yan Wang, Qing Li, Song-Chun Zhu, Baoxiong Jia, Siyuan Huang

Via arXiv

ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab

Nov 01, 2023
Jieming Cui, Ziren Gong, Baoxiong Jia, Siyuan Huang, Zilong Zheng, Jianzhu Ma, Yixin Zhu

Via arXiv

X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events

Aug 21, 2023
Bo Dai, Linge Wang, Baoxiong Jia, Zeyu Zhang, Song-Chun Zhu, Chi Zhang, Yixin Zhu

Via arXiv

ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes

Apr 09, 2023
Ran Gong, Jiangyong Huang, Yizhou Zhao, Haoran Geng, Xiaofeng Gao, Qingyang Wu, Wensi Ai, Ziheng Zhou, Demetri Terzopoulos, Song-Chun Zhu, Baoxiong Jia, Siyuan Huang

Via arXiv

Diffusion-based Generation, Optimization, and Planning in 3D Scenes

Jan 15, 2023
Siyuan Huang, Zan Wang, Puhao Li, Baoxiong Jia, Tengyu Liu, Yixin Zhu, Wei Liang, Song-Chun Zhu

Via arXiv

Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation

Nov 28, 2022
Jiangyong Huang, William Yicheng Zhu, Baoxiong Jia, Zan Wang, Xiaojian Ma, Qing Li, Siyuan Huang

Via arXiv

Unsupervised Object-Centric Learning with Bi-Level Optimized Query Slot Attention

Oct 17, 2022
Baoxiong Jia, Yu Liu, Siyuan Huang

Via arXiv

EgoTaskQA: Understanding Human Tasks in Egocentric Videos

Oct 08, 2022
Baoxiong Jia, Ting Lei, Song-Chun Zhu, Siyuan Huang

Via arXiv

Latent Diffusion Energy-Based Model for Interpretable Text Modeling

Jun 14, 2022
Peiyu Yu, Sirui Xie, Xiaojian Ma, Baoxiong Jia, Bo Pang, Ruiqi Gao, Yixin Zhu, Song-Chun Zhu, Ying Nian Wu

Via arXiv