Xiaoqi Li

Human-centered In-building Embodied Delivery Benchmark
Jun 25, 2024

SpatialBot: Precise Spatial Understanding with Vision Language Models
Jun 19, 2024

AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation
Jun 17, 2024

GasTrace: Detecting Sandwich Attack Malicious Accounts in Ethereum
May 30, 2024

ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
Mar 17, 2024

NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation
Mar 13, 2024

ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation
Dec 24, 2023

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding
Dec 21, 2023

ImageManip: Image-based Robotic Manipulation with Affordance-guided Next View Selection
Oct 13, 2023

RGBManip: Monocular Image-based Robotic Manipulation through Active Object Pose Estimation
Oct 05, 2023