Zhiyuan Xu

MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?

Jun 28, 2024
Via arXiv

FlowDepth: Decoupling Optical Flow for Self-Supervised Monocular Depth Estimation

Mar 28, 2024
Via arXiv

Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models

Mar 15, 2024
Via arXiv

A Survey on Robotics with Foundation Models: toward Embodied AI

Feb 04, 2024
Via arXiv

Language-Conditioned Robotic Manipulation with Fast and Slow Thinking

Feb 01, 2024
Via arXiv

Visual Robotic Manipulation with Depth-Aware Pretraining

Jan 17, 2024
Via arXiv

An Efficient Generalizable Framework for Visuomotor Policies via Control-aware Augmentation and Privilege-guided Distillation

Jan 17, 2024
Via arXiv

SWBT: Similarity Weighted Behavior Transformer with the Imperfect Demonstration for Robotic Manipulation

Jan 17, 2024
Via arXiv

Object-Centric Instruction Augmentation for Robotic Manipulation

Jan 05, 2024
Via arXiv

Multi-Clue Reasoning with Memory Augmentation for Knowledge-based Visual Question Answering

Dec 20, 2023
Via arXiv