Peng Gao

3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V

Dec 15, 2023

Digital Life Project: Autonomous 3D Characters with Social Intelligence

Dec 07, 2023

OneLLM: One Framework to Align All Modalities with Language

Dec 06, 2023

ChatIllusion: Efficient-Aligning Interleaved Generation ability with Visual Instruction Model

Nov 29, 2023

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

Nov 13, 2023

Collaborative Decision-Making Using Spatiotemporal Graphs in Connected Autonomy

Oct 31, 2023

Improving Compositional Text-to-image Generation with Large Vision-Language Models

Oct 10, 2023

MTG: Mapless Trajectory Generator with Traversability Coverage for Outdoor Navigation

Sep 27, 2023

Bridging Zero-shot Object Navigation and Foundation Models through Pixel-Guided Navigation Skill

Sep 21, 2023

ImageBind-LLM: Multi-modality Instruction Tuning

Sep 11, 2023