Qizhi Chen

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Sep 09, 2025

EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

Aug 28, 2025

GraphCogent: Overcoming LLMs' Working Memory Constraints via Multi-Agent Collaboration in Complex Graph Understanding

Aug 17, 2025

Hume: Introducing System-2 Thinking in Visual-Language-Action Model

May 29, 2025

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Apr 10, 2025

Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation

Apr 01, 2025

Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs

Feb 26, 2025

OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation

Feb 25, 2025

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Feb 13, 2025

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model

Jan 27, 2025