
Yingfei Liu

SpatialActor: Exploring Disentangled Spatial Representations for Robust Robotic Manipulation

Nov 12, 2025

MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation

Aug 26, 2025

GeoVLA: Empowering 3D Representations in Vision-Language-Action Models

Aug 12, 2025

RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping

Jul 31, 2025

Grounding Beyond Detection: Enhancing Contextual Understanding in Embodied 3D Grounding

Jun 05, 2025

PADriver: Towards Personalized Autonomous Driving

May 08, 2025

Continual LLaVA: Continual Instruction Tuning in Large Vision-Language Models

Nov 04, 2024

Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?

May 28, 2024

The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition

May 14, 2024

SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control

Mar 28, 2024