Rongtao Xu

Image Recognition with Online Lightweight Vision Transformer: A Survey

May 06, 2025
Via arXiv

RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation

May 03, 2025
Via arXiv

CAE-DFKD: Bridging the Transferability Gap in Data-Free Knowledge Distillation

Apr 30, 2025
Via arXiv

A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation

Apr 21, 2025
Via arXiv

Focus on Local: Finding Reliable Discriminative Regions for Visual Place Recognition

Apr 14, 2025
Via arXiv

Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision

Apr 03, 2025
Via arXiv

Structured Preference Optimization for Vision-Language Long-Horizon Task Planning

Feb 28, 2025
Via arXiv

Constraint-Aware Zero-Shot Vision-Language Navigation in Continuous Environments

Dec 13, 2024
Via arXiv

InfiniteWorld: A Unified Scalable Simulation Framework for General Visual-Language Robot Interaction

Dec 08, 2024
Via arXiv

InstruGen: Automatic Instruction Generation for Vision-and-Language Navigation Via Large Multimodal Models

Nov 18, 2024
Via arXiv