Sihao Lin

Ask When It Pays: Cost-Aware Open-Ended Interaction for Instance Goal Navigation

Jun 03, 2026

IntentionNav: A Benchmark for Intent-Driven Object Navigation from Implicit Human Instruction

May 22, 2026

One Agent to Guide Them All: Empowering MLLMs for Vision-and-Language Navigation via Explicit World Representation

Feb 17, 2026

VLNVerse: A Benchmark for Vision-Language Navigation with Versatile, Embodied, Realistic Simulation and Evaluation

Dec 22, 2025

Decoupled Action Head: Confining Task Knowledge to Conditioning Layers

Nov 15, 2025

Learning A Zero-shot Occupancy Network from Vision Foundation Models via Self-supervised Adaptation

Mar 10, 2025

TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba

Feb 21, 2025

Making Large Language Models Better Planners with Reasoning-Decision Alignment

Aug 25, 2024

MLP Can Be A Good Transformer Learner

Apr 08, 2024

Self-Supervised Multi-Frame Neural Scene Flow

Mar 24, 2024