Scene Understanding


MTPano: Multi-Task Panoramic Scene Understanding via Label-Free Integration of Dense Prediction Priors

Feb 05, 2026
Via arXiv

IDSOR: Intensity- and Distance-Aware Statistical Outlier Removal for Weather-Robust LiDAR Point Clouds

Feb 05, 2026
Via arXiv

TACO: Temporal Consensus Optimization for Continual Neural Mapping

Feb 05, 2026
Via arXiv

CommCP: Efficient Multi-Agent Coordination via LLM-Based Communication with Conformal Prediction

Feb 05, 2026
Via arXiv

Predicting Camera Pose from Perspective Descriptions for Spatial Reasoning

Feb 05, 2026
Via arXiv

Multimodal Latent Reasoning via Hierarchical Visual Cues Injection

Feb 05, 2026
Via arXiv

UniSurg: A Video-Native Foundation Model for Universal Understanding of Surgical Videos

Feb 05, 2026
Via arXiv

Relational Scene Graphs for Object Grounding of Natural Language Commands

Feb 04, 2026
Via arXiv

GeneralVLA: Generalizable Vision-Language-Action Models with Knowledge-Guided Trajectory Planning

Feb 04, 2026
Via arXiv

Natural Language Instructions for Scene-Responsive Human-in-the-Loop Motion Planning in Autonomous Driving using Vision-Language-Action Models

Feb 04, 2026
Via arXiv