Scene Understanding


Estimating Commonsense Scene Composition on Belief Scene Graphs

May 05, 2025

Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation

May 04, 2025

Segment Any RGB-Thermal Model with Language-aided Distillation

May 04, 2025

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video

May 04, 2025

A UNet Model for Accelerated Preprocessing of CRISM Hyperspectral Data for Mineral Identification on Mars

May 04, 2025

Embracing Diffraction: A Paradigm Shift in Wireless Sensing and Communication

May 02, 2025

PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth

May 03, 2025

FedEMA: Federated Exponential Moving Averaging with Negative Entropy Regularizer in Autonomous Driving

May 01, 2025

iMacSR: Intermediate Multi-Access Supervision and Regularization in Training Autonomous Driving Models

May 01, 2025

V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving

Apr 30, 2025