Salman Khan

StageVAR: Stage-Aware Acceleration for Visual Autoregressive Models

Dec 18, 2025
Via arXiv

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

Dec 18, 2025
Via arXiv

GEO-Bench-2: From Performance to Capability, Rethinking Evaluation in Geospatial AI

Nov 19, 2025
Via arXiv

RainDiff: End-to-end Precipitation Nowcasting Via Token-wise Attention Diffusion

Oct 16, 2025
Via arXiv

Dr.LLM: Dynamic Layer Routing in LLMs

Oct 14, 2025
Via arXiv

MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning

Oct 09, 2025
Via arXiv

How Good are Foundation Models in Step-by-Step Embodied Reasoning?

Sep 18, 2025
Via arXiv

Distributed Deep Learning with RIS Grouping for Accurate Cascaded Channel Estimation

Sep 17, 2025
Via arXiv

Promptception: How Sensitive Are Large Multimodal Models to Prompts?

Sep 04, 2025
Via arXiv

Beyond Simple Edits: Composed Video Retrieval with Dense Modifications

Aug 19, 2025
Via arXiv