Salman Khan

CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare

Mar 25, 2026
Via arXiv

WorldCache: Content-Aware Caching for Accelerated Video World Models

Mar 23, 2026
Via arXiv

From Masks to Pixels and Meaning: A New Taxonomy, Benchmark, and Metrics for VLM Image Tampering

Mar 20, 2026
Via arXiv

Latent-DARM: Bridging Discrete Diffusion And Autoregressive Models For Reasoning

Mar 10, 2026
Via arXiv

See, Plan, Rewind: Progress-Aware Vision-Language-Action Models for Robust Robotic Manipulation

Mar 10, 2026
Via arXiv

MediX-R1: Open Ended Medical Reinforcement Learning

Feb 26, 2026
Via arXiv

Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device

Feb 24, 2026
Via arXiv

OpenEarthAgent: A Unified Framework for Tool-Augmented Geospatial Agents

Feb 19, 2026
Via arXiv

MedMO: Grounding and Understanding Multimodal Large Language Model for Medical Images

Feb 06, 2026
Via arXiv

EoCD: Encoder only Remote Sensing Change Detection

Feb 05, 2026
Via arXiv