Fahad Shahbaz Khan

Hierarchical Visual Prompt Learning for Continual Video Instance Segmentation

Aug 12, 2025
Via arXiv

RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping

Jul 31, 2025
Via arXiv

AI in Agriculture: A Survey of Deep Learning Techniques for Crops, Fisheries and Livestock

Jul 29, 2025
Via arXiv

TAViS: Text-bridged Audio-Visual Segmentation with Foundation Models

Jun 13, 2025
Via arXiv

A Culturally-diverse Multilingual Multimodal Video Benchmark & Model

Jun 08, 2025
Via arXiv

TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation

Jun 06, 2025
Via arXiv

VideoMolmo: Spatio-Temporal Grounding Meets Pointing

Jun 05, 2025
Via arXiv

Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks

May 30, 2025
Via arXiv

ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks

May 29, 2025
Via arXiv

One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models

May 28, 2025
Via arXiv