Dilxat Muhtar

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Dec 31, 2025

RollArt: Scaling Agentic RL Training via Disaggregated Infrastructure

Dec 27, 2025

FarSLIP: Discovering Effective CLIP Adaptation for Fine-Grained Remote Sensing Understanding

Nov 18, 2025

Diffusion Language Models Know the Answer Before Decoding

Aug 27, 2025

StreamAdapter: Efficient Test Time Adaptation from Contextual Streams

Nov 14, 2024

LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation

Nov 14, 2024

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

Oct 15, 2024

LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model

Feb 07, 2024

CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding

Apr 19, 2023