Salman Khan

Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology

Mar 13, 2025
Via arXiv

DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding

Mar 13, 2025
Via arXiv

Handwritten Digit Recognition: An Ensemble-Based Approach for Superior Performance

Mar 08, 2025
Via arXiv

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Mar 06, 2025
Via arXiv

LLM Post-Training: A Deep Dive into Reasoning Large Language Models

Feb 28, 2025
Via arXiv

C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation

Feb 27, 2025
Via arXiv

AirCast: Improving Air Pollution Forecasting Through Multi-Variable Data Alignment

Feb 25, 2025
Via arXiv

CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation

Feb 24, 2025
Via arXiv

KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding

Feb 20, 2025
Via arXiv

Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts

Feb 20, 2025
Via arXiv