Hisham Cholakkal

Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks

May 30, 2025

Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs

May 26, 2025

ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark

May 22, 2025

Open-Set Semi-Supervised Learning for Long-Tailed Medical Datasets

May 20, 2025

Adapting In-Domain Few-Shot Segmentation to New Domains without Retraining

Apr 30, 2025

Self-Evolving Multi-Agent Simulations for Realistic Clinical Interactions

Mar 28, 2025

Tracking Meets Large Multimodal Models for Driving Scenario Understanding

Mar 18, 2025

DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding

Mar 13, 2025

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Mar 06, 2025

LLM Post-Training: A Deep Dive into Reasoning Large Language Models

Feb 28, 2025