Jean Lahoud

InceptionMamba: Efficient Multi-Stage Feature Enhancement with Selective State Space Model for Microscopic Medical Image Segmentation

Jun 13, 2025
Via arXiv

A Culturally-diverse Multilingual Multimodal Video Benchmark & Model

Jun 08, 2025
Via arXiv

Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks

May 30, 2025
Via arXiv

Open-Set Semi-Supervised Learning for Long-Tailed Medical Datasets

May 20, 2025
Via arXiv

Tracking Meets Large Multimodal Models for Driving Scenario Understanding

Mar 18, 2025
Via arXiv

DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding

Mar 13, 2025
Via arXiv

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Mar 06, 2025
Via arXiv

CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation

Feb 24, 2025
Via arXiv

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Jan 10, 2025
Via arXiv

Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking

Oct 02, 2024
Via arXiv