Ankan Deria

See Fair, Speak Truth: Equitable Attention Improves Grounding and Reduces Hallucination in Vision-Language Alignment

Apr 10, 2026
Via arXiv

CoME-VL: Scaling Complementary Multi-Encoder Vision-Language Learning

Apr 03, 2026
Via arXiv

MedMO: Grounding and Understanding Multimodal Large Language Model for Medical Images

Feb 06, 2026
Via arXiv

Robust Atypical Mitosis Classification with DenseNet121: Stain-Aware Augmentation and Hybrid Loss for Domain Generalization

Oct 26, 2025
Via arXiv

MuGa-VTON: Multi-Garment Virtual Try-On via Diffusion Transformers with Prompt Customization

Aug 11, 2025
Via arXiv

Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning

Jun 18, 2025
Via arXiv