LLaVA


LLaVA (Large Language and Vision Assistant) is an open-source large multimodal model that connects a pretrained vision encoder to a large language model and is trained end-to-end via visual instruction tuning. It serves as a widely used baseline in vision-language research; the papers listed below relate to LLaVA and to the broader family of vision-language models built on or evaluated against it.
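
For orientation, the sketch below shows one common way to query a LLaVA checkpoint through the Hugging Face transformers library. The checkpoint name (llava-hf/llava-1.5-7b-hf), the prompt template, and the image path are illustrative assumptions, not details taken from this page.

```python
# Minimal sketch: running a LLaVA checkpoint with Hugging Face transformers.
# Checkpoint name, prompt format, and image path are placeholder assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed community checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

image = Image.open("example.jpg")  # placeholder: any RGB image
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"

# Tokenize text and preprocess the image together, then generate a reply.
inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)
output_ids = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```
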

PIO-FVLM: Rethinking Training-Free Visual Token Reduction for VLM Acceleration from an Inference-Objective Perspective

Feb 05, 2026

Model-Dowser: Data-Free Importance Probing to Mitigate Catastrophic Forgetting in Multimodal Large Language Models

Feb 04, 2026

When LLaVA Meets Objects: Token Composition for Vision-Language-Models

Feb 04, 2026

Fast-Slow Efficient Training for Multimodal Large Language Models via Visual Token Pruning

Feb 03, 2026

ConsensusDrop: Fusing Visual and Cross-Modal Saliency for Efficient Vision Language Models

Feb 01, 2026

SAGE: Accelerating Vision-Language Models via Entropy-Guided Adaptive Speculative Decoding

Jan 31, 2026

Countering the Over-Reliance Trap: Mitigating Object Hallucination for LVLMs via a Self-Validation Framework

Jan 30, 2026

Gated Relational Alignment via Confidence-based Distillation for Efficient VLMs

Jan 30, 2026

LLaVA-FA: Learning Fourier Approximation for Compressing Large Multimodal Models

Jan 28, 2026

IC-EO: Interpretable Code-based assistant for Earth Observation

Jan 27, 2026