Ming Hu

FB-CLIP: Fine-Grained Zero-Shot Anomaly Detection with Foreground-Background Disentanglement

Mar 20, 2026
Via arXiv

Foundation-Model Surrogates Enable Data-Efficient Active Learning for Materials Discovery

Mar 17, 2026
Via arXiv

Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding

Mar 09, 2026
Via arXiv

OPGAgent: An Agent for Auditable Dental Panoramic X-ray Interpretation

Feb 28, 2026
Via arXiv

Training-Free Acceleration for Document Parsing Vision-Language Model with Hierarchical Speculative Decoding

Feb 13, 2026
Via arXiv

A Vision-Language Foundation Model for Zero-shot Clinical Collaboration and Automated Concept Discovery in Dermatology

Feb 11, 2026
Via arXiv

MedScope: Incentivizing "Think with Videos" for Clinical Reasoning via Coarse-to-Fine Tool Calling

Feb 11, 2026
Via arXiv

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Feb 09, 2026
Via arXiv

AdaptOVCD: Training-Free Open-Vocabulary Remote Sensing Change Detection via Adaptive Information Fusion

Feb 06, 2026
Via arXiv

LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization

Feb 03, 2026
Via arXiv