Ming Hu

TAGS: A Test-Time Generalist-Specialist Framework with Retrieval-Augmented Reasoning and Verification

May 23, 2025
Via arXiv

Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery

May 23, 2025
Via arXiv

Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding

May 22, 2025
Via arXiv

RetinaLogos: Fine-Grained Synthesis of High-Resolution Retinal Images Through Captions

May 19, 2025
Via arXiv

MAKE: Multi-Aspect Knowledge-Enhanced Vision-Language Pretraining for Zero-shot Dermatological Assessment

May 14, 2025
Via arXiv

Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model

May 13, 2025
Via arXiv

Rhythm of Opinion: A Hawkes-Graph Framework for Dynamic Propagation Analysis

Apr 21, 2025
Via arXiv

GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning

Apr 02, 2025
Via arXiv

ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos

Mar 20, 2025
Via arXiv

Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology

Mar 19, 2025
Via arXiv