Zongyuan Ge

RationalVLA: A Rational Vision-Language-Action Model with Dual System

Jun 12, 2025
Via arXiv

APTOS-2024 challenge report: Generation of synthetic 3D OCT images from fundus photographs

Jun 09, 2025
Via arXiv

Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery

May 23, 2025
Via arXiv

Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding

May 22, 2025
Via arXiv

MAKE: Multi-Aspect Knowledge-Enhanced Vision-Language Pretraining for Zero-shot Dermatological Assessment

May 14, 2025
Via arXiv

Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model

May 13, 2025
Via arXiv

Enhancing Fundus Image-based Glaucoma Screening via Dynamic Global-Local Feature Integration

Apr 01, 2025
Via arXiv

ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos

Mar 20, 2025
Via arXiv

Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology

Mar 19, 2025
Via arXiv

MSWAL: 3D Multi-class Segmentation of Whole Abdominal Lesions Dataset

Mar 17, 2025
Via arXiv