Zhenyue Qin

Bob

VOLMO: Versatile and Open Large Models for Ophthalmology

Mar 25, 2026

Mind the Rarities: Can Rare Skin Diseases Be Reliably Diagnosed via Diagnostic Reasoning?

Mar 19, 2026

Beyond Medical Diagnostics: How Medical Multimodal Large Language Models Think in Space

Mar 14, 2026

A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine

Jan 29, 2026

Plane Geometry Problem Solving with Multi-modal Reasoning: A Survey

May 20, 2025

GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder

Feb 17, 2025

HandCraft: Anatomically Correct Restoration of Malformed Hands in Diffusion Generated Images

Nov 07, 2024

Visual Prompting in LLMs for Enhancing Emotion Recognition

Oct 03, 2024

LMOD: A Large Multimodal Ophthalmology Dataset and Benchmark for Large Vision-Language Models

Oct 02, 2024

Authentic Emotion Mapping: Benchmarking Facial Expressions in Real News

Apr 21, 2024