Xun Zhu

ReportQA: QA-Based Radiology Report Evaluation

Jun 13, 2026
Via arXiv

The 1st PortraitCraft Challenge: A CVPR 2026 Workshop Competition on Portrait Composition Understanding and Generation

Jun 09, 2026
Via arXiv

The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models

May 29, 2026
Via arXiv

InterMesh: Explicit Interaction-Aware End-to-End Multi-Person Human Mesh Recovery

May 06, 2026
Via arXiv

Lost in the Hype: Revealing and Dissecting the Performance Degradation of Medical Multimodal Large Language Models in Image Classification

Apr 09, 2026
Via arXiv

MIPS: a Multimodal Infinite Polymer Sequence Pre-training Framework for Polymer Property Prediction

Jul 27, 2025
Via arXiv

Enhancing Multi-task Learning Capability of Medical Generalist Foundation Model via Image-centric Multi-annotation Data

Apr 14, 2025
Via arXiv

MedM-VL: What Makes a Good Medical LVLM?

Apr 06, 2025
Via arXiv

Connector-S: A Survey of Connectors in Multi-modal Large Language Models

Feb 17, 2025
Via arXiv

Med-2E3: A 2D-Enhanced 3D Medical Multimodal Large Language Model

Nov 19, 2024
Via arXiv