Zhenyang Cai

Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization

Sep 11, 2025
Via arXiv

ShizhenGPT: Towards Multimodal LLMs for Traditional Chinese Medicine

Aug 20, 2025
Via arXiv

On the Compositional Generalization of Multimodal LLMs for Medical Imaging

Dec 28, 2024
Via arXiv

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Dec 25, 2024
Via arXiv

Alignment at Pre-training! Towards Native Alignment for Arabic LLMs

Dec 04, 2024
Via arXiv

HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale

Jun 27, 2024
Via arXiv