Deli Zhao

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Jun 08, 2025
Via arXiv

EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?

Jun 05, 2025
Via arXiv

STAR-R1: Spatial TrAnsformation Reasoning by Reinforcing Multimodal LLMs

May 26, 2025
Via arXiv

STAR-R1: Spatial TrAnsformation Reasoning by Reinforcing Multimodal LLMs

May 21, 2025
Via arXiv

Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency

Apr 29, 2025
Via arXiv

FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving

Feb 27, 2025
Via arXiv

CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer

Feb 27, 2025
Via arXiv

A Survey of Graph Transformers: Architectures, Theories and Applications

Feb 23, 2025
Via arXiv

MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra

Feb 22, 2025
Via arXiv

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Jan 08, 2025
Via arXiv