Qiuyu Kong

StruXLIP: Enhancing Vision-language Models with Multimodal Structural Cues

Feb 25, 2026
Via arXiv