Zhengxuan Zhang

Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs

Mar 31, 2026
Via arXiv

LatentGeo: Learnable Auxiliary Constructions in Latent Space for Multimodal Geometric Reasoning

Mar 12, 2026
Via arXiv

DocSage: An Information Structuring Agent for Multi-Doc Multi-Entity Question Answering

Mar 12, 2026
Via arXiv

Bridging Cognition and Emotion: Empathy-Driven Multimodal Misinformation Detection

Apr 24, 2025
Via arXiv

DataMosaic: Explainable and Verifiable Multi-Modal Data Analytics through Extract-Reason-Verify

Apr 14, 2025
Via arXiv

Fine-Grained Retrieval-Augmented Generation for Visual Question Answering

Feb 28, 2025
Via arXiv