Xingyu Zhu

Temporal Evidence Routing with Structured Visual Evidence for TimeLogicQA

May 31, 2026

Adaptive Dense Evidence Refinement for Video Relational Reasoning for VRR-QA Challenge

May 31, 2026

Dual-Route Top-K Retrieval with 1v1 VLM Reranking for the CoVR-R

May 31, 2026

TCP-MCP: Landscape-Guided Co-Evolution of Prompts and Communication Topologies for Multi-Agent Systems

May 27, 2026

Adapting Point Cloud Analysis via Multimodal Bayesian Distribution Learning

Mar 23, 2026

Principled Steering via Null-space Projection for Jailbreak Defense in Vision-Language Models

Mar 23, 2026

MuSteerNet: Human Reaction Generation from Videos via Observation-Reaction Mutual Steering

Mar 20, 2026

Multimodal OCR: Parse Anything from Documents

Mar 13, 2026

GuardAlign: Test-time Safety Alignment in Multimodal Large Language Models

Feb 27, 2026

Look Carefully: Adaptive Visual Reinforcements in Multimodal Large Language Models for Hallucination Mitigation

Feb 27, 2026