Xingyu Zhu

Adaptive Dense Evidence Refinement for Video Relational Reasoning for VRR-QA Challenge

May 31, 2026

Temporal Evidence Routing with Structured Visual Evidence for TimeLogicQA

May 31, 2026

Dual-Route Top-K Retrieval with 1v1 VLM Reranking for the CoVR-R

May 31, 2026

TCP-MCP: Landscape-Guided Co-Evolution of Prompts and Communication Topologies for Multi-Agent Systems

May 27, 2026

Principled Steering via Null-space Projection for Jailbreak Defense in Vision-Language Models

Mar 23, 2026

Adapting Point Cloud Analysis via Multimodal Bayesian Distribution Learning

Mar 23, 2026

MuSteerNet: Human Reaction Generation from Videos via Observation-Reaction Mutual Steering

Mar 20, 2026

Multimodal OCR: Parse Anything from Documents

Mar 13, 2026

Thinking with Images as Continuous Actions: Numerical Visual Chain-of-Thought

Feb 27, 2026

GuardAlign: Test-time Safety Alignment in Multimodal Large Language Models

Feb 27, 2026