Yu-Gang Jiang

Fudan University

Attention Itself Could Retrieve. RetrieveVGGT: Training-Free Long Context Streaming 3D Reconstruction via Query-Key Similarity Retrieval

May 10, 2026
Via arXiv

Spatiotemporal Sycophancy: Negation-Based Gaslighting in Video Large Language Models

Apr 20, 2026
Via arXiv

ROSE: Retrieval-Oriented Segmentation Enhancement

Apr 15, 2026
Via arXiv

HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models

Apr 14, 2026
Via arXiv

CT-1: Vision-Language-Camera Models Transfer Spatial Reasoning Knowledge to Camera-Controllable Video Generation

Apr 10, 2026
Via arXiv

AssemLM: Spatial Reasoning Multimodal Large Language Models for Robotic Assembly

Apr 10, 2026
Via arXiv

Steering the Verifiability of Multimodal AI Hallucinations

Apr 08, 2026
Via arXiv

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Apr 02, 2026
Via arXiv

PixelSmile: Toward Fine-Grained Facial Expression Editing

Mar 26, 2026
Via arXiv

OCRA: Object-Centric Learning with 3D and Tactile Priors for Human-to-Robot Action Transfer

Mar 15, 2026
Via arXiv