Yiming Zhang

Jake

DINORANKCLIP: DINOv3 Distillation and Injection for Vision-Language Pretraining with High-Order Ranking Consistency

May 07, 2026
Via arXiv

Can LLMs Act as Historians? Evaluating Historical Research Capabilities of LLMs via the Chinese Imperial Examination

Apr 27, 2026
Via arXiv

ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning

Apr 27, 2026
Via arXiv

GeoEdit: Local Frames for Fast, Training-Free On-Manifold Editing in Diffusion Models

Apr 27, 2026
Via arXiv

ProVG: Progressive Visual Grounding via Language Decoupling for Remote Sensing Imagery

Apr 02, 2026
Via arXiv

Video2LoRA: Unified Semantic-Controlled Video Generation via Per-Reference-Video LoRA

Mar 10, 2026
Via arXiv

Physics-infused Learning for Aerial Manipulator in Winds and Near-Wall Environments

Mar 08, 2026
Via arXiv

RPDR: A Round-trip Prediction-Based Data Augmentation Framework for Long-Tail Question Answering

Feb 19, 2026
Via arXiv

GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics

Feb 13, 2026
Via arXiv

Omni-Safety under Cross-Modality Conflict: Vulnerabilities, Dynamics Mechanisms and Efficient Alignment

Feb 10, 2026
Via arXiv