Jian Zhang

A1: A Fully Transparent Open-Source, Adaptive and Efficient Truncated Vision-Language-Action Model

Apr 07, 2026
Via arXiv

Interpretable Zero-shot Referring Expression Comprehension with Query-driven Scene Graphs

Mar 26, 2026
Via arXiv

VQ-Jarvis: Retrieval-Augmented Video Restoration Agent with Sharp Vision and Fast Thought

Mar 24, 2026
Via arXiv

PivotRL: High Accuracy Agentic Post-Training at Low Compute Cost

Mar 22, 2026
Via arXiv

Neuronal Self-Adaptation Enhances Capacity and Robustness of Representation in Spiking Neural Networks

Mar 21, 2026
Via arXiv

Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement

Mar 20, 2026
Via arXiv

MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning

Mar 19, 2026
Via arXiv

EditHF-1M: A Million-Scale Rich Human Preference Feedback for Image Editing

Mar 16, 2026
Via arXiv

Thinking in Dynamics: How Multimodal Large Language Models Perceive, Track, and Reason Dynamics in Physical 4D World

Mar 13, 2026
Via arXiv

OARS: Process-Aware Online Alignment for Generative Real-World Image Super-Resolution

Mar 13, 2026
Via arXiv