Yan Zhang

Fellow, IEEE

Zooming from Context to Cue: Hierarchical Preference Optimization for Multi-Image MLLMs

May 28, 2025
Via arXiv

IKMo: Image-Keyframed Motion Generation with Trajectory-Pose Conditioned Motion Diffusion Model

May 27, 2025
Via arXiv

MMMR: Benchmarking Massive Multi-Modal Reasoning Tasks

May 22, 2025
Via arXiv

Confidence-Regulated Generative Diffusion Models for Reliable AI Agent Migration in Vehicular Metaverses

May 19, 2025
Via arXiv

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

May 07, 2025
Via arXiv

Temporal Attention Evolutional Graph Convolutional Network for Multivariate Time Series Forecasting

May 01, 2025
Via arXiv

SmallGS: Gaussian Splatting-based Camera Pose Estimation for Small-Baseline Videos

Apr 22, 2025
Via arXiv

FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering

Apr 20, 2025
Via arXiv

OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding

Apr 20, 2025
Via arXiv

ProtPainter: Draw or Drag Protein via Topology-guided Diffusion

Apr 19, 2025
Via arXiv