Yuhui Zhang

Intelligent recognition of GPR road hidden defect images based on feature fusion and attention mechanism

Dec 25, 2025
Via arXiv

Transductive Visual Programming: Evolving Tool Libraries from Experience for Spatial Reasoning

Dec 24, 2025
Via arXiv

Lightweight framework for underground pipeline recognition and spatial localization based on multi-view 2D GPR images

Dec 24, 2025
Via arXiv

AttentionDrag: Exploiting Latent Correlation Knowledge in Pre-trained Diffusion Models for Image Editing

Jun 16, 2025
Via arXiv

NegVQA: Can Vision Language Models Understand Negation?

May 28, 2025
Via arXiv

Can Large Language Models Match the Conclusions of Systematic Reviews?

May 28, 2025
Via arXiv

TULiP: Test-time Uncertainty Estimation via Linearization and Weight Perturbation

May 22, 2025
Via arXiv

A 2D Semantic-Aware Position Encoding for Vision Transformers

May 14, 2025
Via arXiv

Comet: Accelerating Private Inference for Large Language Model by Predicting Activation Sparsity

May 12, 2025
Via arXiv

MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

Mar 17, 2025
Via arXiv