Yuhui Zhang

PaperSearchQA: Learning to Search and Reason over Scientific Papers with RLVR

Jan 26, 2026
Via arXiv

RadDiff: Describing Differences in Radiology Image Sets with Natural Language

Jan 07, 2026
Via arXiv

Intelligent recognition of GPR road hidden defect images based on feature fusion and attention mechanism

Dec 25, 2025
Via arXiv

Transductive Visual Programming: Evolving Tool Libraries from Experience for Spatial Reasoning

Dec 24, 2025
Via arXiv

Lightweight framework for underground pipeline recognition and spatial localization based on multi-view 2D GPR images

Dec 24, 2025
Via arXiv

AttentionDrag: Exploiting Latent Correlation Knowledge in Pre-trained Diffusion Models for Image Editing

Jun 16, 2025
Via arXiv

Can Large Language Models Match the Conclusions of Systematic Reviews?

May 28, 2025
Via arXiv

NegVQA: Can Vision Language Models Understand Negation?

May 28, 2025
Via arXiv

TULiP: Test-time Uncertainty Estimation via Linearization and Weight Perturbation

May 22, 2025
Via arXiv

A 2D Semantic-Aware Position Encoding for Vision Transformers

May 14, 2025
Via arXiv