Yuhui Zhang

Can Large Language Models Match the Conclusions of Systematic Reviews?

May 28, 2025
Via arXiv

NegVQA: Can Vision Language Models Understand Negation?

May 28, 2025
Via arXiv

TULiP: Test-time Uncertainty Estimation via Linearization and Weight Perturbation

May 22, 2025
Via arXiv

A 2D Semantic-Aware Position Encoding for Vision Transformers

May 14, 2025
Via arXiv

Comet: Accelerating Private Inference for Large Language Model by Predicting Activation Sparsity

May 12, 2025
Via arXiv

MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

Mar 17, 2025
Via arXiv

Video Action Differencing

Mar 10, 2025
Via arXiv

EquiBench: Benchmarking Code Reasoning Capabilities of Large Language Models via Equivalence Checking

Feb 18, 2025
Via arXiv

Temporal Preference Optimization for Long-Form Video Understanding

Jan 23, 2025
Via arXiv

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

Jan 14, 2025
Via arXiv