Serena Yeung-Levy

NegVQA: Can Vision Language Models Understand Negation?

May 28, 2025

Can Large Language Models Match the Conclusions of Systematic Reviews?

May 28, 2025

Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence

Apr 03, 2025

MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

Mar 17, 2025

Video Action Differencing

Mar 10, 2025

SurgiSAM2: Fine-tuning a foundational model for surgical video anatomy segmentation and detection

Mar 05, 2025

Temporal Preference Optimization for Long-Form Video Understanding

Jan 23, 2025

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

Jan 14, 2025

Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation

Jan 06, 2025

Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration

Dec 17, 2024