Xiyu Ren

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

May 15, 2025
Via arXiv

ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty

Dec 28, 2024
Via arXiv