Qi Jia

VideoAesBench: Benchmarking the Video Aesthetics Perception Capabilities of Large Multimodal Models

Jan 29, 2026
Via arXiv

Automated Safety Benchmarking: A Multi-agent Pipeline for LVLMs

Jan 27, 2026
Via arXiv

Q-Bench-Portrait: Benchmarking Multimodal Large Language Models on Portrait Image Quality Perception

Jan 26, 2026
Via arXiv

RSOD: Reliability-Guided Sonar Image Object Detection with Extremely Limited Labels

Jan 19, 2026
Via arXiv

KidVis: Do Multimodal Large Language Models Possess the Visual Perceptual Capabilities of a 6-Year-Old?

Jan 13, 2026
Via arXiv

EvolMem: A Cognitive-Driven Benchmark for Multi-Session Dialogue Memory

Jan 07, 2026
Via arXiv

Generating Storytelling Images with Rich Chains-of-Reasoning

Dec 08, 2025
Via arXiv

One Battle After Another: Probing LLMs' Limits on Multi-Turn Instruction Following with a Benchmark Evolving Framework

Nov 05, 2025
Via arXiv

A Multi-To-One Interview Paradigm for Efficient MLLM Evaluation

Sep 18, 2025
Via arXiv

Can Large Models Fool the Eye? A New Turing Test for Biological Animation

Aug 08, 2025
Via arXiv