Yuxin Peng

DiCoBench: Benchmarking Multi-Image Fine-Grained Perception via Differential and Commonality Visual Cues

Jun 25, 2026
Via arXiv

Taming I2V models for Image HOI Editing: A Cognitive Benchmark and Agentic Self-Correcting Framework

Jun 17, 2026
Via arXiv

Temporal-Aware Reasoning Optimization for Video Temporal Grounding

Jun 08, 2026
Via arXiv

AesFormer: Transform Everyday Photos into Beautiful Memories

May 21, 2026
Via arXiv

Beyond Binary Success: A Diagnostic Meta-Evaluation Framework for Fine-Grained Manipulation

May 19, 2026
Via arXiv

FIKA-Bench: From Fine-grained Recognition to Fine-Grained Knowledge Acquisition

May 13, 2026
Via arXiv

BadmintonGRF: A Multimodal Dataset and Benchmark for Markerless Ground Reaction Force Estimation in Badminton

May 03, 2026
Via arXiv

OmniVTG: A Large-Scale Dataset and Training Paradigm for Open-World Video Temporal Grounding

Apr 28, 2026
Via arXiv

Taxonomy-Aware Representation Alignment for Hierarchical Visual Recognition with Large Multimodal Models

Feb 28, 2026
Via arXiv

Venus: Benchmarking and Empowering Multimodal Large Language Models for Aesthetic Guidance and Cropping

Feb 27, 2026
Via arXiv