Tingting Gao

ROVER: Routing Object-Centric Visual Evidence for Grounded Multi-Image Reasoning

May 28, 2026
Via arXiv

VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning

May 27, 2026
Via arXiv

MetaphorVU: Towards Metaphorical Video Understanding

May 25, 2026
Via arXiv

CaC: Advancing Video Reward Models via Hierarchical Spatiotemporal Concentrating

May 12, 2026
Via arXiv

Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces

Apr 09, 2026
Via arXiv

OmniDiT: Extending Diffusion Transformer to Omni-VTON Framework

Mar 20, 2026
Via arXiv

DIVA-GRPO: Enhancing Multimodal Reasoning through Difficulty-Adaptive Variant Advantage

Mar 01, 2026
Via arXiv

ContextRL: Enhancing MLLM's Knowledge Discovery Efficiency with Context-Augmented RL

Feb 26, 2026
Via arXiv

CREM: Compression-Driven Representation Enhancement for Multimodal Retrieval and Comprehension

Feb 22, 2026
Via arXiv

UniRef-Image-Edit: Towards Scalable and Consistent Multi-Reference Image Editing

Feb 15, 2026
Via arXiv