Rongrong Ji

Xiamen University, Peng Cheng Laboratory

SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence

Jun 09, 2025
Via arXiv

Benchmarking Abstract and Reasoning Abilities Through A Theoretical Perspective

May 28, 2025
Via arXiv

Zooming from Context to Cue: Hierarchical Preference Optimization for Multi-Image MLLMs

May 28, 2025
Via arXiv

What You Perceive Is What You Conceive: A Cognition-Inspired Framework for Open Vocabulary Image Segmentation

May 26, 2025
Via arXiv

RePrompt: Reasoning-Augmented Reprompting for Text-to-Image Generation via Reinforcement Learning

May 23, 2025
Via arXiv

Training Long-Context LLMs Efficiently via Chunk-wise Optimization

May 22, 2025
Via arXiv

Speculative Decoding Reimagined for Multimodal Large Language Models

May 20, 2025
Via arXiv

Pseudo-Label Quality Decoupling and Correction for Semi-Supervised Instance Segmentation

May 16, 2025
Via arXiv

VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model

May 06, 2025
Via arXiv

Purifying, Labeling, and Utilizing: A High-Quality Pipeline for Small Object Detection

Apr 29, 2025
Via arXiv