Qing Li

Peng Cheng Laboratory

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design

Jun 09, 2025
Via arXiv

From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes

Jun 05, 2025
Via arXiv

When Large Multimodal Models Confront Evolving Knowledge: Challenges and Pathways

May 30, 2025
Via arXiv

Infi-MMR: Curriculum-based Unlocking Multimodal Reasoning via Phased Reinforcement Learning in Multimodal Small Language Models

May 29, 2025
Via arXiv

VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration

May 26, 2025
Via arXiv

Removal of Hallucination on Hallucination: Debate-Augmented RAG

May 24, 2025
Via arXiv

EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models

May 24, 2025
Via arXiv

Distributed Expectation Propagation for Multi-Object Tracking over Sensor Networks

May 24, 2025
Via arXiv

Chain-of-Focus: Adaptive Visual Search and Zooming for Multimodal Reasoning via RL

May 21, 2025
Via arXiv

MGStream: Motion-aware 3D Gaussian for Streamable Dynamic Scene Reconstruction

May 20, 2025
Via arXiv