Xuelong Li

Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations

Jun 09, 2025, via arXiv

WebUIBench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in WebUI-to-Code

Jun 09, 2025, via arXiv

Dynamic Mixture of Progressive Parameter-Efficient Expert Library for Lifelong Robot Learning

Jun 06, 2025, via arXiv

HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models

Jun 04, 2025, via arXiv

Towards a Generalizable Bimanual Foundation Policy via Flow-based Video Prediction

May 30, 2025, via arXiv

Hume: Introducing System-2 Thinking in Visual-Language-Action Model

May 29, 2025, via arXiv

Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

May 29, 2025, via arXiv

Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation

May 27, 2025, via arXiv

Learn Beneficial Noise as Graph Augmentation

May 25, 2025, via arXiv

Dynamic Manipulation of Deformable Objects in 3D: Simulation, Benchmark and Learning Strategy

May 23, 2025, via arXiv