Limin Wang

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Aug 25, 2025, via arXiv

MobileViCLIP: An Efficient Video-Text Model for Mobile Devices

Aug 10, 2025, via arXiv

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

Jun 12, 2025, via arXiv

VideoChat-A1: Thinking with Long Videos by Chain-of-Shot Reasoning

Jun 06, 2025, via arXiv

SORCE: Small Object Retrieval in Complex Environments

May 30, 2025, via arXiv

Differentiable Solver Search for Fast Diffusion Sampling

May 27, 2025, via arXiv

CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning

May 22, 2025, via arXiv

Weakly Supervised Temporal Sentence Grounding via Positive Sample Mining

May 10, 2025, via arXiv

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Apr 21, 2025, via arXiv

DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging

Apr 16, 2025, via arXiv