Lin Ma

RoboTron-Sim: Improving Real-World Driving via Simulated Hard-Case

Aug 06, 2025

X-SAM: From Segment Anything to Any Segmentation

Aug 06, 2025

Beyond the Visible: Benchmarking Occlusion Perception in Multimodal Large Language Models

Aug 06, 2025

DocTron-Formula: Generalized Formula Recognition in Complex and Structured Scenarios

Aug 01, 2025

Orthogonal Projection Subspace to Aggregate Online Prior-knowledge for Continual Test-time Adaptation

Jun 23, 2025

ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies

Jun 18, 2025

Metis-RISE: RL Incentivizes and SFT Enhances Multimodal Reasoning Model Learning

Jun 16, 2025

M4V: Multi-Modal Mamba for Text-to-Video Generation

Jun 12, 2025

DisTime: Distribution-based Time Representation for Video Large Language Models

May 30, 2025

Flash-VL 2B: Optimizing Vision-Language Model Performance for Ultra-Low Latency and High Throughput

May 14, 2025