Lin Ma

Orthogonal Projection Subspace to Aggregate Online Prior-knowledge for Continual Test-time Adaptation

Jun 23, 2025
Via arXiv

ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies

Jun 18, 2025
Via arXiv

Metis-RISE: RL Incentivizes and SFT Enhances Multimodal Reasoning Model Learning

Jun 16, 2025
Via arXiv

M4V: Multi-Modal Mamba for Text-to-Video Generation

Jun 12, 2025
Via arXiv

DisTime: Distribution-based Time Representation for Video Large Language Models

May 30, 2025
Via arXiv

Flash-VL 2B: Optimizing Vision-Language Model Performance for Ultra-Low Latency and High Throughput

May 14, 2025
Via arXiv

TopoDiT-3D: Topology-Aware Diffusion Transformer with Bottleneck Structure for 3D Point Cloud Generation

May 14, 2025
Via arXiv

ScaleTrack: Scaling and back-tracking Automated GUI Agents

May 01, 2025
Via arXiv

MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints

Apr 12, 2025
Via arXiv

InstructionBench: An Instructional Video Understanding Benchmark

Apr 07, 2025
Via arXiv