Lewei Lu

From Pixels to Words -- Towards Native Vision-Language Primitives at Scale

Oct 16, 2025

Spatial Preference Rewarding for MLLMs Spatial Understanding

Oct 16, 2025

CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving

Oct 09, 2025

ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding

Aug 29, 2025

Has GPT-5 Achieved Spatial Intelligence? An Empirical Study

Aug 18, 2025

GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior

Jun 09, 2025

Streamline Without Sacrifice -- Squeeze out Computation Redundancy in LMM

May 21, 2025

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

Apr 21, 2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Apr 15, 2025

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Mar 13, 2025