Zhen Li

LMO, CELESTE, HEC Paris

Deep learning for 3D point cloud processing -- from approaches, tasks to its implications on urban and environmental applications

Sep 15, 2025
Via arXiv

MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams

Aug 09, 2025
Via arXiv

Oedipus and the Sphinx: Benchmarking and Improving Visual Language Models for Complex Graphic Reasoning

Aug 01, 2025
Via arXiv

T2VParser: Adaptive Decomposition Tokens for Partial Alignment in Text to Video Retrieval

Jul 28, 2025
Via arXiv

Yume: An Interactive World Generation Model

Jul 23, 2025
Via arXiv

Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling

Jul 23, 2025
Via arXiv

Bradley-Terry and Multi-Objective Reward Modeling Are Complementary

Jul 10, 2025
Via arXiv

SkyVLN: Vision-and-Language Navigation and NMPC Control for UAVs in Urban Environments

Jul 09, 2025
Via arXiv

Sekai: A Video Dataset towards World Exploration

Jun 18, 2025
Via arXiv

RelTopo: Enhancing Relational Modeling for Driving Scene Topology Reasoning

Jun 16, 2025
Via arXiv