Zhen Li

LMO, CELESTE, HEC Paris

MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams

Aug 09, 2025

Oedipus and the Sphinx: Benchmarking and Improving Visual Language Models for Complex Graphic Reasoning

Aug 01, 2025

T2VParser: Adaptive Decomposition Tokens for Partial Alignment in Text to Video Retrieval

Jul 28, 2025

Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling

Jul 23, 2025

Yume: An Interactive World Generation Model

Jul 23, 2025

Bradley-Terry and Multi-Objective Reward Modeling Are Complementary

Jul 10, 2025

SkyVLN: Vision-and-Language Navigation and NMPC Control for UAVs in Urban Environments

Jul 09, 2025

Sekai: A Video Dataset towards World Exploration

Jun 18, 2025

RelTopo: Enhancing Relational Modeling for Driving Scene Topology Reasoning

Jun 16, 2025

MiniCPM4: Ultra-Efficient LLMs on End Devices

Jun 09, 2025