Jiebo Luo

Unleashing Hour-Scale Video Training for Long Video-Language Understanding

Jun 05, 2025
Via arXiv

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation

May 28, 2025
Via arXiv

Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese

May 28, 2025
Via arXiv

MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models

May 26, 2025
Via arXiv

On Path to Multimodal Generalist: General-Level and General-Bench

May 07, 2025
Via arXiv

WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation

May 02, 2025
Via arXiv

SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users

Apr 14, 2025
Via arXiv

ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration

Apr 11, 2025
Via arXiv

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

Apr 09, 2025
Via arXiv

Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1)

Apr 04, 2025
Via arXiv