Weijia Li

AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs

Oct 08, 2025
Via arXiv

UrbanFeel: A Comprehensive Benchmark for Temporal and Perceptual Understanding of City Scenes through Human Perspective

Sep 26, 2025
Via arXiv

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Sep 26, 2025
Via arXiv

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

Aug 13, 2025
Via arXiv

OmniEarth-Bench: Towards Holistic Evaluation of Earth's Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data

May 29, 2025
Via arXiv

Shifting AI Efficiency From Model-Centric to Data-Centric Compression

May 25, 2025
Via arXiv

Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind

May 18, 2025
Via arXiv

LAD-Reasoner: Tiny Multimodal Models are Good Reasoners for Logical Anomaly Detection

Apr 17, 2025
Via arXiv

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation

Apr 03, 2025
Via arXiv

Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration

Apr 01, 2025
Via arXiv